Job Closed

This listing is no longer active.

Civica US

We're a global company building smart software that helps improve public services

Senior Site Reliability Engineer

DevOps Engineer, Full Time, Remote, Senior, Team 51-200, Since 2023, H1B No Sponsor

Location

United Kingdom

Posted

84 days ago

Seniority

Senior

Bachelor Degree, English, Ansible, AWS, Azure, Cloud, Google Cloud Platform, Grafana, Java, Kubernetes, OpenShift, Packer, Prometheus, Python, Terraform, VMware, Go, .NET

Job Description

• Designing and implementing for scale & resilience: Architect, implement and continuously improve our existing Data Center and Cloud environments on AWS, Azure, and VMware, ensuring they meet our SLAs and adapt dynamically to demand, working alongside the Platform teams providing PaaS/IaaS.
• Driving automation: Build and evolve infrastructure as code (Terraform, etc.) and CI/CD pipelines (GitHub Actions, etc.) to ship new features safely and at speed.
• Defining and measuring reliability: Partner with teams to set up meaningful SLIs/SLOs, implement real-time observability (Datadog, Prometheus, Grafana, ...) and proactively identify risks before they impact our users.
• Leading incident response: Own the on-call rota, coach teams through blameless post-mortems, and embed a culture of continuous improvement so outages become learning opportunities.
• Mentoring & evangelism: Share your deep expertise by pairing with engineers, running brown-bag sessions on reliability best practices, and helping raise the bar across our global engineering organisation.
• Securing our stack: Collaborate with our Security team to build security controls into CI/CD, runtime environments and disaster-recovery plans, so our customers and citizens are always protected.

Job Requirements

Demonstrable experience in a production SRE, DevOps or infrastructure role, ideally within a SaaS or large-scale web environment
Expert in at least one public cloud (AWS, Azure, or GCP) and comfortable designing hybrid migrations from on-prem to cloud
Strong coding/scripting and troubleshooting skills (in any of Go, .NET, Java, Python, etc.) and a passion for building reusable, tested libraries and tooling
Proven track record with IaC tools (Terraform, CloudFormation, or similar) and container orchestration (Kubernetes, ECS, AKS, OpenShift)
Proven track record with virtual machine orchestration/provisioning and resiliency strategies (KubeVirt, Packer, Ansible)
Deep understanding of monitoring, logging, and tracing frameworks (Prometheus/Grafana, ELK/OpenSearch, Jaeger, etc.)
Excellent communicator who thrives in cross-functional teams, with a passion for translating complex technical issues into clear, actionable plans

Benefits

25 Days Annual Leave + bank holidays – plus the option to buy up to 10 extra days!
Days of Difference – Up to 3 extra days off for volunteering.
Pension Contributions – 5% employer match to support your future.
Income Protection – Up to 75% salary cover for long-term illness.
Life Assurance – 4x salary tax-free lump sum.
Critical Illness Cover – £25,000 lump sum (extendable to dependents).
Private Medical Insurance – Fast access to private healthcare.
Health Cash Plan – Claim back physio, therapies & more.
Dental Insurance – Cover for routine & emergency care.
Electric Vehicle (EV) Scheme – A wide range of electric & hybrid vehicles.
Affinity Groups – Join employee-led communities.
Bounty Bonus – Refer a friend & get rewarded.

Related Categories

DevOps Engineer

Related Job Pages

Remote Full-time Jobs (US)
Remote Python Jobs (US)
More Remote Jobs

More DevOps Engineer Jobs

DevOps Engineer, Software Configuration

Meduit | Driving Revenue Cycle Performance

DevOps Engineer, 84 days ago

Full Time, Remote, Team 1,001-5,000, Since 2017, H1B No Sponsor

• Design, implement, and maintain CI/CD pipelines that pull source code from GitHub, build Java applications, and package deployment artifacts
• Automate deployments across multiple environments, including Development, QA, UAT, Pilot, and Production
• Integrate automated smoke tests and validation checks into deployment pipelines
• Establish artifact versioning, tagging, and promotion strategies to support controlled releases and rollbacks
• Support application deployments across AWS and Azure virtual machine environments
• Assist with the transition of applications toward containerized deployments using Kubernetes
• Maintain environment consistency across platforms and deployment stages
• Own and enforce source control branching, merging, tagging, and release conventions
• Maintain build, deployment, and release documentation, including runbooks and standard operating procedures
• Coordinate software releases with Development, QA, and Application Support teams
• Assist with troubleshooting build, deployment, and environment-related issues
• Participate in incident response related to deployment or configuration failures
• Perform other duties as assigned

AWS, Azure, Java, Kubernetes, Linux

North Carolina

$130K - $145K / year

Team Lead, Site Reliability Engineering - Fleet Management

MongoDB

MongoDB, originally called 10gen, is a software development company. Since 2007, MongoDB has created an open-source, document-oriented database to help clients

DevOps Engineer, 84 days ago

Full Time, Remote, Team 5,550, Since 2008

Role Description

The Fleet Management team provides the core runtime environment that empowers our developers to build and ship products to delight our customers. We manage the end-to-end lifecycle of our Kubernetes fleet, alongside the critical components that ensure cluster reliability and security (e.g., CoreDNS, cert-manager, and Gatekeeper). As our infrastructure scales to support new use cases and products, we are spearheading a migration from Terraform-based Infrastructure as Code (IaC) to an Operator-driven lifecycle management model. This role can be based out of our Austin, Boston, Los Angeles, New York City, Raleigh, or San Francisco offices, or remotely in the United States region.

- Manage a team of 6-8 engineers, fostering a positive culture, handling career growth and performance conversations, and proactively removing blockers.
- Help develop a clear technical vision and comprehensive roadmap for our runtime environment, balancing long-term strategic infrastructure goals with immediate engineering needs.
- Contribute through light hands-on technical work, such as leading architectural design reviews, reviewing PRs, and stepping in to guide the team through complex operational challenges.
- Act as the primary liaison for the Fleet Management team, collaborating closely with other engineering leaders to ensure platform alignment and manage stakeholder expectations.

Qualifications

- 10+ years of experience working on software and operating distributed systems, with 2+ years managing engineering teams.
- Possess a customer-focused mindset, treating internal developers as your primary users.
- Value efficiency in processes and operations, and have a track record of optimizing team workflows.
- Prefer automation over manual processes ("allergic to ops work"), fostering a culture of building software solutions to eliminate toil.
- Deep technical familiarity with Kubernetes ecosystems, containerization technologies, and modern IaC tooling (e.g., Terraform, Crossplane, or Operators).
- Excel at translating complex business and engineering requirements into actionable, phased technical roadmaps.
- High level of empathy, responsibility, ownership, and accountability.
- Excellent verbal and written technical communication skills.

Requirements

- Leading major architectural shifts, such as migrating teams from traditional IaC to Operator-driven lifecycle management.
- Managing and scaling infrastructure across multi-cloud environments (AWS, GCP, or Azure).
- Designing secure, multi-tenant runtime environments at scale.

Benefits

- Equity participation.
- Participation in the employee stock purchase program.
- Flexible paid time off.
- 20 weeks fully-paid gender-neutral parental leave.
- Fertility and adoption assistance.
- 401(k) plan.
- Mental health counseling.
- Access to transgender-inclusive health insurance coverage.
- Health benefits offerings.

United States

$151K - $297K / year

DevOps Engineer II

Pathward

Forward Thinking

DevOps Engineer, 84 days ago

Full Time, Remote, Team 1,001-5,000, H1B No Sponsor

• Works with development teams to improve efficiency in the software deployment process.
• Responsible for updating, maintaining, and selecting development tools used for tracking, building and deploying software.
• Defines processes used to perform development work supporting the Agile framework.
• Develops continuous integration and continuous deployment concepts and tools.
• Improves deployment frequency while minimizing business impact.
• Improves, updates, and maintains the branching strategy.
• Improves processes to lower failure rates of new releases and improves tools and processes to create a continuous integration environment.
• Assists in containerization of software environments.
• Provides documentation and training for supported tools.
• Other duties as assigned.

Arizona + 5 more

$86K - $145K / year

Job Closed

DevOps Engineer

Open Function

OpenFn helps scale public health & humanitarian interventions via data integration, automation, and interoperability.

DevOps Engineer, 84 days ago

Contract, Remote, Team 11-50, H1B No Sponsor

Build World-Class Deployment, Monitoring, and Instance-Maintenance Tooling

• Develop and maintain DevOps tooling (Ansible, Terraform, custom CLI programs), deployment runbooks, configuration templates, and documentation that allow the teams and people responsible for deploying, monitoring, and maintaining instances of OpenFn to succeed.

Deliver On-Premise and Local Deployments

• Lead and execute OpenFn deployments on government and ministry-managed infrastructure, including air-gapped, low-connectivity, and sovereign-hosting environments.
• Configure and maintain containerized deployments using Docker, Docker Compose, and Docker Swarm, and support Kubernetes-based setups where applicable.
• Work directly with government IT teams to navigate local infrastructure constraints, security requirements, and network configurations.
• Troubleshoot infrastructure and runtime issues in the field, often with limited access to external resources.

Cloud Infrastructure

• Maintain and optimize OpenFn deployments on GCP, AWS, and occasionally Azure, including compute, networking, storage, and managed services configuration.
• Implement and maintain CI/CD pipelines for services team deployments.
• Monitor system performance, set up alerting, and respond to infrastructure incidents across cloud-hosted client environments.
• Advise implementation teams on cloud architecture decisions and cost optimization.

Internal Standards and Enablement

• Build and maintain internal DevOps standards, deployment guides, and infrastructure-as-code templates that the wider services team can use and build on.
• Contribute to pre-sales and scoping conversations by advising on infrastructure feasibility, hosting options, and deployment effort for prospective clients.
• Work closely with the Principal Solutions Architect and the CTO to ensure deployment strategy is aligned with solution architecture from the start of each engagement.

Ansible, AWS, Azure, Cloud, Docker, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Terraform

Kenya
