Care without compromise
Senior Site Reliability Engineer
Location
United Kingdom
Posted
31 days ago
Salary
Not stated
Seniority
Senior
Job Description
Lyrebird Health
- Keep production systems online and restore them quickly when they fail
- Lead and manage incidents, making high-quality decisions under pressure
- Design and implement scalable infrastructure and deployment patterns
- Build and improve CI/CD pipelines and release systems
- Improve monitoring, telemetry, and observability across the stack
- Own cloud infrastructure, security, and access controls
- Work closely with engineers to ensure systems are built to scale from day one
Job Requirements
- 5–7 years experience in SRE, platform engineering, or DevOps roles
- Strong AWS experience (ECS/Fargate, EC2, Lambda, SQS, IAM)
- Experience running and scaling production systems
- Strong understanding of distributed systems and scaling approaches
- Hands-on experience with Docker and containerised environments
- Experience with Kubernetes or ECS
Benefits
- None stated
More DevOps Engineer Jobs
- Build, maintain, and improve CI/CD pipelines to automate software delivery under the guidance of senior engineers and established patterns.
- Provision and manage cloud infrastructure on AWS using Infrastructure as Code (IaC) principles and tools like Terraform.
- Support the implementation and maintenance of containerization and orchestration solutions, such as Docker, ECS, and Kubernetes (EKS), to support a microservices architecture.
- Develop and maintain scripts to automate operational processes, monitoring, and alerts to ensure system uptime and performance.
- Collaborate closely with software engineering Scrum teams to streamline development workflows, optimize application performance, and troubleshoot production issues.
- Support and maintain monitoring, logging, and alerting solutions using tools like AWS CloudWatch, Datadog, or Splunk to proactively identify and resolve system issues.
- Provide engineering support to technical team members for deploying, configuring, and supporting systems.
- Apply security best practices for access control, network security, and vulnerability management across infrastructure.
- Consistently live and model the organization's core values, leadership, and personal competencies.
- Stay current with market and industry trends relating to prevailing and emerging technologies in the DevOps space.
- Participate in special projects and handle other assigned duties as required.
- You actively shape the operation and continuous evolution of our Kubernetes and OpenShift environments.
- You implement modern deployment workflows using Git and ArgoCD to keep our infrastructure stable and scalable.
- You use Grafana to maintain visibility into our clusters and ensure high availability and reliable alerting.
- You bridge the gap between traditional and new environments and securely administer our Linux and Windows servers.
- You understand the needs of the business units and translate them into precise technical specifications.
- Design and operate scalable, secure cloud infrastructure (AWS, GCP, Azure)
- Build and maintain Infrastructure as Code (Terraform, Pulumi, CloudFormation)
- Own runtime platforms (Kubernetes, serverless, container platforms)
- Containerize .NET and Windows workloads
- Evolve system architecture, scaling strategies, and failure handling to align with distributed, elastic environments
- Shift mindset from static provisioning to dynamic, API-driven infrastructure
- Design and implement robust CI/CD pipelines for multiple services and teams
- Standardize build, test, and deployment workflows
- Enable safe, fast releases through automation, testing, and progressive delivery strategies
- Improve deployment frequency while reducing change failure rates
- Move from primarily rolling deployments to more advanced deployment strategies
- Build platform capabilities that make safe deployment strategies the default for engineering teams
- Eliminate manual processes through automation and self-service tooling
- Build internal developer tooling that reduces friction and cognitive load
- Improve developer workflows across build, test, deploy, and operate phases
- Drive adoption of platform capabilities across engineering teams
- Integrate security into CI/CD pipelines and infrastructure workflows
- Implement identity, secrets management, and secure supply chain practices
- Ensure compliance with standards (SOC2, HIPAA, etc., where applicable)
- Partner with security teams to embed controls into platform tooling
- Redefine RBAC strategies across infrastructure, CI/CD, and Kubernetes environments
- Act as a technical authority across DevOps, SRE, and platform engineering
- Drive architectural decisions and long-term technical strategy
- Mentor engineers and influence best practices across teams
- Lead cross-team initiatives and platform adoption efforts
- Partner with product engineering to improve delivery workflows
- Work with SRE, security, and data teams to align platform capabilities
- Collaborate with leadership on roadmap, priorities, and tradeoffs
- Influence engineering culture toward automation and reliability
Senior Software Engineer, DevOps
Muck Rack
Muck Rack is a public relations software company that makes it easy for media, marketing, and public relations professionals to build reports and monitor news and story performance.
- Partner with DevOps and application engineering teams to design and improve standardized infrastructure, deployment workflows, and CI/CD pipelines
- Build, operate, and continuously improve our containerized infrastructure, with a focus on reliability, scalability, and cost efficiency
- Bring deep expertise in Kubernetes and its ecosystem, helping define best practices and support adoption across engineering teams
- Manage and evolve cloud infrastructure on AWS, including compute, networking, and storage systems
- Improve observability, monitoring, and incident readiness across production systems
- Participate in a shared on-call rotation (approximately 1 week every 5 weeks), responding to and resolving production incidents
- Mentor teammates and share knowledge to improve engineering practices and operational maturity across the organization




