Senior DevOps Engineer
Location: United Kingdom
Posted: 24 days ago
Salary: £70K - £80K / year
Seniority: Senior
Job Description
Senior DevOps Engineer
Arbor Education
Role Description
We are looking for an experienced and dynamic Senior DevOps Engineer to join our Engineering team and help us improve the resilience and performance of the Arbor platform, enabling the business to scale rapidly. The remit and focus of the role is to continuously fix and improve our architecture, infrastructure, and ways of working. This requires an obsessive, metrics-driven approach, a curiosity about system behaviour, and close collaboration with the solution architect and engineering teams. It’s a broad and exciting role, so we’re looking for someone up for a challenge. If you’re a collaborative and curious Engineer, this is the role for you.

Core Responsibilities
- Work with Head of Platform and Head of SRE to identify improvements within the platform infrastructure and implement plans to address them.
- Work with the Platform teams to improve the maturity of all components within the system, including ensuring High Availability, and adequate testing and DR plans.
- Contribute to improving our CI/CD pipelines and providing patterns of deployment.
- Assist in incident response and resolution, and subsequent post-mortems and retrospectives.
- Participate in tech-talks and team based learning to ensure knowledge is spread.
- Document obsessively, relying on Playbooks/Runbooks and systems documentation to aid knowledge transfer.

Qualifications
- Extensive experience of DevOps Engineering and operating a large scale platform.
- Extensive experience of distributed cloud systems, and specifically Amazon Web Services.
- Extensive experience of Infrastructure as Code tooling, such as Terraform, Ansible, CloudFormation etc.
- Understanding of relational database technologies and their cloud versions (e.g. AWS Aurora).
- Experience with messaging and distributed asynchronous workloads.
- Experience with nginx or similar technologies.
- Experience with DataDog, Prometheus or similar tools.
- A positive and proactive attitude to problem solving.
- A team player with a driven personality, willing to muck in and help others when needed, who asks questions and actively participates in discussions.
- Good written and spoken English so you can present your ideas.

Bonus Skills
- Past experience with enterprise solutions running at scale.
- Familiarity with kanban and agile development processes.
- Experience with Docker and containerisation.
- Familiarity with software best practices such as Refactoring, Clean Code, Domain-Driven Design, Test-Driven Development, etc.

Benefits
- The chance to work alongside a team of hard-working, passionate people in a role where you’ll see the impact of your work every day.
- A dedicated wellbeing team who champion initiatives such as mindfulness, lunch n learns, manager training, mental health first aid training and much more!
- 32 days holiday (plus Bank Holidays). This is made up of 25 days annual leave plus 7 extra company wide days given over Easter, Summer & Christmas.
- Life Assurance paid out at 3x annual salary.
- Comprehensive wellness benefit provided by AIG Smart Health, which provides a 24/7 virtual GP service, Mental health support, Counselling, and personalised Health Checks.
- Private Dental Insurance with Bupa.
- Salary sacrifice Pension provided by Scottish Widows.
- Enhanced maternity and adoption leave (20 weeks full pay) and paternity leave (6 weeks full pay).
- 5 free return to work maternity coaching sessions, helping you adapt to this new exciting time of life!
- Access to services such as Calm and Bippit (financial wellbeing coaching).
- All of our roles champion flexible working and we are happy to discuss what this means to you.
- Social committees that plan team, office and company wide events to bring people together and celebrate success.
- Dedicated professional development training budget (CPD courses, upskilling resources, professional memberships etc).
- Volunteer with a charity of your choice for a day each year.
- Dog friendly offices!

Interview Process
- Phone screen.
- 1st stage.
- 2nd stage.

Company Description
Arbor Education is an equal opportunities organisation. Our goal is for Arbor to be a workplace which represents, celebrates and supports people from all backgrounds, and which gives them the tools they need to thrive, whatever their ambitions may be. We support and promote diversity and equality, and actively encourage applications from people of all backgrounds.

Refer a Friend
Know someone else who would be good for this role? You can refer a friend, family member or colleague. If they are offered a role with Arbor, we will say thank you with a voucher valued up to £200! Simply email: careers@arbor-education.com

Please note: We are unable to provide visa sponsorship at this time.
More DevOps Engineer Jobs
DevOps Team Lead
Zensar
At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus. Part of the $4.8 billion RPG Group, we’re a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.
Role Description
- Lead the DevOps/platform squad; own Definition of Done/Ready; track velocity and flag risks.
- Own and evolve CI/CD pipelines; implement blue/green, canary, and feature-flag progressive delivery.
- Manage AKS clusters: node pool sizing, upgrades, namespace isolation, network policies, cost optimisation.
- Author and maintain Terraform modules for all Azure infrastructure; enforce IaC review and state lock.
- Design and operate observability stack: Prometheus, per-squad Grafana dashboards, distributed tracing.
- Manage Kafka clusters and Azure Service Bus; drive pipeline security automation (SAST, scanning, secrets).
- Maintain Elasticsearch cluster health; tune ILM policies for 52M+ invoice documents.
- Lead on-call rotation; own post-incident reviews; mentor DevOps engineers and run knowledge-sharing sessions.
- Coordinate with software delivery squads on platform enablement and pipeline improvements.

Qualifications
- 8+ yrs DevOps / Platform Engineering / SRE; 2+ yrs in a team lead or tech lead role.
- Expert Kubernetes (CKA or equivalent) and Helm; strong Terraform on Azure or AWS.
- Production Kafka operations and Elasticsearch cluster management.
- CI/CD pipeline design (Jenkins / GitLab CI / GitHub Actions) and GitOps (ArgoCD or Flux).
- Python for operational automation; FCA/PCI-DSS cloud security understanding essential.
- Fintech or financial services background advantageous; Azure certs (AZ-104, AZ-400, CKA) a plus.
- Nice to have: RabbitMQ; Groovy pipeline scripting; GitHub Actions.

Requirements
- Technical Leadership: Platform squad oversight; sprint task definition; infrastructure architectural decisions.
- Containers & Orchestration: Docker; Kubernetes/AKS (RBAC, HPA/KEDA, PDB); Helm (chart authoring, library charts).
- IaC: Terraform (Azure: AKS, PostgreSQL, Service Bus, Key Vault, APIM); Atlantis/Terraform Cloud.
- CI/CD & GitOps: Jenkins, GitLab CI, GitHub Actions; ArgoCD/Flux; Trivy, Checkov, OWASP dep-check.
- Cloud & Observability: Azure AKS/Service Bus/Monitor; Prometheus + Grafana; OpenTelemetry; ELK/OpenSearch.
- Messaging & Security: Kafka; RabbitMQ; Azure Key Vault / HashiCorp Vault; mTLS; Trivy/Snyk; FCA/PCI-DSS.
- Scripting: Python, Bash, Groovy for pipelines, runbooks, automated incident response.
- People: Mentoring/line management of 3–6 DevOps engineers; 1:1s, onboarding, performance reviews.

Benefits
- Experience-led everything.
- Commitment to conceptualizing, designing, engineering, marketing, and managing digital solutions.
- Inclusive workplace culture that celebrates individuality and encourages growth.
- Equal employment opportunity (EEO) and affirmative action employer.
Role Description
We currently have an opportunity for a Remote Site Clinic Paramedic (Tanumbirini) to be a key member of CareFlight’s remote clinical team, reporting to the National Medical Director and General Manager of Clinical Operations. Your key objectives will be to:
- Deliver high-quality emergency response, primary health care, and medical support services within a remote industrial environment.
- Ensure the provision of timely clinical intervention, support workforce health and wellbeing, and facilitate effective medical evacuation.
- Provide frontline clinical care, maintain medical readiness of the site clinic and ambulance assets.
- Contribute to a safe, compliant, and clinically governed service aligned with organisational and client expectations.

Qualifications
- Current registration as a Paramedic with the Australian Health Practitioner Regulation Agency (AHPRA).
- Minimum 3–5 years post-graduate clinical experience in emergency, pre-hospital, retrieval, or remote area practice.
- Current Advanced Life Support (ALS) certification.
- Current Pre-Hospital Trauma Course (PHTC) certification or equivalent trauma qualification.
- Current unrestricted manual driver’s license.
- Demonstrated experience in emergency response, resuscitation, and acute patient stabilisation.
- Proven ability to assess and manage a broad range of primary healthcare presentations in remote environments.
- Demonstrated ability to work independently with sound clinical judgement and decision-making capability.
- High level communication and interpersonal skills, with the ability to engage effectively with diverse stakeholders.
- Strong organisational skills, including management of medical supplies, reporting requirements, and equipment readiness.
- Understanding of clinical governance, risk management, and emergency preparedness frameworks.
- Demonstrated resilience, adaptability, and capacity to work in demanding remote environments on FIFO rosters.
- Ability to work a 21 days on / 21 days off roster.

Requirements
- Previous experience in remote industrial, mining, oil and gas, or offshore medical services (desirable).
- Experience coordinating or supporting medical evacuations in remote or resource-limited settings (desirable).
- Experience in aeromedical retrieval or remote emergency operations (desirable).
- Qualification in Occupational Health, Remote Area Nursing/Paramedicine, or equivalent (desirable).
- Training experience in first aid, emergency response, or workforce health education (desirable).
- Knowledge of WHS legislation, emergency management systems, and remote site safety protocols (desirable).
- Experience working in culturally diverse and multidisciplinary teams (desirable).

How to apply
If you would like to be part of our team, you can apply using the link below. Please ensure you attach a current resume and covering letter that addresses the role’s essential criteria. Applications for this opportunity will close on Wednesday 16th May 2026. For additional information, please contact recruitment@careflight.org.

CareFlight values diversity in the workplace. Aboriginal and Torres Strait Islanders are encouraged to apply. All employees must comply with CareFlight’s Drug and Alcohol Management Plan (DAMP) as required by CASA and may be subject to random workplace testing. Criminal Record and Working with Children Checks also apply.

CareFlight: ‘The next life we save could be yours’
Staff - Lead Site Reliability Engineer
HeartFlow
HeartFlow works to enable better care for patients, and allow clinicians to better identify coronary artery disease through its software HeartFlow Analysis. The company is headquar
Staff/Lead Site Reliability Engineer (SRE)
San Francisco, California

Heartflow is a medical technology company advancing the diagnosis and management of coronary artery disease, the #1 cause of death worldwide, using cutting-edge technology. The flagship product, an AI-driven, non-invasive cardiac test supported by the ACC/AHA Chest Pain Guidelines called the Heartflow FFRCT Analysis, provides a color-coded, 3D model of a patient’s coronary arteries indicating the impact blockages have on blood flow to the heart. Heartflow is the first AI-driven non-invasive integrated heart care solution across the CCTA pathway that helps clinicians identify stenoses in the coronary arteries (RoadMap™ Analysis), assess coronary blood flow (FFRCT Analysis), and characterize and quantify coronary atherosclerosis (Plaque Analysis). Our pipeline of products is growing and so is our team; join us in helping to revolutionize precision heartcare.

Heartflow is a publicly traded company (HTFL) that has received international recognition for exceptional strides in healthcare innovation, is supported by medical societies around the world, cleared for use in the US, UK, Europe, Japan and Canada, and has been used for more than 500,000 patients worldwide.

HeartFlow is transforming cardiovascular care with cutting-edge, non-invasive technology. We are launching a massive Platform Modernization initiative to power the next generation of our life-saving medical products. We're looking for an experienced Site Reliability Engineer (SRE) to join our cloud-native infrastructure team. You will work closely with our Platform engineers and development teams to ensure our critical systems are highly available, scalable, observable, and performant. If you thrive on eliminating toil, automating complex operations, and defining the standards for production excellence, we want to talk to you.

Job Responsibilities
As our Staff SRE, you'll be the primary expert responsible for our entire compute ecosystem. You'll operate at the highest level of technical expertise and influence. You won't just solve problems; you'll prevent them at a fundamental level across organizational boundaries. Your key responsibilities will include:
- Lead the design, implementation, and operation of reliable, scalable cloud infrastructure.
- Define and begin rollout of SLI/SLO standards across microservices.
- Develop self-service instrumentation tooling enabling engineering teams to own observability.
- Establish observability and monitoring using OSS toolchain.
- Serve as a technical escalation point for critical incidents, perform deep-dive root cause analyses (RCAs), and implement robust corrective measures to prevent recurrence.
- Enhance our monitoring, logging, and tracing systems to provide comprehensive visibility into system health.
- Set the technical direction and best practices for the entire SRE and engineering organization. Mentor mid-level and senior engineers on design patterns, operational rigor, and reliability principles.

We're looking for a leader and a deep technical expert with a proven track record of solving the hardest scaling and reliability challenges.

Required Qualifications
- 8+ years of progressive experience in Site Reliability Engineering, Production Engineering, or a closely related role.
- Deep expertise with:
  - AWS
  - Kubernetes, Helm
  - Observability stack (Prometheus, Grafana, Mimir, Loki, Pixie, Tempo)
  - CI/CD systems (ArgoCD, Harness)
- Fluency in at least one major scripting/programming language for automation and tooling (e.g., Python, Go, or Java).
- Hands-on engineering mindset: able to instrument services directly, not just configure tooling.
- Track record of building or significantly improving incident detection and response systems.
- Deep technical familiarity with Kubernetes ecosystems, containerization technologies, and modern IaC tooling (e.g., Terraform, Crossplane, or Operators) so you can effectively guide the team's technical decisions.
- Exceptional communication skills, capable of explaining complex technical issues to both technical and non-technical audiences.

Nice-to-Have
- Experience implementing Service Mesh technologies (e.g., Istio, Linkerd).
- A strong understanding of security principles and practices in a cloud environment.
- Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer).

A reasonable estimate of the base salary compensation range is $200,750 to $250,922, plus cash bonus and equity.
- Design, implement, and maintain Modern Data Platform cloud infrastructure using Azure services
- Deploy Azure Landing Zones in alignment with Microsoft best practices
- Collaborate with Data Engineering team to deploy and manage modern data platform components
- Collaborate with developers, clients and stakeholders to implement and maintain continuous integration and deployment pipelines
- Automate deployment processes using tools such as Azure DevOps, Terraform, Bicep, YAML, and PowerShell
- Implement and manage monitoring, logging, and alerting systems to ensure maximum availability and reliability
- Continuously optimize the performance and scalability of our cloud infrastructure
- Work with cross-functional teams to troubleshoot and resolve issues related to our cloud infrastructure and Data Platform
- Stay up-to-date with the latest Azure services, trends, and best practices
- Document procedures, configurations, and best practices

