Arbor Education, founded in 2011 and based in London, England, United Kingdom, is the country's fastest-growing management information system provider, serving
Senior DevSecOps Engineer
Location
United Kingdom
Posted
3 days ago
Salary
£75K - £85K / year
Seniority
Senior
Job Description
Arbor Education
- Collaborate with stakeholders to pinpoint security enhancements across platform architecture and infrastructure, devising and executing strategic plans for implementation
- Work closely with the Platform team to embed robust security processes, controls, and tooling across all system components
- Threat model new and existing systems, including AI/LLM-enabled features and agentic workflows, and translate findings into prioritised, actionable work
- Strengthen our software supply chain: dependency and base-image hygiene, SBOM generation, artefact signing and provenance, and the pinning of third-party actions and packages
- Secure the use of AI across the SDLC, ensuring agentic coding tools, assistants, and MCP integrations operate within safe, well-scoped, and auditable boundaries
- Contribute to the evolution of deployment frameworks, emphasising security, deployment speed, and system stability
- Elevate platform security through strong secrets management and the safe handling of sensitive information
- Play an active role in incident response, resolution, and blameless post-mortems, facilitating continuous improvement
- Participate in knowledge-sharing initiatives, including tech-talks and team-based learning sessions
- Maintain meticulous, current documentation (playbooks, runbooks, and comprehensive systems documentation) to facilitate knowledge dissemination
Job Requirements
- Extensive experience in cyber security and associated engineering practices
- Vulnerability management and remediation at scale
- Proven track record in DevOps / DevSecOps engineering within large-scale platforms
- Proficiency in distributed cloud systems, particularly Amazon Web Services
- Expertise in Infrastructure as Code (IaC) tooling such as Terraform and CloudFormation
- Experience with languages such as PHP, Bash, or Python
- Experience with Docker and containerisation, with a working understanding of container and runtime security
- Software supply-chain security: SBOMs, dependency scanning, and artefact signing / provenance (e.g. SLSA, Sigstore)
- Secrets management and detection (e.g. Vault, cloud-native secret stores, secret-scanning in CI)
- Security tooling across the SDLC: SAST, DAST, SCA, IaC scanning, and container scanning (e.g. Snyk, Trivy)
- Policy-as-code and guardrails (e.g. OPA / Conftest), with an identity-centric / zero-trust approach to access
- Familiarity with monitoring and detection tooling like DataDog, Prometheus, or similar platforms
- A proactive problem-solving attitude coupled with strong teamwork and communication skills
- Exceptional proficiency in written and spoken English to effectively articulate ideas and concepts.
- Practical understanding of AI/LLM security risks and their mitigations — e.g. prompt injection, jailbreaks, insecure output handling, sensitive-data leakage, and excessive agency (aligned to the OWASP Top 10 for LLM Applications)
- Experience securing AI-assisted and agentic development tooling: scoping permissions, sandboxing, logging and audit, and preventing secret or data exfiltration through AI agents and MCP servers
- Familiarity with AI threat modelling and adversarial techniques (e.g. MITRE ATLAS) and with conducting or supporting AI-aware red teaming
- Awareness of AI governance and assurance frameworks (e.g. NIST AI RMF, ISO/IEC 42001) and how they intersect with data-protection obligations for a multi-tenant platform handling children's data
- Confident, responsible use of AI tooling to accelerate security work — triage, detection engineering, code review, and documentation — while understanding and accounting for its limitations
- Past experience with enterprise solutions running at scale (Bonus)
- Familiarity with kanban and agile development processes (Bonus)
- Familiarity with software best practices such as Refactoring, Clean Code, Domain-Driven Design, Test-Driven Development, etc. (Bonus)
- Experience with compliance frameworks relevant to EdTech (e.g. NIST CSF, ISO 27001, SOC 2, UK GDPR) (Bonus)
- Relevant certifications (e.g. AWS Security Specialty, OSCP, or AI security / governance credentials) (Bonus)
Benefits
- A dedicated wellbeing team who champion initiatives such as mindfulness, lunch-and-learns, manager training, mental health first aid training and much more!
- 32 days holiday (plus Bank Holidays). This is made up of 25 days annual leave plus 7 extra company wide days given over Easter, Summer & Christmas
- Life Assurance paid out at 3x annual salary
- Comprehensive wellness benefit provided by AIG Smart Health, which provides a 24/7 virtual GP service, Mental health support, Counselling, and personalised Health Checks
- Private Dental Insurance with Bupa
- Salary sacrifice Pension provided by Scottish Widows
- Enhanced maternity and adoption pay (20 weeks full pay) and paternity pay (6 weeks full pay)
- 5 free return to work maternity coaching sessions, helping you adapt to this new exciting time of life!
- Access to services such as Calm and Bippit (financial wellbeing coaching)
- All of our roles champion flexible working and we are happy to discuss what this means to you
- Social committees that plan team, office and company wide events to bring people together and celebrate success
- Dedicated professional development training budget (CPD courses, upskilling resources, professional memberships etc)
- Volunteer with a charity of your choice for a day each year
- Dog friendly offices!
More DevOps Engineer Jobs
Staff Site Reliability Engineer (Collaboration Engineering)
Scratch Financial
Scratch Financial is the world's simplest patient financing solution.
Company Description
NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad-supported streaming service. We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry-leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood. NBCUniversal is a subsidiary of Comcast Corporation. Visit www.nbcuniversal.com for more information.
Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.
Job Description
The Staff Reliability Engineer (SRE) for Workplace Engineering is responsible for the reliability, performance, security, and operational excellence of enterprise workplace collaboration & endpoint services used globally by employees and partners. This role applies an engineering mindset to operations (defining service level indicators/objectives (SLIs/SLOs), reducing toil through automation, improving observability, and strengthening incident response) to ensure a consistent, high-quality collaboration experience across messaging, meetings, voice, file sharing, knowledge sharing, device management platforms & Copilot / AI engineering.
- Microsoft 365: Teams (chat, meetings, webinars, Teams Phone), SharePoint Online, OneDrive, Exchange Online, Microsoft Entra ID (Azure AD), Microsoft Purview, Defender for Office 365, Intune (Endpoint Management).
- Hybrid messaging and identity integrations (as applicable): Exchange Server, directory synchronization, mail flow and routing
- Collaboration endpoints and devices: Teams Rooms, certified headsets/cameras, conference room AV integrations
- Ecosystem integrations: Power Platform (Power Automate/Apps), Graph API, third-party conferencing/messaging where in use (e.g., Zoom/Slack), mail hygiene/security gateways
- Architect and optimize global Microsoft Intune and Jamf Pro environments.
- Orchestrate Windows Update for Business (WUfB), third-party application patching, and compliance policies to maintain a hardened security posture
- Automate packaging and deployment of Windows applications, maintaining a rigorous cadence for third-party updates.
- Leverage PowerShell and Graph API to automate repetitive configuration tasks and self-healing remediations.
- Partner with Security Operations to remediate vulnerabilities.
- Develop and enforce Configuration Profiles, Compliance Policies, and Conditional Access rules
- Own the reliability and scaling of Azure Virtual Desktop (AVD) and Windows 365 (Cloud PC), optimizing for both performance and cost-efficiency.
- Define and operationalize SLIs/SLOs and error-budget policies for collaboration services (Teams chat/meetings/voice, SharePoint/OneDrive, Exchange) with clear customer-impact measurements.
- Own end-to-end reliability engineering: capacity planning, performance tuning, resilience reviews, dependency mapping, and proactive risk reduction for critical collaboration journeys.
- Demonstrated expertise in developing, operationalizing, and scaling AI engineering capabilities, including platform design, model lifecycle management, automation, reliability, and enterprise adoption.
- Strong knowledge of AI governance frameworks, with experience establishing guardrails for responsible AI use, risk management, security, compliance, data controls, and ongoing operational oversight.
- Build and evolve observability for collaboration platforms: health dashboards, telemetry standards, alert strategy (high signal/low noise), and synthetic monitoring aligned to user experience.
- Lead incident response for high-severity events: establish incident roles, drive rapid triage/mitigation, coordinate cross-team communication, and produce blameless post-incident reviews with durable corrective actions.
- Engineer automation to reduce operational toil: provisioning, policy/config drift detection, lifecycle management, reporting, and remediation using PowerShell and APIs; establish reusable runbooks and self-service patterns.
- Strengthen change and release practices: production readiness reviews, controlled rollouts, maintenance windows, validation plans, and rollback strategies to reduce customer impact.
- Partner with Security/Compliance to ensure collaboration services meet governance requirements (identity and access, DLP, retention, eDiscovery, information protection), while balancing usability and reliability.
- Provide Staff-level technical leadership: set engineering standards, mentor engineers, influence roadmap priorities, and align stakeholders on reliability tradeoffs and investment.
- Establish and lead reliability operating mechanisms (on-call standards, incident command readiness, postmortem quality, action-item governance, and quarterly reliability reviews) to improve consistency across teams.
- Coach, mentor, and sponsor engineers across levels: provide technical guidance, review designs and postmortems, and raise the bar on documentation, runbooks, and operational readiness.
- Drive cross-organization alignment on reliability priorities and investment by presenting trends, risks, and proposals to leadership; secure commitments and ensure delivery against measurable outcomes.
- Serve as an escalation point for complex, cross-domain issues spanning identity, messaging, endpoints, and network dependencies; engage vendors as needed and ensure issues are driven to resolution.
Qualifications
- 12+ years of experience in reliability engineering, systems engineering, DevOps, or large-scale collaboration/communications operations (enterprise or SaaS), including ownership of production services
- Deep expertise with collaboration platforms and ecosystems: Microsoft 365 (Teams, including voice/meetings/Rooms, SharePoint Online, OneDrive, Exchange Online) and their dependencies (identity, endpoints, networking)
- Hands-on experience defining SLIs/SLOs, building observability (metrics/logs/traces), and operating an incident management program (on-call, severity model, communications, postmortems)
- Strong automation skills with PowerShell and APIs (Microsoft Graph preferred); ability to build tooling that improves reliability and reduces toil
- Experience with cloud identity and access (Microsoft Entra ID/Azure AD, Conditional Access, MFA, RBAC/PIM) and collaboration governance (Purview, DLP, retention, eDiscovery) preferred
- Bachelor's degree in Computer Science/Engineering (or equivalent practical experience)
Desired Characteristics
- Executive-level written and verbal communication skills; able to translate reliability data into clear decisions, tradeoffs, and action plans
- Proven ability to influence across functions (Security, Network, End User Computing, Architecture, Product/Program) without formal authority
- Strong systems thinking and a focus on customer satisfaction, user journeys (chat, meetings, voice, file sharing), and measurable experience outcomes
- Demonstrated technical leadership through mentorship, sponsorship, and talent development; builds inclusive, high-performing engineering culture
- High bar for operational excellence: insists on clear ownership, durable fixes, strong postmortems, and measurable follow-through
- Comfort operating in ambiguity and driving large, multi-quarter improvements with measurable results
Hybrid: This position currently has a hybrid schedule, which requires contributing from the office a minimum of four days per week. The Company reserves the right to change in-office requirements at any time.
Additional Information
As part of our selection process, external candidates may be required to attend an in-person interview with an NBCUniversal employee at one of our locations prior to a hiring decision. NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law.
If you are a qualified individual with a disability or a disabled veteran and require support throughout the application and/or recruitment process as a result of your disability, you have the right to request a reasonable accommodation. You can submit your request to AccessibilitySupport@nbcuni.com.
- Design enterprise cloud architecture standards across Azure and GCP.
- Lead cloud migration strategies and multi-region infrastructure design.
- Establish cloud governance, cost optimization, and security frameworks.
- Architect Kubernetes platforms (AKS, GKE) with enterprise security and networking.
- Implement service mesh solutions and advanced deployment strategies.
- Build developer self-service platforms and observability solutions.
- Design scalable pipelines using Jenkins, Azure DevOps, GitHub Actions.
- Implement GitOps frameworks and infrastructure-as-code practices.
- Integrate security scanning, testing automation, and compliance workflows.
- Implement Zero Trust architectures and identity management solutions.
- Automate compliance for SOC 2, ISO 27001 frameworks.
- Drive security-by-design practices across all platforms.
- Lead technical teams and drive DevOps culture transformation.
- Mentor senior engineers and establish centers of excellence.
- Oversee architecture reviews and technology roadmaps.
DevOps/Observability Engineer
Quantiphi
Pioneering AI-first solutions, solving complex business challenges through expertise, cloud, data engineering, and AI.
Role Description
We are seeking a highly experienced Senior DevOps/Observability Engineer with over 8 years of experience to lead the design and implementation of our next-generation, unified observability platform. This pivotal role will focus on architecting a sophisticated observability pipeline from the ground up, leveraging a modern, open-source-centric stack on Amazon Web Services (AWS). The ideal candidate will have deep expertise in designing and deploying observability solutions, with a strong emphasis on OpenTelemetry (OTel) and Kubernetes observability. You will be responsible for deploying, configuring, and integrating a suite of tools including Prometheus, Grafana, and Splunk to provide comprehensive insights into our complex, distributed systems. This is a hands-on role for a technical leader who is passionate about building scalable, reliable, and efficient monitoring and logging systems.
Qualifications
- Unified Pipeline Architecture: Proven ability to design and implement end-to-end observability pipelines using OpenTelemetry, Prometheus, and Grafana on centralized infrastructure.
- Cross-Account AWS Observability: Deep expertise in centralizing AWS telemetry, including multi-account CloudTrail organization trails, cross-account CloudWatch metrics/logs, and VPC Flow Logs.
- Log Aggregation & Routing: Strong experience designing log aggregation strategies, implementing noise reduction/filtering at the collector level, and configuring Splunk HTTP Event Collector (HEC) integrations.
- Advanced Alerting & Dashboarding: Hands-on experience building comprehensive alerting frameworks using Alertmanager and CloudWatch Alarms, coupled with advanced dashboard engineering in Grafana (using PromQL).
- Infrastructure as Code (IaC): Advanced proficiency in writing Terraform modules specifically for deploying and managing observability stacks and EC2 infrastructure.
Requirements
- Enterprise Scale Log Management: Demonstrated experience managing, routing, and optimizing log pipelines at massive scale (TB/day).
- Kubernetes/Container Observability: Experience deploying Prometheus and OTel within Kubernetes (EKS) or containerized (ECS) environments.
- Cost Optimization: Proven track record of reducing observability spend through strategic metric dropping, log filtering, and efficient storage tiering.
Benefits
- Join one of the world’s fastest-growing AI-first digital engineering companies and make a real impact at scale.
- Lead and collaborate with a high-energy team of talented, driven individuals solving complex, meaningful challenges.
- Work with Fortune 500 companies and disruptive innovators in a research-driven environment with 60+ patents.
- Stay ahead of the curve by gaining hands-on experience with cutting-edge AI, ML, data, and cloud technologies while continuously upskilling.
- If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
Azure DevSecOps Engineer
Cherry Bekaert
Ranked among the largest public accounting firms in the United States, Cherry Bekaert provides digitally driven and industry-aligned solutions to elevate clients to market leaders
Role Description
We are seeking a highly skilled Azure DevSecOps Engineer to design, implement, and support secure, automated cloud infrastructure using Infrastructure as Code (IaC) principles. This role will be responsible for driving automation, embedding security into the software delivery lifecycle, and enabling scalable, compliant Azure environments. The ideal candidate combines deep expertise in Terraform, Azure DevOps, and CI/CD automation with strong knowledge of cloud security, governance, and operational support.
Key Responsibilities
Infrastructure as Code (IaC) – Terraform Focus
- Design, build, and maintain reusable Terraform modules for Azure infrastructure provisioning (networking, compute, identity, storage)
- Ensure all infrastructure is version-controlled, auditable, and deployed via automated pipelines
- Implement policy-as-code and security baselines within Terraform configurations
- Perform code reviews and enforce IaC standards across engineering teams
DevSecOps & CI/CD Automation
- Design and maintain secure CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools
- Integrate automated security scanning (SAST, DAST, IaC scanning) into deployment pipelines
- Build and support automated deployment orchestration (blue/green, canary, rollback strategies)
- Automate provisioning, configuration, and deployment workflows to reduce manual effort
Azure Cloud Engineering
- Architect, deploy, and manage secure Azure cloud environments
- Implement governance controls including RBAC, Azure Policy, and identity management
- Design scalable and resilient infrastructure aligned with business and security requirements
- Optimize cloud environments for performance, cost, and reliability
Security & Compliance
- Embed security controls and compliance checks into infrastructure and pipelines
- Conduct vulnerability assessments and remediate risks proactively
- Manage secrets, certificates, and keys using secure vault solutions (e.g., Azure Key Vault)
- Ensure adherence to regulatory and organizational security standards
Automation Support & Operational Excellence
- Provide automation and platform support for build, release, and infrastructure pipelines
- Troubleshoot CI/CD, IaC deployments, and cloud infrastructure issues
- Develop and maintain self-service automation tools for engineering teams
- Monitor systems, respond to incidents, and continuously improve reliability
Collaboration & Enablement
- Partner with Dev, Sec, and IT teams to integrate security into development workflows
- Provide guidance and best practices on DevSecOps and IaC adoption
- Support onboarding of applications into standardized DevSecOps pipelines
- Document processes, patterns, and reusable frameworks
Qualifications
- 5+ years experience in DevOps / DevSecOps / Cloud Engineering
- Strong hands-on experience with:
  - Terraform (required)
  - Azure (IaaS, PaaS, identity, networking)
  - CI/CD tools (Azure DevOps, GitHub Actions, Jenkins)
- Experience implementing Infrastructure as Code in enterprise environments
- Proficiency in scripting/automation (PowerShell, Bash, or Python)
- Experience with security integration in CI/CD pipelines
- Strong understanding of cloud security, IAM, and compliance frameworks
Preferred Qualifications
- Experience with:
  - Azure Kubernetes Service (AKS), containers, or microservices
  - Policy-as-code tools (OPA, Sentinel, Checkov)
  - Monitoring tools (Azure Monitor, Log Analytics, Prometheus)
- Certifications:
  - Microsoft Azure certifications (e.g., AZ-400, AZ-500)
  - HashiCorp Terraform Associate
- Experience in regulated environments a plus (SOX, SOC2, etc.)
Key Skills
- Infrastructure as Code (Terraform)
- Azure DevOps / CI-CD automation
- Cloud security & DevSecOps practices
- Scripting & automation
- Monitoring, troubleshooting, and incident response
- Cross-team collaboration and communication
What Success Looks Like
- Fully automated, secure Azure infrastructure deployments
- Reduced manual provisioning and faster release cycles
- Embedded security controls across pipelines and IaC
- Improved reliability and scalability of cloud platforms
- Strong adoption of DevSecOps and automation practices across teams
Benefits
- Competitive compensation packages based on performance
- Comprehensive, high-quality benefits program including:
  - Annual bonus
  - Medical, dental, and vision care
  - Disability and life insurance
  - Generous Paid Time Off
  - Retirement plans
  - Paid Care Leave
- Flexibility to do impactful work and enjoy life outside of work
- Opportunities to connect and learn from professionals from different backgrounds and cultures



