Associate Director, Global DevOps, Observability Platform
Location
Florida | Virginia
Posted
5 days ago
Salary
$143K - $286K / year
Seniority
Senior
Job Description
Carrier
- Unify Global Innovation: Identify, celebrate, and integrate localized "pockets of innovation" from global teams into the core centralized platform
- Define and execute the multi-year strategic roadmap for the Global DevOps and Observability Platforms
- Act as a primary liaison and advocate across global business units to drive platform adoption
- Lead, mentor, and scale a global team of platform and reliability engineers
- Own the overarching architecture, administration, and compliance framework for GitHub Enterprise across the global footprint
- Oversee the development of a centralized, reusable GitHub Actions workflow
- Define and lead the architecture of the enterprise Observability Platform to provide full-stack visibility
- Collaborate closely with Cloud Platform teams to seamlessly integrate CI/CD and Observability platforms
- Partner with Global Security and Compliance organizations to bake automated governance directly into the platform infrastructure
Job Requirements
- Bachelor’s degree with 10+ years of experience in DevOps, Observability, Platform Engineering, or Software Delivery
- At least 3 years in a dedicated leadership or management capacity within a global, matrixed enterprise
- 5+ years of strategic or hands-on experience with GitHub Enterprise administration and GitHub Actions
- Deep familiarity with advanced CI/CD architectural patterns
- Proven experience defining strategy and scaling enterprise-grade Observability tools (e.g., LogicMonitor, Datadog, Dynatrace, New Relic, OpenTelemetry, Prometheus/Grafana, or Splunk) across hybrid and multi-cloud architectures
- Strong foundational knowledge of cloud-native architectures (Kubernetes, Serverless) and Infrastructure as Code (Terraform)
- Exceptional communication and presentation skills, with a proven ability to influence executive stakeholders and technical individual contributors alike
Benefits
- Medical, Dental, Vision
- Wellness incentives
- Retirement Benefits
- Paid vacation days, up to 15 days
- Paid sick days, up to 5 days
- Paid personal leave, up to 5 days
- Paid holidays, up to 13 days
- Birth and adoption leave
- Parental leave
- Family and medical leave
- Bereavement leave
- Jury duty leave
- Military leave
- Purchased vacation
- Short-term and long-term disability
- Life Insurance and Accidental Death and Dismemberment
- Health Savings Account
- Health Care Spending Account
- Dependent Care Spending Account
- Tuition Assistance
More DevOps Engineer Jobs
DevSecOps Engineer
TrueML
TrueML is a fintech company building software to create positive experiences for consumers seeking financial health.
- Security Automation & CI/CD Integration (Core Focus): Embed security controls and scanners (SAST, SCA, DAST, IaC, Container Security) into CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI, Azure DevOps).
- Design and maintain automated security workflows across build, test, and deploy stages.
- Implement security gates, policy enforcement, and compliance checks within pipelines.
- Cloud Security (AWS Focus): Secure cloud-native architectures across AWS (IAM, VPC, ECS/EKS, Lambda, S3, API Gateway).
- Integrate and operationalize CNAPP/CSPM tools (e.g., Wiz, Prisma Cloud).
- Enforce least privilege access, secrets management, and runtime protections.
- Own Cloud Security: Define and maintain security policies for our AWS environment, specifically focusing on containerized workloads (EKS/ECS) and serverless architectures (Lambda).
- Automate Compliance: Move beyond manual checks by building real-time monitoring and automated remediation for AWS resources, ensuring we stay 'audit-ready' for frameworks like PCI and ISO 27001.
- Lead Threat Modeling: Perform deep-dive threat modeling exercises on applications and designs, turning theoretical risks into actionable engineering plans.
- Innovate with AI: Develop security standards for Generative AI leveraging AI-powered tools to explore our attack surface.
- Guard the Infrastructure: Secure our Infrastructure as Code (IaC) templates (Terraform/CloudFormation) and manage cloud primitives like IAM, KMS, and WAF to ensure a 'least privilege' environment.
Site Reliability Engineer
Vynca
Committed to empowering individuals, their loved ones, and their care teams with solutions delivered in their homes.
- Design, provision, and manage AWS infrastructure using Terraform
- Operate, maintain, and scale production workloads running on Kubernetes
- Package, deploy, and manage applications using Helm and infrastructure automation tools
- Build, operate, and improve distributed and event-driven systems
- Define, monitor, and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets
- Develop automation for deployment, scaling, monitoring, incident response, and operational workflows
- Own platform observability by implementing and maintaining metrics, logging, tracing, monitoring, and alerting solutions
- Lead incident response efforts, facilitate blameless postmortems, and drive long-term corrective actions
- Partner with Product and Engineering teams on capacity planning, performance optimization, and resilient system design
- Implement and maintain security best practices to support HIPAA, SOC 2, and other compliance requirements
- Participate in an on-call rotation and provide operational support for production systems
Azure DevOps Engineer – Hub-Remote: DC or Philly Metro
Element 84
Accelerating and scaling impactful projects with great software and design. Geospatial, cloud, and petabyte-scale data.
- Collaborating with development teams on the design and implementation of robust, scalable, and secure cloud-native solutions on Azure and AWS.
- Developing and maintaining infrastructure-as-code to manage and provision infrastructure across numerous Azure and AWS accounts, ensuring consistency and efficiency.
- Maintaining and optimizing CI/CD automation pipelines to facilitate rapid and reliable software deployments.
- Collaborating with security experts to translate organizational security requirements into secure and compliant cloud implementations.
- Participating in all aspects of the software development lifecycle, from user story generation through design, development, automated testing, and operational support.
- Improving quality by actively participating in code reviews and adhering to team quality standards.
- Owning execution of small-to-medium-sized features with higher-level technical support.
- Describing the details of your work fluidly and accurately to technical peers.

- Scaling and maintaining our infrastructure and services using AI (Claude Code) as a first-class collaborator in your daily development workflow.
- Being opinionated on technical direction and strategy (and documenting those opinions for others to be able to follow).
- Leading and mentoring other engineers on the team.
- Owning and resolving the most complex infrastructure failures: Kubernetes scheduling edge cases, networking degradation, cross-service cascading failures, and AWS platform issues that other engineers escalate.
- Participating in a shared on-call rotation (roughly one week every six to eight weeks on call).
- Estimating schedules and breaking work down into reasonable 1-3 day tasks.
- Driving cloud cost efficiency by identifying over-provisioned resources, rightsizing EC2 and container workloads, and building tooling to surface cost anomalies before they compound.




