Build, reward, and retain your clinical workforce
Senior Manager, DevOps
Location
All locations: Arizona | California | Colorado | Florida | Illinois | New Jersey | New York | North Carolina | Ohio | Massachusetts | Michigan | Minnesota | Missouri | Pennsylvania | Texas | Wisconsin
Posted
2 days ago
Salary
$133.1K - $221.9K / year
Seniority
Senior
Job Description
Clinician Nexus
- Lead a team of DevOps engineers, including performance management, growth planning, and career development
- Own the DevOps team roadmap in partnership with the Director of Platform Engineering, including quarterly priorities and capacity planning
- Drive technical decisions and architecture reviews for CI/CD, infrastructure automation, and platform tooling
- Collaborate with engineering, data platform, data governance, and ITOps leadership on cross-functional initiatives and shared standards
- Coach engineers through code review, design feedback, and incident retrospectives
- Represent the team in executive forums, including roadmap reviews, FinOps reporting, and architecture councils
- Partner with the Director of Platform Engineering on AI tooling governance, including standardization on approved platforms, usage policy, and measuring engineering productivity impact
- Design, build, and maintain CI/CD pipelines using GitHub Actions, including reusable workflows, self-hosted runners, and security controls
- Implement and operate infrastructure as code using Terraform across multi-account AWS environments
- Manage Kubernetes (EKS) clusters, including ArgoCD-based GitOps delivery, ingress, observability, and security policies
- Operate secrets management with HashiCorp Vault, including dynamic credentials, JWT/OIDC auth, and External Secrets Operator integration
- Build and maintain observability tooling with Grafana, OpenTelemetry, and Kubernetes-native monitoring stacks
- Lead incident response and post-incident reviews, including authoring runbooks and reliability improvements
- Implement security controls, governance processes, and compliance validation across the platform
- Contribute to AWS network architecture, including PrivateLink, VPCs, and cross-account access patterns
Job Requirements
- 7+ years of hands-on infrastructure, DevOps, or platform engineering experience, with at least 2 years in a technical lead or team lead capacity
- Demonstrated experience as a working manager or tech lead who balances IC delivery with team leadership
- Deep experience with infrastructure as code using Terraform, including modules, state management, and multi-account patterns within AWS Organizations
- Production experience operating Kubernetes (EKS preferred) and GitOps delivery, ideally with ArgoCD or Flux
- Strong CI/CD experience with GitHub Actions, including security scanning, artifact management, and self-hosted runners
- Strong scripting experience with Python, Bash, or similar
- Working knowledge of HashiCorp Vault or comparable secrets management platforms
- Experience with cloud-based networking, including VPCs, private connectivity, DNS, and identity-based access
- Experience with observability tooling such as Grafana, Prometheus, OpenTelemetry, or equivalents
- BS in a relevant field or equivalent professional experience
Benefits
- Medical and dental coverage at no premium cost for employees
- 401(k) and profit-sharing retirement plans
- Flexible spending accounts
- Paid time off (PTO)
- Company-paid holidays
- Gender-neutral parental leave
- Bereavement and pet leave
- Continuing education and professional accreditation sponsorship
- Life and AD&D insurance
- Short- and long-term disability
- Employee assistance program
- Mental health support program




