Job Closed
This listing is no longer active.
The End-to-End Platform for Risk Adjustment, Quality Improvement, and Member Management
Director of DevOps
Location
United States
Posted
86 days ago
Salary
$180K - $220K / year
Seniority
Lead
Job Description
Director of DevOps
Reveleer
• Lead, mentor, and grow a team of DevOps engineers, setting clear goals and fostering a collaborative, high-performing culture
• Define and evolve DevOps and platform engineering strategy across multiple product lines
• Establish clear ownership boundaries between DevOps, IT, and Security to reduce friction and improve delivery focus
• Build and scale the DevOps operating model to support a growing engineering organization
• Own and evolve our cloud infrastructure, ensuring high availability, security, scalability, and cost efficiency
• Define standards for infrastructure-as-code and environment management across teams
• Drive consistency in cloud architecture and deployment patterns across products
• Own and standardize CI/CD practices across engineering teams to improve delivery speed and predictability
• Reduce deployment friction and improve developer productivity through automation and tooling
• Drive adoption of best practices across build, test, release, and deployment workflows
• Own our observability strategy using New Relic, ensuring meaningful alerting, dashboards, and SLOs/SLAs are in place
• Lead incident response processes, post-mortems, and reliability improvements
• Drive proactive performance tuning and capacity planning
• Partner with Security to ensure infrastructure and deployment practices meet compliance and security standards
• Implement and enforce controls across environments aligned with company and regulatory requirements
• Own and improve engineering delivery metrics such as deployment frequency, lead time for changes, change failure rate, and mean time to recovery
• Drive measurable improvements in system reliability, availability, and operational efficiency
• Partner with Finance and Engineering to improve cloud cost visibility and optimization
• Work closely with Product, Data, and Engineering leadership to align platform capabilities with business priorities
• Support customer-facing reliability and SLA commitments in partnership with Support and LiveOps teams
Job Requirements
- 8+ years of experience in DevOps, infrastructure, or site reliability engineering, with 3+ years in a leadership role
- Proven experience leading DevOps or platform teams in a cloud-first environment while remaining hands-on
- Current experience supporting engineering teams implementing scalable cloud architectures using CaC, IaC, and Kubernetes.
- Strong proficiency and hands-on experience with AWS (EC2, ECS/EKS, RDS, IAM, VPC, S3, and related services)
- Demonstrated experience guiding and leading other DevOps, cloud, and software engineers, leveling up technical proficiency and overall cloud capabilities.
- Deep knowledge of Kubernetes and container orchestration, including experience running production EKS clusters
- Solid understanding of networking and cloud networking concepts, including VPCs, subnets, routing, security groups, load balancers, DNS, and VPN/Direct Connect connectivity
- Experience managing CI/CD pipelines, preferably with Bitbucket Pipelines or similar tools (GitHub Actions, GitLab CI)
- Proficiency with infrastructure-as-code tools such as Terraform or AWS CloudFormation
- Extensive experience leveraging observability tooling for logging, monitoring, alerting, incident management, and escalation (New Relic/Datadog/etc.)
- Familiarity with web application architecture concepts, such as databases, message queues, serverless and event-driven architectures, and caching mechanisms. Ability to work with Engineering teams to identify and resolve performance constraints
- Hands-on software development background — prior experience writing and shipping production code in any language, enabling effective collaboration with engineering teams and informed decision-making on tooling, pipelines, and developer workflows.
Benefits
- Medical, Dental and Vision benefits
- 401k match
- Generous PTO plan
More DevOps Engineer Jobs
Staff Site Reliability Engineer
Fabric
The national pay range for this role is $165,000.00 - $210,000.00 per year. Actual compensation will be determined by factors such as the candidate's geographic market, experience, skills, and qualifications. Certain roles may also be eligible for additional compensation. If your compensation requirement is greater than our posted range, please still consider applying; a determination can be made based on unique qualifications. Expected compensation ranges for this role may change over time.
About the Role
As a Staff Site Reliability Engineer, you will own and evolve the infrastructure powering healthcare experiences for millions of patients. This role bridges the gap between traditional infrastructure excellence and the future of AI-driven operations. You will act as a primary architect for our AWS and Kubernetes (EKS) environment, ensuring the platform is resilient, scalable, and compliant while exploring how agentic workflows can modernize SRE practices.
What You'll Do
As a Staff Site Reliability Engineer, you will be a steward of Fabric’s production integrity, leading the strategy for infrastructure automation, observability, and system resilience. Your primary responsibilities include:
- Infrastructure & Kubernetes Orchestration
  - Designing, deploying, and maintaining production Kubernetes (EKS) clusters to ensure enterprise-grade availability for our users.
  - Eliminating manual configuration by building and managing a scalable infrastructure state entirely through Terraform.
  - Optimizing the AWS footprint (specifically EC2, RDS, and S3) to balance high performance with cost-efficiency and reliability.
- AI-Assisted Operations & Automation
  - Exploring and deploying agentic workflows for AI-assisted runbooks that automate complex operational decisions and repetitive tasks.
  - Building and evolving deployment pipelines using GitHub Actions or Semaphore to ensure delivery is both rapid and safe.
  - Focusing on toil reduction by developing internal tools that replace manual operational work with intelligent, autonomous systems.
- Observability & Incident Management
  - Driving the evolution of the observability stack in Datadog by implementing the sophisticated metrics, traces, and logs needed to meet SLOs.
  - Leading incident response efforts and facilitating the blameless postmortems that help systematically reduce recovery time (MTTR).
  - Defining and monitoring the SLIs and SLOs that ensure the platform consistently meets rigorous healthcare performance standards.
- Compliance & Collaboration
  - Ensuring every piece of infrastructure remains fully compliant with HIPAA and other critical healthcare regulatory requirements.
  - Mentoring engineers across the company on reliability best practices and contributing a clinical-safety perspective to cross-functional design reviews.
Why You Might Be a Good Fit
- You are a deeply proficient engineer who excels at the intersection of cloud infrastructure, automation, and system design.
- You possess a meticulous approach to observability and a passion for finding the "root cause" rather than just applying a patch.
- You enjoy exploring the "next frontier" of SRE, including how AI and agentic tools can make operations more efficient.
- You thrive in fast-paced environments where technical rigor is balanced with pragmatism and clinical-grade safety.
This Might Not Be The Right Fit If...
- You prefer working on static infrastructure rather than evolving systems through code and automation.
- You are uncomfortable with the "agile" pace of tech-driven platform development or integrating AI tools into your daily workflow.
- You prefer a siloed role that does not involve active participation in incident response or collaborative postmortems.
Your Qualifications
- 8+ years of experience in SRE, DevOps, or Platform roles managing production environments at scale.
- Expert technical depth in AWS (EKS, EC2, RDS, S3) and production-grade Kubernetes management.
- Proficiency with modern tooling including Terraform (IaC), Datadog (Observability), and CI/CD systems.
- Deeply proficient coding and scripting skills in Python, Bash, Ruby, or Go.
- Preferred experience building agentic workflows or AI-assisted tooling to drive operational efficiency.
- A "rigor-first" mindset with a dedication to HIPAA-compliant, high-availability architecture.
The national pay range for this role is $140,000.00 – $170,000.00 per year. Actual compensation will be determined by factors such as the candidate's geographic market, experience, skills, and qualifications. Certain roles may also be eligible for additional compensation, including a comprehensive benefits package such as medical, dental, vision, unlimited PTO, and a 401(k) plan, stock options and bonuses. If your compensation requirement is greater than our posted range, please still consider applying; a determination can be made based on unique qualifications. Expected compensation ranges for this role may change over time.
Senior DevSecOps Engineer
SkySafe
Securing the airspace, protecting the public, enabling the commercial drone industry.
• Design, implement, and maintain scalable and secure cloud infrastructure supporting SkySafe’s SaaS platform
• Build and maintain infrastructure automation using Infrastructure-as-Code tools such as Terraform or similar
• Develop and maintain CI/CD pipelines to enable safe, reliable, and repeatable deployments
• Ensure infrastructure and operational practices align with SOC2 and government security requirements
• Implement and maintain monitoring, logging, and alerting systems for production infrastructure
• Improve system reliability, observability, and performance across the platform
• Collaborate with engineering teams to design infrastructure that supports new services and features
• Manage secrets, identity, and access controls using industry best practices
• Help define operational standards and DevOps processes across the engineering organization
• Support incident response and root-cause analysis for production issues
Senior DevOps Engineer
Healthie
Healthie is the world’s leading API-first, ONC-Certified EHR for healthcare delivery outside of the hospital. We provide the powerful infrastructure every scaling organization needs (EHR, scheduling, patient engagement, billing, and more), all accessible via modern APIs and a white-labeled UI. Today, over 1 billion API calls are made to Healthie every month, as thousands of organizations, working with more than 13 million patients in total, rely on Healthie to deliver care across a spectrum of specialties, from preventative health and wellness to complex chronic care management. Healthie is backed by leading investors, and while we've raised $42M to date, more importantly, we operate with fiscal responsibility and have been profitable for more than half of our time as a company.
Our Mission
We’re building infrastructure for modern healthcare delivery. Traditional healthcare is plagued with outdated, monolithic EHRs designed to maximize billing outcomes. Patient outcomes and provider experiences have been afterthoughts, as these systems have bolted on non-API-first solutions. None of this is built for how clinically excellent healthcare is actually delivered: longitudinally and collaboratively, with the patient at the center.
Healthie is the world’s leading API-first, ONC-Certified EHR for healthcare delivery outside of the hospital. We provide the powerful infrastructure every scaling organization needs (EHR, scheduling, patient engagement, billing, and more), all accessible via modern APIs and a white-labeled UI. Our platform makes it simple for organizations of any size to launch, customize, and scale their care delivery models without reinventing the wheel. Today, over 1 billion API calls are made to Healthie every month, as thousands of organizations, working with more than 13 million patients in total, rely on Healthie to deliver care across a spectrum of specialties, from preventative health and wellness to complex chronic care management.
We believe in the power of technology to improve access to healthcare, and we’re building the rails that make this a reality. We work fast and with quality because we provide business-critical, healthcare-critical software that clinicians and patients need for a better healthcare system. We’re customer-obsessed, operate with lightning-fast processes and responses, make our product roadmap public so customers can see what we’re building, and remain relentlessly focused on how care gets delivered.
Healthie is backed by leading investors, and while we've raised $42M to date, more importantly, we operate with fiscal responsibility and have been profitable for more than half of our time as a company.
Learn more at https://www.gethealthie.com/
About the role
We are hiring a DevOps engineer to join our Platform Engineering team at Healthie! In this role, you’ll partner closely with platform, infrastructure, and core engineering teams to improve the reliability of our CI/CD pipeline, implement developer tooling that helps the rest of the engineering team move faster, and make changes to our product to make it easier to manage. This is a hands-on role, ideal for someone who is excited to improve the developer efficiency and quality of life in a fast-moving startup environment and help shape the future of security at Healthie. You should be able to design, scope, and implement tooling independently. If you're passionate about building impactful systems, driving innovation, and making a difference in healthcare, we’d love to hear from you.
What You'll Do
- Automate infrastructure via tools such as Terraform.
- Administer our software platform using tools like CircleCI, PostgreSQL, Depot, Shipyard, and GitHub Actions.
- Work with engineers on the product, customer engineering, data, and platform teams to develop solutions to SDLC slowdowns and inefficiencies.
- Develop solutions that improve quality of life and reliability for the engineering organization as a whole.
- Measure performance and make improvements to our ecosystem, evaluate tooling, and propose and implement new tools when needed.
Details, details
- This is a full-time, remote position.
- U.S. work authorization is required.
- The salary range is $180,000 - $200,000 plus equity, annual bonus, & benefits.
About you
- 5+ years of experience in a DevOps/Infrastructure engineering environment
- Familiarity with and experience administering continuous integration and deployment pipelines.
- You have a desire to problem solve and drive results.
- You have a working knowledge of and experience with containerization tools such as Docker.
- You have excellent communication skills, and enjoy working with other teams directly.
- The ideal candidate will have experience with a RoR ecosystem.
- Bonus if you have experience with observability platforms such as Prometheus/Grafana.
Interview Process
- Quick chat with someone from our Talent team (15 minutes)
- Interview with Chris, Director of Platform Engineering (30 minutes)
- Pair Coding interview (1hr)
- Talk with folks from the platform team (30 minutes)
- Interview with Cavan, CTO + cofounder (20 minutes)
- Reference checks
Healthie participates in E-Verify.
About Clear Labs
Clear Labs (CL) harnesses the power of next-generation sequencing (NGS) to simplify complex diagnostics for clinical and applied markets. By creating a fully automated platform that brings together DNA sequencing, robotics, and cloud-based analytics, Clear Labs democratizes genomics applications to deliver better clarity. Clear Labs’ turnkey platform accelerates outcomes and improves accuracy from food-borne pathogens to infectious diseases.
Position Summary
We are a fast-moving, lean engineering team building complex instrument software that bridges physical lab hardware with the cloud. We are looking for a proactive DevOps Engineer to take ownership of our infrastructure. You won't be starting from scratch, nor will you be left alone. You will be taking the reins from our outgoing Senior Architect (who will remain available in an advisory capacity) and will work closely with our core development team to modernize our CI/CD pipelines, secure our GCP environments, and prepare our infrastructure for SOC2 compliance. If you are a hungry mid-level US engineer looking for the autonomy to own a hybrid edge-to-cloud stack, or an experienced nearshore engineer looking for a direct integration with a US-based hardware/software team, this is your launchpad.
Reports to: Senior Vice President of Engineering
Location: Remote (US / PST Time zone) OR On-Site (San Carlos, CA). Onsite presence is required 3 days per week for employees within a reasonable commuting distance, with additional days onsite possible as business needs require.
Primary Responsibilities
- Cloud Infrastructure: Manage, monitor, and scale our cloud environment using Infrastructure-as-Code (Terraform) across GCP and GKE (Kubernetes).
- CI/CD & Developer Velocity: Maintain and optimize our Jenkins build and deployment pipelines to help our developers ship code faster and more reliably. Build and maintain Docker images and manage containerized applications.
- Security & Compliance: Lead the hardening of our clusters, manage secrets, and implement the technical controls necessary for our upcoming SOC2/ISO 27001 audits.
- Database & Messaging Operations: Ensure the stability, backup automation, and scaling of our MySQL databases, BigQuery datasets, and message brokers (RabbitMQ, Cloud PubSub).
- High-Availability Support: Act as the primary point of contact for infrastructure stability, ensuring continuous operational overlap during core PST working hours.
- Release Management: Handle deployment and version control of multiple systems and releases within our blue-green environment.
Note that job duties and responsibilities may evolve based on company needs and technological advancements.
Travel: Travel to company headquarters may be requested occasionally, typically 2–4 times per year.
Physical Requirements
Able to sit or stand at a computer for extended periods and use monitors and related hardware comfortably.



