Job Closed

This listing is no longer active.

Natera

Founded in 2004 and led by CEO Steve Chapman, Natera is a company in the biotechnology market that offers genetic testing and diagnostics on a global scale. Ope

Senior DevOps/SRE Engineer

Location

United States

Posted

130 days ago

Salary

$140.2K - $175.2K / year

Seniority

Senior

Job Description

  • Own execution of the entire Laboratory Operations Software release process, ensuring smooth and timely software releases with minimal downtime.
  • Continuously monitor the effectiveness of the release process and implement improvements to increase efficiency, reduce errors, and enhance overall quality.
  • Act as an internal consultant and subject matter expert, coaching individual product teams on best-in-class DevOps practices, including infrastructure-as-code (IaC), monitoring, logging, and security integration.
  • Embed with development teams to assess and improve DevOps maturity, delivery practices, and operational readiness.
  • Design and implement a variety of projects to support extreme growth in application complexity and to enable innovation.
  • Provide hands-on guidance in CI/CD, cloud infrastructure usage, Kubernetes operations, and observability.
  • Help teams adopt existing infrastructure, platforms, and tooling provided by central Cloud / Platform teams.
  • Promote and reinforce technical standards, guardrails, and best practices that allow teams to operate autonomously while remaining compliant and secure.
  • Guide teams in applying organizational expectations around reliability, security, and cost management through automation rather than manual controls.
  • Serve as a feedback channel to central platform and cloud teams, sharing adoption challenges and improvement opportunities.
  • Continuously improve and automate infrastructure provisioning, configuration management, application deployment, and testing using tools like Terraform, Kubernetes, and CI/CD.
  • Advocate for automation-first approaches to reduce operational toil and risk.
  • Partner with teams to define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and operational dashboards for their services.
  • Guide teams through incident response, post-incident reviews, and reliability improvements.
  • Identify systemic reliability issues and escalate platform-level concerns to the appropriate owning teams.
  • Drive capacity planning and performance tuning activities to ensure scalability and efficiency.
  • Provide expert-level support for complex infrastructure and deployment issues escalated by the product teams.
  • Assist teams in root cause analysis and long-term remediation.
  • Create and maintain clear, comprehensive documentation and runbooks covering the release process, CI/CD pipelines, and regression testing procedures.
  • Share best practices and lessons learned across teams to raise overall DevOps maturity.

Job Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • 7+ years of professional software engineering experience building production-grade systems with emphasis on automation, integrations and infrastructure tooling.
  • Excellent problem-solving skills with the ability to troubleshoot complex issues in a fast-paced environment.
  • Excellent communication, coaching, and collaboration skills, with the ability to work effectively across teams and convey technical concepts to non-technical stakeholders.
  • Deep understanding of Site Reliability Engineering (SRE) principles, including SLIs, SLOs, error budgets, and toil reduction.
  • Expertise in setting up and managing comprehensive monitoring, logging, and alerting systems.
  • Proven experience with incident response and leading post-incident review (post-mortem) processes.
  • Experience with capacity planning, performance analysis, and optimization of distributed systems.
  • Strong expertise in CI/CD tools (e.g., Jenkins, GitLab CI).
  • Practical experience building complex CI/CD pipelines.
  • Proficiency in at least one programming language (e.g., Java, Python).
  • Strong command of AWS stack.
  • Proficiency in Docker, Kubernetes and Helm.
  • Experience working with databases (SQL, MySQL, PostgreSQL).
  • Experience with version control systems (e.g., Git).
  • Experience working with Terraform.

Benefits

  • Comprehensive medical, dental, vision, life and disability plans for eligible employees and their dependents.
  • Free testing for employees and their immediate families.
  • Fertility care benefits.
  • Pregnancy and baby bonding leave.
  • 401k benefits.
  • Commuter benefits.
  • Generous employee referral program.

More DevOps Engineer Jobs

DevOps Engineer | 130 days ago
Other | Remote | Team 1,001-5,000 | Since 1961 | H1B Sponsor

  • Work with technical lead to ensure solutions are aligned with the ATIS enterprise architecture.
  • Design, build, and maintain CI/CD pipelines using technical resources that integrate secure code scanning.
  • Implement DevSecOps best practices to enable continuous delivery with Army and program-specific security controls.
  • Automate infrastructure provisioning and configuration using Infrastructure as Code (IaC) tools.
  • Integrate and manage security tools within the CI/CD pipeline.
  • Collaborate with cross-functional teams to align DevSecOps capabilities with Agile delivery.
  • Monitor pipeline and environment performance, perform troubleshooting, and resolve integration and deployment issues.
  • Enforce compliance with DoW Risk Management Framework (RMF), NIST SP 800-53, and STIG requirements.

United States
$120K - $180K / year
Job Closed

DevOps Manager

Bitcoin Depot

Bringing Bitcoin to the Masses

DevOps Engineer | 130 days ago
Other | Remote | Team 51-200 | Since 2016 | H1B No Sponsor

  • Lead the DevOps team and manage Bitcoin Depot's cloud infrastructure.
  • Migrate applications/services from cloud providers to AWS.
  • Work alongside the software engineering team to develop CI/CD pipelines.
  • Develop Terraform scripts for deploying applications/services on AWS infrastructure.
  • Monitor and assist in resolving system issues as they arise.
  • Ensure high availability, performance, and cost efficiency of cloud infrastructure.
  • Collaborate with Software Engineering, IT Operations, and Security teams.

United States
Job Closed

Senior Site Reliability Engineer, Observability

ScienceLogic

We are a leader in AIOps providing modern IT operations with actionable insights to predict and resolve problems faster.

DevOps Engineer | 130 days ago
Other | Remote | Team 501-1,000 | Since 2010 | H1B Sponsor

  • Be a key contributor on an Agile development team, collaboratively realizing business value through an iterative software development lifecycle.
  • Build and execute the monitoring strategy for ScienceLogic SaaS infrastructure.
  • Define, deploy, and maintain system and service monitors.
  • Be the authority for various monitoring technologies like Prometheus, AWS CloudWatch, Scylla Manager, and New Relic to provide next-generation monitoring solutions for ScienceLogic SaaS.
  • Employ advanced monitoring practices and technologies to detect and automatically resolve platform issues before they impact the customer's experience.
  • Participate in architecture and operations reviews.
  • Identify and automate measurement of operations SLAs and SLOs using SLIs.
  • Triage incident response, document SOPs and runbooks, and train NOC team members.
  • Participate in a shared on-call manager rotation for escalations during incidents and outages, occasionally during off hours.
  • Provide dashboarding and analytics solutions to internal teams based on requirements.

Virginia
Job Closed

SRE Analyst, Senior

Pottencial Seguradora S.A

We are the largest insurtech in Brazil and leaders in the surety insurance (Seguro Garantia) market!

DevOps Engineer | 130 days ago
Full Time | Remote | Team 501-1,000 | Since 2010 | H1B No Sponsor

  • Collaborate with development, infrastructure, and security teams to design, build, and maintain reliable and scalable systems.
  • Participate in planning and executing load, chaos, and failover tests, focusing on risk mitigation and identification of bottlenecks.
  • Develop and maintain automation tools for monitoring, deployment, rollback, and incident response.
  • Monitor and respond to critical incidents, conducting root cause analysis (RCA) and proposing preventive actions.
  • Support the evolution of CI/CD processes, infrastructure as code (IaC), and security.
  • Lead automation, observability, and performance initiatives for critical systems.
  • Design, implement, and evolve monitoring, metrics, distributed tracing, and logging solutions.
  • Conduct incident reviews (postmortems) with root cause analysis and structured action plans.
  • Identify and apply continuous improvements to SLOs, SLIs, and SLAs.
  • Act as a focal point for failure mitigation, recovery, and continuity plans.
  • Drive a culture of reliability and resilience across the organization.
  • Mentor junior and mid-level professionals, promoting technical training and best practices.

Brazil
Job Closed