Job Closed

This listing is no longer active.

CVS Health

Bringing our heart to every moment of your health.

Senior Site Reliability Engineer – Metrics and Observability

DevOps Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1963 | H1B No Sponsor

Location

All locations: Louisiana | Montana | North Carolina | Mississippi | Rhode Island

Posted

65 days ago

Salary

$83.4K - $203.9K / year

Seniority

Senior

Bachelor Degree | 5 yrs exp | Experience accepted | English | AWS | Azure | Docker | GCP | Grafana | Kubernetes | Prometheus

Job Description

  • Define, implement, and maintain key performance metrics, SLOs, and SLIs to measure system reliability and performance
  • Manage error budgets effectively, collaborating with development teams to balance reliability and feature delivery
  • Design and implement comprehensive monitoring solutions to provide real-time visibility into system health
  • Develop and implement automated quality gates that ensure all releases meet defined reliability and performance standards
  • Assist in incident response efforts by providing insights from metrics and monitoring tools
  • Drive initiatives to enhance monitoring, observability, and reliability practices

Job Requirements

  • 5+ years of experience in Site Reliability Engineering and/or DevOps
  • 3+ years of experience defining and implementing metrics, SLOs, and SLIs
  • 2+ years of experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
  • 2+ years of experience with cloud platforms (AWS, Azure, GCP) and container orchestration (Docker, Kubernetes)

Benefits

  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation, and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
  • Retiree medical access

More DevOps Engineer Jobs

Principal DevOps Engineer

Perforce Software

The DevOps Edge for the Outperformers: Enable teams to build, manage & maintain apps — from code to business-ready.

DevOps Engineer | 65 days ago
Full Time | Remote | Team 1,001-5,000 | Since 1995 | H1B Sponsor

  • Build platforms and frameworks to create consistent, verifiable, and automatic management of applications and infrastructure between non-production and production environments
  • Mentor exceptional engineers and DevOps developers on cloud technology and practice
  • Implement application security best practices throughout the agile SDLC
  • Foster and advocate for a DevOps culture at Perforce to ensure efficient testing, delivery, and deployment of all software artifacts
  • Lead the development and enhancement of our CI/CD pipeline infrastructure and tools
  • Establish technical design principles and practices and drive them across all product portfolios to make operation design a must-have phase of the development lifecycle
  • Develop a technical design for our cloud platform, develop a framework, and work with Dev and DevOps teams to ensure that the product meets our quality standards

Massachusetts
$120K - $150K / year
Job Closed

Senior Autonomy Release Engineer

May Mobility

Transforming cities through autonomous technology to create a safer, greener, more accessible world.

DevOps Engineer | 65 days ago
Full Time | Remote | Team 51-200 | Since 2017 | H1B Sponsor

  • Release ownership and release execution end-to-end across:
      • Major autonomy releases
      • Incremental/performance releases
      • Hotfix/safety patches
  • Manage branching strategy, versioning, and release cut processes
  • Drive release readiness and go/no-go decisions
  • Partner cross-functionally with Autonomy, Infra, Validation, and Fleet Ops
  • Act as a system owner for release readiness
  • Investigate and resolve complex issues arising from:
      • Software/hardware interactions
      • Distributed systems behavior
      • On-vehicle vs simulation discrepancies
  • Develop deep understanding of:
      • Sensor stack, middleware, autonomy stack
      • Compute platforms, networking, configurations
  • Be the go-to for:
      • “Why does this fail in the real world?”
      • “What changed between releases?”
  • Enforce stage-gated release framework:
      • Feature Complete → Code Freeze → Validation → Release Candidate
  • Integrate validation signals:
      • Simulation corpus results
      • Regression testing
      • Vehicle testing (HIL / on-road)
  • Ensure safety-critical issues are identified, tracked, and gated
  • Take initiative to find and permanently solve challenging system-level issues caused by the interplay between different software and hardware components
  • Collaborate on and lead system-wide improvements when working with other teams without having direct ownership or management responsibility
  • Assess and develop approaches that scale and improve performance in a variety of ways (e.g. CPU performance, memory usage, disk usage, network usage)

United States
$176K - $253K / year

DevOps – On-Call

Ikatec

Enabling organizations to maximize their results

DevOps Engineer | 65 days ago
Full Time | Remote | Team 51-200 | Since 2011 | H1B No Sponsor

  • Implements complex solutions
  • Mentors junior engineers
  • Manages technical risks
  • Demonstrates a clear understanding of business impact

Brazil
Job Closed

Cloud Operations Engineer

Loopback Analytics

Where Health Systems & Life Sciences Connect.

DevOps Engineer | 65 days ago
Full Time | Remote | Team 51-200 | Since 2009 | H1B No Sponsor

  • Monitor and analyze cloud spend across Azure, Databricks, and Snowflake (if applicable)
  • Build cost tracking and reporting dashboards using native tools (Azure Cost Management, Databricks Billing APIs & Dashboards, Power BI) and SQL/Python
  • Develop tagging and cost-allocation frameworks to support chargeback and forecasting models
  • Partner with Engineering and Infrastructure to optimize compute usage, storage tiers, and licensing
  • Recommend query tuning, partitioning, and caching strategies to improve performance and reduce cost-per-job on Databricks
  • Support monthly budget reviews and identify trends, anomalies, and optimization opportunities
  • Manage governance policies for resource provisioning and cost alerts
  • Partner with Information Security to ensure FinOps practices and cost controls align with SOC2 and HIPAA compliance requirements
  • Provide insights into FinOps maturity and recommend process improvements

Texas
Job Closed