Job Closed

This listing is no longer active.

Benchmark

When It Matters®

Senior Software Engineer, Site Reliability

DevOps Engineer | Other | Remote | Senior | Team 10,001+ | Since 1979

Location

United States

Posted

129 days ago

Salary

0

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | AWS | Grafana | Prometheus | Python | Terraform

Job Description

  • Contribute to the design, development, and delivery of features that enhance system reliability and scalability.
  • Define, measure, and improve SLIs, SLOs, and error budgets in collaboration with engineering teams.
  • Participate in building a culture of reliability through knowledge sharing, documentation, and process improvements.
  • Implement and improve observability tooling and practices to monitor the health and performance of production systems.
  • Participate in incident management, including on-call rotations, root cause analysis, and postmortem reviews.
  • Lead smaller initiatives or components of larger projects, ensuring technical quality and operational readiness.
  • Collaborate with software engineering, security, and product teams to ensure resilient and secure system design.
  • Mentor junior engineers, sharing expertise in SRE principles and AWS best practices.
  • Contribute to automation efforts to reduce toil and improve efficiency of operational processes.

Job Requirements

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering with a focus on production operations.
  • Strong knowledge of AWS cloud services and cloud-native architectures.
  • Proficiency in scripting or programming languages (e.g., Python, Bash).
  • Experience with observability tools (e.g., CloudWatch, Datadog, Prometheus, Grafana).
  • Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation) and CI/CD pipelines.
  • Strong problem-solving skills and ability to work cross-functionally.
  • Some experience mentoring or coaching junior engineers.

Benefits

  • Health insurance
  • Retirement plans
  • Paid time off
  • Flexible work arrangements
  • Professional development

More DevOps Engineer Jobs

Senior DevOps Engineer, AWS

RecruityTalent

Connecting top IT and Executive talents with great companies in EMEA/LATAM through tailored recruitment solutions.

DevOps Engineer | 129 days ago
Full Time | Remote | Team 1-10 | Since 2024 | H1B No Sponsor

  • Lead operations for multi-tenant SaaS workloads on AWS, ensuring scalability, high availability, and cost efficiency
  • Design, implement, and maintain reliable infrastructure for production, data, and AI/ML workloads
  • Own incident response, postmortems, and operational runbooks to improve system reliability and reduce MTTR
  • Manage and enhance CI/CD pipelines supporting both application and ML deployment workflows
  • Build and maintain infrastructure automation using Infrastructure as Code (AWS CDK or Terraform)
  • Enable self-service capabilities for engineering and data science teams
  • Monitor and optimize cloud usage across compute, GPU, and storage resources, implementing cost controls and forecasting
  • Support and automate ML pipelines, including training, testing, and deployment using AWS SageMaker, Kubeflow, or MLflow
  • Manage GPU and compute clusters (EKS, ECS, EC2) for model training and inference workloads
  • Develop and maintain monitoring, alerting, observability, and security best practices
  • Collaborate closely with Engineering, Data, AI/ML, and PlatformOps teams to ensure smooth cross-team delivery

Bulgaria
Job Closed

DevOps Engineer

Saalex Corporation

An Employee-Owned Company

DevOps Engineer | 129 days ago
Other | Remote | Team 501-1,000 | Since 1999 | H1B No Sponsor

Spalding, a Saalex Company, is seeking a DevOps Engineer in Patuxent River, MD.

Spalding, a Saalex Company, is a professional services company that has delivered cutting-edge solutions to the Department of Defense since 2001. Our expert-level solutions include software development, information technology, program management, financial management, and business intelligence services.

Spalding, a Saalex Company, offers competitive compensation, career development, flexible work schedules, and excellent benefits.

Position Type: Full-Time
Salary: $115K-$135K (depending on experience)
Work Location: This is a remote position.

On-Site Requirements: Onboarding will require 1-2 visits to Patuxent River, MD for candidates who are local to the area. Candidates out of state will be onboarded virtually. Training will be virtual and telework will be maximized/permitted to the greatest extent possible; however, for local candidates, training/tasking may require on-site work a few hours per week. Future on-site/telework requirements/schedules may change as additional client direction is received.

Essential Functions:

  • Develops DevOps functionality for CI/CD pipeline solutions.
  • Improves and maintains GitLab pipeline configurations.
  • Collaborates with and assists software engineers with the design, configuration, implementation, and maintenance of CI/CD pipelines.
  • Assists with GitLab upgrades as received from the vendor (e.g., bi-weekly, monthly; requires evening support).
  • Onboards new applications/customers to the CI/CD environment.
  • Provides recommendations for technology advancement to streamline CI/CD tools and processes.
  • Provides technical assistance and troubleshooting to applications and systems deployed within a DevOps CI/CD pipeline.
  • Identifies, troubleshoots, and resolves pipeline issues.
  • Other duties as assigned or required.

Maryland
Job Closed

SRE – Clickhouse Team

PostHog

Product analytics, session replay, feature flags, A/B testing, data warehouse, CDP, surveys. PostHog does that.

DevOps Engineer | 129 days ago
Other | Remote | Team 11-50 | Since 2020 | H1B No Sponsor

  • Manage large fleets of EC2-based VMs, disks, and networking for data-intensive workloads
  • Improve operational tooling around deploys, schema changes, backups, restores, and incident response
  • Work closely with ClickHouse engineers to turn database-level needs into infra-level solutions
  • Reduce operational load by identifying repeat pain points and eliminating them through code and self-healing automation
  • Participate in on-call and incident response, with a strong focus on making incidents rarer over time

You’ll have room to design and automate, not just respond to alerts.

United States
Job Closed
Full Time | Remote | Team 1,001-5,000 | Since 2008 | H1B No Sponsor

  • Seeking a Lead AI DevOps Engineer to oversee design and delivery of advanced AI/ML/GenAI solutions.
  • Combines cloud engineering and automation with hands-on leadership in deploying and integrating LLM/SLM models into enterprise applications.
  • Leading architecture and deployment of AI/ML/GenAI solutions (LLM/SLM at scale).
  • Driving automation of infrastructure, model lifecycle and inference pipelines.
  • Overseeing CI/CD processes for AI/ML/GenAI workloads.
  • Designing secure, scalable cloud infrastructures (Azure-focused).
  • Acting as technical advisor for stakeholders and client-facing solution design.
  • Mentoring engineers, promoting best practices, and fostering innovation in GenAI adoption.
  • Coordinating cross-functional teams to align AI engineering with business outcomes.
  • Ensuring cost optimization, monitoring and compliance across environments.

India