Job Closed

This listing is no longer active.

Benchmark

When It Matters®

Senior Software Engineer, Site Reliability

DevOps Engineer
Other
Remote
Senior
Team 10,001+
Since 1979

Location

United States

Posted

129 days ago

Seniority

Senior

Bachelor Degree
5 yrs exp
English
AWS, Grafana, Prometheus, Python, Terraform

Job Description

• Contribute to the design, development, and delivery of features that enhance system reliability and scalability.
• Define, measure, and improve SLIs, SLOs, and error budgets in collaboration with engineering teams.
• Participate in building a culture of reliability through knowledge sharing, documentation, and process improvements.
• Implement and improve observability tooling and practices to monitor the health and performance of production systems.
• Participate in incident management, including on-call rotations, root cause analysis, and postmortem reviews.
• Lead smaller initiatives or components of larger projects, ensuring technical quality and operational readiness.
• Collaborate with software engineering, security, and product teams to ensure resilient and secure system design.
• Mentor junior engineers, sharing expertise in SRE principles and AWS best practices.
• Contribute to automation efforts to reduce toil and improve efficiency of operational processes.

Job Requirements

5+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering with a focus on production operations.
Strong knowledge of AWS cloud services and cloud-native architectures.
Proficiency in scripting or programming languages (e.g., Python, Bash).
Experience with observability tools (e.g., CloudWatch, Datadog, Prometheus, Grafana).
Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation) and CI/CD pipelines.
Strong problem-solving skills and ability to work cross-functionally.
Some experience mentoring or coaching junior engineers.

Benefits

Health insurance
Retirement plans
Paid time off
Flexible work arrangements
Professional development

More DevOps Engineer Jobs

Senior DevOps Engineer, AWS

RecruityTalent

Connecting top IT and Executive talents with great companies in EMEA/LATAM through tailored recruitment solutions.

DevOps Engineer
129 days ago
Full Time
Remote
Team 1-10
Since 2024
H1B No Sponsor

• Lead operations for multi-tenant SaaS workloads on AWS, ensuring scalability, high availability, and cost efficiency
• Design, implement, and maintain reliable infrastructure for production, data, and AI/ML workloads
• Own incident response, postmortems, and operational runbooks to improve system reliability and reduce MTTR
• Manage and enhance CI/CD pipelines supporting both application and ML deployment workflows
• Build and maintain infrastructure automation using Infrastructure as Code (AWS CDK or Terraform)
• Enable self-service capabilities for engineering and data science teams
• Monitor and optimize cloud usage across compute, GPU, and storage resources, implementing cost controls and forecasting
• Support and automate ML pipelines, including training, testing, and deployment using AWS SageMaker, Kubeflow, or MLflow
• Manage GPU and compute clusters (EKS, ECS, EC2) for model training and inference workloads
• Develop and maintain monitoring, alerting, observability, and security best practices
• Collaborate closely with Engineering, Data, AI/ML, and PlatformOps teams to ensure smooth cross-team delivery

AWS, Amazon EC2, Grafana, Kubernetes, MongoDB, Python, Splunk, Terraform

Bulgaria

Job Closed

DevOps Engineer

Saalex Corporation

An Employee-Owned Company

DevOps Engineer
129 days ago
Other
Remote
Team 501-1,000
Since 1999
H1B No Sponsor

Spalding, a Saalex Company, is seeking a DevOps Engineer in Patuxent River, MD. Spalding, a Saalex Company, is a professional services company that has delivered cutting-edge solutions to the Department of Defense since 2001. Our expert-level solutions include software development, information technology, program management, financial management and business intelligence services. Spalding, a Saalex Company, offers competitive compensation, career development, flexible work schedules and excellent benefits.

Position Type: Full-Time
Salary: $115K-$135K (depending on experience)
Work Location: This is a remote position.

On-Site Requirements: On-boarding will require 1-2 visits to Patuxent River, MD for candidates who are local to the area. Candidates out of state will be onboarded virtually. Training will be virtual and telework maximized/permitted to the greatest extent possible; however, for local candidates, training/tasking may require on-site work a few hours per week. Future on-site/telework requirements/schedules may change as additional client direction is received.

Essential Functions:
• Develops DevOps functionality for CI/CD pipeline solutions.
• Improves and maintains GitLab pipeline configurations.
• Collaborates and assists software engineers with the design, configuration, implementation, and maintenance of CI/CD pipelines.
• Assists with GitLab upgrades as received from the vendor (e.g., bi-weekly or monthly; requires evening support).
• Onboards new applications/customers to the CI/CD environment.
• Provides recommendations for technology advancement to streamline CI/CD tools and processes.
• Provides technical assistance and troubleshooting to applications and systems deployed within a DevOps CI/CD pipeline.
• Identifies, troubleshoots, and resolves pipeline issues.
• Other duties as assigned or required.

Maryland

Job Closed

SRE – Clickhouse Team

PostHog

Product analytics, session replay, feature flags, A/B testing, data warehouse, CDP, surveys. PostHog does that.

DevOps Engineer
129 days ago
Other
Remote
Team 11-50
Since 2020
H1B No Sponsor

• Manage large fleets of EC2-based VMs, disks, and networking for data-intensive workloads
• Improving operational tooling around deploys, schema changes, backups, restores, and incident response
• Working closely with ClickHouse engineers to turn database-level needs into infra-level solutions
• Reducing operational load by identifying repeat pain points and eliminating them through code and self-healing automation
• Participating in on-call and incident response, with a strong focus on making incidents rarer over time
• You’ll have room to design and automate, not just respond to alerts.

Ansible, AWS, Amazon EC2, Linux, Terraform

United States

Job Closed

AI DevOps Engineer

Lingaro

DevOps Engineer
129 days ago
Full Time
Remote
Team 1,001-5,000
Since 2008
H1B No Sponsor

• Seeking a Lead AI DevOps Engineer to oversee design and delivery of advanced AI/ML/GenAI solutions.
• Combines cloud engineering and automation with hands-on leadership in deploying and integrating LLM/SLM models into enterprise applications.
• Leading architecture and deployment of AI/ML/GenAI solutions (LLM/SLM at scale).
• Driving automation of infrastructure, model lifecycle and inference pipelines.
• Overseeing CI/CD processes for AI/ML/GenAI workloads.
• Designing secure, scalable cloud infrastructures (Azure-focused).
• Acting as technical advisor for stakeholders and client-facing solution design.
• Mentoring engineers, promoting best practices, and fostering innovation in GenAI adoption.
• Coordinating cross-functional teams to align AI engineering with business outcomes.
• Ensuring cost optimization, monitoring and compliance across environments.

Ansible, Azure, Jenkins, Kubernetes, Linux, macOS, Python, Terraform

India
