Job Closed

This listing is no longer active.

Red Hat

The leading provider of enterprise open source solutions.

Senior Site Reliability Engineer, Azure Red Hat OpenShift

DevOps Engineer | Other | Remote | Senior | Team 10,001+ | Since 1993 | H1B Sponsor

Location

California | Oregon

Posted

134 days ago

Salary

$139.6K - $230.2K / year

Seniority

Senior

Job Description

  • Contribute code to increase the scalability and reliability of the service
  • Contribute software tests and participate in peer review to increase the quality of our codebase
  • Help develop peers’ capabilities through knowledge sharing, mentoring, and collaboration
  • Participate in a regular on-call schedule, including occasional paid weekends and holidays
  • Practice sustainable incident response and blameless postmortems
  • Resolve customer issues escalated from the Red Hat Global Support team
  • Work within a small agile team to develop and improve SRE software, support your peers, plan, and self-improve
  • Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling

Job Requirements

  • Bachelor's degree in Computer Science or related technical field, or equivalent experience
  • 5+ years of experience managing Linux servers running Red Hat Enterprise Linux (RHEL), CentOS, or Fedora hosted at a cloud provider such as Amazon Web Services (AWS), Google Compute Engine (GCE), or Microsoft Azure
  • 3+ years of experience with enterprise systems monitoring; knowledge of Prometheus is a plus
  • 3+ years of experience with enterprise configuration management software like Ansible by Red Hat, Puppet, or Chef
  • 2+ years of experience programming with at least one object-oriented language; Golang, Java or Python preferred
  • 2+ years of experience delivering a hosted service
  • Demonstrated ability to quickly and accurately troubleshoot system issues
  • Solid understanding of standard TCP/IP networking and common protocols like DNS and HTTP
  • Solid communication skills and experience working directly with and presenting to customers

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
