Senior Site Reliability Engineer, SRE

DevOps Engineer | Full Time | Remote | Senior | Team 10,001+ | H1B Sponsor

Location

Brazil

Posted

4 days ago

Salary

0

Seniority

Senior

Bachelor Degree | Portuguese | AWS | Cloud | Docker | Kubernetes | Python

Job Description

Compass

  • Ensure the reliability, availability, scalability, and performance of production systems;
  • Define, monitor, and evolve SLIs, SLOs, SLAs, and Error Budgets;
  • Implement and enhance observability practices, including logs, metrics, tracing, and alerts;
  • Participate in response to critical incidents, conduct root cause analyses (RCA), and lead blameless post-mortems;
  • Automate operational processes to reduce manual work and increase efficiency;
  • Collaborate with Development, DevOps, and Architecture teams to prevent systemic failures;
  • Plan and validate strategies for high availability, scalability, capacity planning, and disaster recovery;
  • Support technical decisions through analysis of reliability, performance, and utilization metrics;
  • Contribute to the continuous evolution of a reliability culture and operational excellence.

Job Requirements

  • Bachelor's degree in Computer Science, Software Engineering, Information Systems, or a related field;
  • Proven experience in SRE, IT Operations, Cloud, or Software Engineering;
  • Experience with critical, distributed, and high-availability environments;
  • Experience with monitoring, incident management, and operational reliability;
  • Experience with large-scale AWS environments;
  • Advanced knowledge of Docker and Kubernetes;
  • Experience with observability, monitoring, and troubleshooting tools;
  • Automation skills using Python and Shell scripting;
  • Knowledge of resilience concepts, disaster recovery, capacity planning, and security;
  • Experience with Chaos Engineering;
  • Knowledge of OpenTelemetry and distributed observability.
