Senior Site Reliability Engineer, SRE
Location
Brazil
Posted
4 days ago
Salary
0
Seniority
Senior
Job Description
Senior Site Reliability Engineer, SRE
Compass
• Ensure the reliability, availability, scalability, and performance of production systems;
• Define, monitor, and evolve SLIs, SLOs, SLAs, and Error Budgets;
• Implement and enhance observability practices, including logs, metrics, tracing, and alerts;
• Participate in response to critical incidents, conduct root cause analyses (RCA), and lead blameless post-mortems;
• Automate operational processes to reduce manual work and increase efficiency;
• Collaborate with Development, DevOps, and Architecture teams to prevent systemic failures;
• Plan and validate strategies for high availability, scalability, capacity planning, and disaster recovery;
• Support technical decisions through analysis of reliability, performance, and utilization metrics;
• Contribute to the continuous evolution of a reliability culture and operational excellence.
Job Requirements
- Bachelor's degree in Computer Science, Software Engineering, Information Systems, or a related field;
- Proven experience in SRE, IT Operations, Cloud, or Software Engineering;
- Experience with critical, distributed, and high-availability environments;
- Experience with monitoring, incident management, and operational reliability;
- Experience with large-scale AWS environments;
- Advanced knowledge of Docker and Kubernetes;
- Experience with observability, monitoring, and troubleshooting tools;
- Automation skills using Python and Shell scripting;
- Knowledge of resilience concepts, disaster recovery, capacity planning, and security;
- Experience with Chaos Engineering;
- Knowledge of OpenTelemetry and distributed observability.
More DevOps Engineer Jobs
DevOps Engineer
Unissant
Unissant is a data-driven digital solutions provider dedicated to “Keeping Our Nation Healthy and Safe” by delivering innovative technology and analytics services to federal ag
Role Description
We are seeking a DevOps Lead to join our team and support our clients in the Washington DC-Baltimore area. The ideal candidate will be responsible for providing design recommendations based on long-term IT organization strategy and will be viewed both internally and externally as a technical expert and critical technical resource across multiple disciplines.
- Architect, implement, and maintain end-to-end CI/CD pipelines leveraging CMS-approved tools (e.g., GitHub, Jenkins) to support continuous integration, automated testing, and continuous delivery.
- Standardize pipeline design across all application components to ensure consistency, traceability, and repeatability.
- Implement automated build, test, security scan, and deployment stages aligned with CMS Target Life Cycle (TLC) requirements.
- Optimize pipeline performance to reduce build and deployment times while maintaining quality and compliance.
- Enable parallel development and multi-release support, ensuring seamless handling of concurrent releases.
- Lead the design and implementation of Infrastructure as Code (IaC) solutions (e.g., Terraform, AWS CloudFormation) to provision and manage AWS environments.
- Ensure all infrastructure is version-controlled, auditable, and reproducible.
- Establish automated environment provisioning processes for development, test, staging, and production environments.
- Implement configuration management and environment consistency controls across all environments.
- Support environment scaling, failover, and disaster recovery capabilities.
- Collaborate with the System Architect to modernize infrastructure using AWS-native and serverless technologies where appropriate.
- Ensure all deployments meet CMS security requirements (ARS, FISMA, HIPAA) prior to release.
- Establish real-time alerting for system health, performance degradation, and failures.
- Develop and maintain operational dashboards for system performance, availability, and pipeline metrics.
- Support incident response by providing diagnostic tools, logs, and root cause analysis capabilities.
Qualifications
- Minimum 10 years of experience in software engineering, DevOps, or cloud engineering roles.
- Minimum 5 years leading DevOps teams supporting enterprise-scale systems.
- Demonstrated experience designing and implementing CI/CD pipelines using tools such as Jenkins, GitHub Actions, or equivalent.
- Strong hands-on experience with AWS cloud services, including compute, storage, networking, and monitoring.
- Proven experience implementing Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or equivalent.
- Demonstrated experience implementing automated testing frameworks (BDD, TDD, regression automation).
- Experience integrating security practices into CI/CD pipelines (DevSecOps).
- Demonstrated experience supporting high-availability production systems with zero/low downtime deployments.
- Experience implementing monitoring, logging, and observability solutions.
- Experience supporting federal or regulated environments with compliance requirements.
- AWS DevOps Engineer certification or equivalent (required or strongly preferred).
- Enthusiastic, proactive, positive attitude and high integrity.
- Excellent organizational skills, strong attention to detail, and the ability to effectively manage architectures supporting multiple users.
- Strong desire to mentor and coach team members while providing oversight for all aspects of relevant programs.
- Ability to think and act strategically and to approach projects and issues proactively.
- Able to work under pressure and to be flexible with changing priorities.
- Able to find innovative ways to solve problems.
- A genuine interest in looking for opportunities to add value and grow your area of responsibility.
Education
- Bachelor's Degree in Computer Science, Information Systems, or a related field is required.
Certificates, Licenses and Registrations
- Relevant certifications or credentials are preferred.
Communication Skills
- Strong writing, listening and presentation skills.
- Solid ability to interface, inspire and motivate at various levels of the organization.
- Experience communicating effectively across internal and external organizations.
Travel
- This is a remote position.
Environmental Requirements
- Mainly sedentary; in an office environment.
- May be required to lift up to ten (10) pounds.
- Flexible in working extended hours.
Site Reliability Engineer – AWS
SitusAMC
We're helping our clients identify and capture opportunities across the entire lifecycle of their real estate activity.
• Support products transitioned from on-prem data center into AWS Cloud
• Implement cloud best practices for newly transitioned products
• Maintain operational coverage of environments
• Enhance automation capabilities and process improvement
• Collaborate with development teams for secure migration of changes
• Design and implement scalable and reliable solutions
• Improve observability of Applications running in Cloud
• Lead strategic initiatives for seamless application integration
GBS Enhancement Ownership Resolution Agent
UPS
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation.
Role Description
This position provides a one-to-one customer service experience for UPS customers. He/She analyzes and resolves problems, solves tracing and claim inquiries, and maintains ownership of customer situations through resolution which may include follow-ups and making outbound calls.
Responsibilities:
- Builds and maintains relationships with related internal functions (i.e., Billing, Brokerage, Delivery Information and Operations).
- Analyzes and resolves customer issues.
- EO Agent needs to have strong written and verbal communication skills in handling inquiries from internal and external customers.
- Responsible for recording communications with customers in a database.
- Needs to have flexibility and must be able to adapt to change with a positive attitude.
- Analyzes and reviews internal audit reviews and customer surveys.
Qualifications
- Technicians or above in administrative/customer service/finance careers
- Strong oral and written communication skills
- Strong problem-solving skills
- Previous customer service experience in UPS
- English 50% or above
Requirements
- Shift: Monday to Friday
- From 1:00 am to 1:00 pm including Colombian holidays
- Shift flexibility
- 100% Teleworker
- Grade 8
Benefits
- Permanent employee status
Role Description
In this role, you will act as a bridge between our platform engineers and the developer community, helping users understand the platform’s value, driving engagement, and fostering adoption through:
- Thought leadership
- Technical content
- Hands-on engagements when needed
- Community building
Qualifications
- Hands-on experience with CI/CD tools, DevSecOps practices, and cloud-native technologies.
- Familiarity with tools like GitLab CI, Jenkins, GitLab, GitHub and GitHub Actions, Kubernetes, GitOps and declarative operations, and similar platforms.
- Verifiable written and verbal communication skills.
- Comfortable presenting to both technical and non-technical audiences.
- Strong storytelling ability to articulate complex technical concepts in an engaging way.
- Experience working with developer communities or open-source projects is a plus.
- Proven track record of engaging with internal teams and communities of practice through social media, forums, and blogs.
- Prior experience speaking at conferences, meetups, or webinars is preferred.
- Ability to create engaging technical presentations and hands-on demos.
- MS or BS degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- 5+ years of experience in DevSecOps, software development, or related fields.



