Job Closed

This listing is no longer active.

SRE/DevOps Professional, AWS

Location

Brazil

Posted

54 days ago

Salary

0

Seniority

Senior

Job Description

Internal

  • Design and build complex, mission-critical cloud architectures, ensuring security and cost optimization.
  • Develop and implement custom monitoring and observability solutions, including creating dashboards in DataDog.
  • Provide training and mentoring, and promote SRE best practices to raise the maturity of the technical community.
  • Participate in discussions and define standards for Infrastructure as Code (IaC), continuous integration, and automation via GitOps.
  • Collaborate on the continuous improvement of the development platform, offering technical feedback and suggestions for enhancement.
  • Serve as a technical reference, supporting squads in resolving complex cloud infrastructure and observability issues.

Job Requirements

  • Strong expertise in AWS architectures (multi-account setups, isolated VPCs, security, and cost controls).
  • Advanced experience with ECS, EKS, and other cloud compute services.
  • Infrastructure automation using GitOps practices and tools such as Terraform.
  • Development of advanced monitoring and observability solutions (DataDog, structured logging, distributed tracing).
  • Implementation of disaster recovery and high-availability strategies in the cloud, including custom instrumentation for application and infrastructure observability.

Benefits

  • We value the continuous growth of Zuppers, encouraging each individual to pursue paths that drive their professional development.
