Oowlish

We make innovation simple, convenient and right...we just make it HAPPEN

Senior Site Reliability Engineer, SRE

DevOps Engineer | Full Time | Remote | Senior | Team 51-200 | Since 2017 | H1B No Sponsor

Location

Brazil

Posted

4 days ago

Salary

Not disclosed

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | Cloud | Python | TypeScript | Go

Job Description

  • Own the reliability, availability, and operational excellence of business-critical production systems.
  • Define how reliability is measured.
  • Lead incident response during production outages.
  • Drive observability strategy.
  • Continuously improve operational practices across high-availability environments.
  • Manage SLOs and lead major incidents.

Job Requirements

  • 5+ years of experience in Site Reliability Engineering, Production Engineering, Reliability Engineering, or similar roles.
  • Proven experience operating production systems in high-availability environments.
  • Hands-on experience defining and managing SLOs, SLIs, and Error Budgets.
  • Experience leading production incident response and Incident Command.
  • Strong observability and monitoring experience.
  • Strong software engineering skills using Python, Go, or TypeScript.
  • Experience working with cloud platforms.
  • Strong written and verbal English communication skills.

Benefits

  • Home office;
  • Competitive compensation based on experience;
  • Career plans to allow for extensive growth in the company;
  • International Projects;
  • Oowlish English Program (Technical and Conversational);
  • Oowlish Fitness with Total Pass;
  • Games and Competitions.

More DevOps Engineer Jobs

Full Time | Remote | Team 501-1,000 | H1B Sponsor

  • Implement and support small to mid-sized Google Workspace deployments and migrations.
  • Assist in the design, implementation, and support of Google Workspace Enterprise deployments and migrations.
  • Participate in technical discussions with customer executives that drive decisions and implementation.
  • Manage cloud networking services and connect to client networks.
  • Execute tasks related to SaaS configuration, development, and integrations.
  • Configure and manage Cloud Platform services and APIs.
  • Develop custom scripts as necessary using PowerShell, Python, or Apps Script.
  • Train our clients' end users and their technical personnel as required per project scope.
  • Write technical documentation for deployed solutions, including end-user guides and process manuals.
  • Write, contribute, and publish Google Workspace content to our LMS.
  • Travel to client sites as needed and requested.

Illinois

Senior DevOps Engineer

Acuity, Inc.

Evolve. Enable. Automate.

DevOps Engineer | 4 days ago
Full Time | Remote | Team 201-500 | Since 2002 | H1B No Sponsor

  • Build and manage Azure resources (VNets, Load Balancers, Key Vault, Container Registry, etc.) through Terraform and Bicep.
  • Support deployment, scaling, and troubleshooting of AKS clusters.
  • Implement and enhance pipelines in GitHub Actions and Azure DevOps, integrating automated testing and security scanning.
  • Contribute to GitOps workflows using ArgoCD or Flux for consistent deployments.
  • Improve metrics, alerting, and dashboards via Prometheus, Grafana, ELK, and Azure Monitor.
  • Develop automation scripts (Python, Bash, PowerShell) for infrastructure, CI/CD, and operational processes.
  • Participate in production incident management, troubleshooting, and blameless postmortems.
  • Collaborate with development and QA teams to implement DevOps best practices and self-service capabilities.

United States
$135K - $150K / year
Job Closed

Staff Database Reliability Engineer, DBRE

Assured

Assured is a claims automation insurtech backed by leading Silicon Valley investors.

DevOps Engineer | 4 days ago
Full Time | Remote | Team 11-50 | H1B Sponsor

  • Scale and optimize our database infrastructure for performance and reliability, starting with PostgreSQL and Amazon Aurora.
  • Design and implement robust monitoring, tuning, and scaling strategies to support our expanding SaaS platform.
  • Build automation and tooling to streamline database management, focusing on consistency and repeatability.
  • Drive optimization initiatives that enhance overall system health and uptime.
  • Evolve into broader SRE responsibilities beyond the database layer, shaping our infrastructure and reliability culture as we grow.

United States
$165K - $185K / year
Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Design, implement, and manage infrastructure on Google Cloud Platform (GCP), with a focus on Google Kubernetes Engine (GKE).
  • Ensure environments are scalable, resilient, and ready to support continuous growth.
  • Manage system capacity and performance, planning horizontal and vertical scaling.
  • Optimize caching systems (Redis) and CDN to ensure high performance and low latency.
  • Manage and configure DNS to ensure availability and resilience.
  • Respond to critical incidents, leading troubleshooting and post-incident analysis (post-mortem).
  • Develop and maintain automation to improve operational efficiency and reduce manual work.
  • Lead initiatives to automate deployment, scaling, and failure recovery.
  • Promote operational efficiency by standardizing SRE processes and practices.
  • Ensure end-to-end observability using metrics, logs, and traces.
  • Work with tools such as Prometheus, Grafana, and Cloud Monitoring.
  • Work with messaging systems such as Google Cloud Pub/Sub and search and analytics solutions such as Elasticsearch.
  • Identify cloud cost optimization opportunities, applying FinOps practices.

Brazil