We make a difference, solve outstanding problems and make the digital transformation of our clients possible.
SRE Specialist I
Location
Brazil
Posted
7 days ago
Seniority
Senior
Job Description
Inmetrics
- Technical Leadership and Best Practices: Serve as a technical reference for the team, supporting development and promoting Site Reliability Engineering (SRE) best practices.
- Availability and Performance: Ensure the availability, scalability, performance and security of the company's systems and infrastructure.
- Maintain a stable, reliable and secure environment for all users and services.
- DevOps Culture and Integration: Promote a DevOps culture, encouraging collaboration between development, infrastructure and information security teams.
- Automation and Monitoring: Implement and manage tools and processes for automation, monitoring and orchestration of infrastructure and applications.
- Incident Management and Continuous Improvement: Analyze incidents, identify root causes and propose preventive solutions to avoid recurrence.
- Related Activities: Perform other duties inherent to the role, contributing to the efficiency and continuous improvement of services and processes.
Job Requirements
- Infrastructure as Code (IaC): Proficiency in Terraform (preferred), with knowledge of Pulumi or CloudFormation.
- APIs: Experience and expertise with APIGEE.
- Observability: Experience with Datadog (preferred), as well as tools such as Dynatrace, Splunk, Prometheus, Grafana or ChaosSearch.
- Cloud Automation: Programming knowledge in Python or Golang.
- CI/CD: Experience with continuous integration and delivery tools such as GitHub, Jenkins or ArgoCD.
- Cloud: Senior-level experience with Google Cloud Platform (GCP); AWS or Azure also accepted.
- Managed Kubernetes: Advanced expertise with GKE (preferred), with knowledge of EKS or AKS.
- Certifications in Cloud, DevOps or Kubernetes will be considered a plus.
- Experience with cost optimization and governance in multicloud environments.
Benefits
- Bradesco Health Plan (30% copayment);
- Bradesco Dental (no employee contribution);
- Life Insurance;
- Wellhub (Gympass);
- Childcare Allowance;
- Allowance for Children with Special Needs;
- Payroll-deductible Loan;
- Private Pension;
- Pet Plan;
- SESC benefits;
- Conexa Telemedicine.
- Flexible Benefits:
  - Financial Assistance;
  - Food/Meal Allowance;
  - Multi-benefit Card;
  - Medical plan upgrade.
- Company Perks:
  - We are a civic-minded company: extended maternity and paternity leave;
  - INMaterna Program: support program for pregnant employees;
  - Birth welcome kit and the book "Acontecia quando eu nascia";
  - Professional Development: courses available through the internal university;
  - 100% remote or hybrid, depending on project needs.
More DevOps Engineer Jobs
- Design, build, and maintain CI/CD pipelines in Azure DevOps and Jenkins (YAML pipelines, Jenkinsfile, shared libraries) from commit to production.
- Manage and scale Kubernetes clusters on AKS across dev, staging, and production; author Helm charts; manage namespaces, RBAC, configmaps, secrets, and ingress.
- Provision and version infrastructure with Terraform/Bicep and manage configuration with Ansible across multiple Azure environments.
- Enforce security gates in pipelines: Trivy container scans, Checkov IaC policies, SonarQube SAST quality gates; manage secrets via HashiCorp Vault and Azure Key Vault.
- Own observability: Prometheus/Grafana dashboards, ELK/OpenSearch log aggregation, OpenTelemetry instrumentation; define and track SLOs and error budgets.
- Manage network configuration: DNS, VPN, ingress controllers, load balancers, and network policies; lead blameless post-mortems.
Site Reliability Engineer - Remote
Optum
Optum, part of the UnitedHealth Group family of businesses, is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
At Optum, we support your well-being with an understanding team, extensive benefits and rewarding opportunities. By joining us, you'll have the resources to drive system transformation while we help you take care of your future. We recognize the power of connection to drive change, improve efficiency and make a difference in health care. Join a team where your skills and ideas can make an impact and where collaboration is key to creating technology that produces healthier outcomes.
Requisition Number: 2358259
For those who want to invent the future of health care, here's your opportunity. We're going beyond basic care to health programs integrated across the entire continuum of care. Join us to start Caring. Connecting. Growing together.
Our Optum Serve IT team develops cutting-edge solutions that help people live healthier lives and help make the health system work better for everyone. From advanced data analytics and AI to cybersecurity, we use innovative approaches to solve some of healthcare's most complex challenges. To support this mission, OSIT has initiated a multi-year modernization program aimed at updating and enhancing enterprise technology systems in accordance with modern design standards.
The Site Reliability Engineer will architect, develop, and maintain Optum Serve's cloud environment in both the commercial and government clouds. The role will work closely with software engineers, architects, and DevOps engineers to architect and maintain a secure, resilient and high-performance cloud infrastructure.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
- Build, operate, and support IaaS and PaaS infrastructure in Azure and AWS commercial and government cloud environments under established architecture and standards
- Partner with development teams to help define, track, and report on SLIs, SLOs, and SLAs
- Contribute to the development and support of platform services, including provisioning, configuration, deployment, and day-to-day operations
- Integrate applications and platforms with centralized logging, monitoring, metrics, and incident management systems
- Configure and maintain observability tools (dashboards, APMs, alerts) to help engineering teams safely operate applications in production
- Participate in an on-call rotation to support software and cloud infrastructure, following documented runbooks and escalation paths
- Support root cause analysis efforts and assist with remediation by implementing automation, monitoring improvements, and reliability fixes
- Maintain and enhance operational tooling, scripts, and frameworks used for platform and service support
- Execute performance and resiliency testing for platform services using existing frameworks and tools
- Configure and tune alerts related to performance, availability, cost, security, and compliance signals
- Follow and help improve operational processes, contributing automation to reduce manual and repetitive support activities
You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
- 4+ years of experience working in a Site Reliability Engineering, Cloud Engineering, or DevOps role
- Hands-on experience supporting Kubernetes (managed or bare metal) clusters in production environments
- Some hands-on experience with monitoring and observability tools (e.g., Azure Monitor, Splunk, Dynatrace, Grafana, Prometheus)
- Experience using Infrastructure as Code (IaC) tools such as Terraform or Pulumi
- Experience supporting infrastructure and applications in production cloud environments
- Experience interacting with or supporting systems that expose RESTful APIs
- Solid working knowledge of at least one major cloud service provider (Azure preferred, AWS acceptable)
- Working knowledge of networking fundamentals and common internet protocols
- Understanding of identity and access management (IAM) concepts and best practices
- Basic understanding of security concepts including encryption, PKI, and common application security risks (e.g., OWASP)
- Familiarity with Kubernetes deployment and GitOps tools such as Helm, ArgoCD, or Flux
- Familiarity with IDEs and source control tools such as Visual Studio Code, GitHub, GitLab
- Ability to participate in a 24/7 on-call rotation following documented procedures and escalation paths
- United States Citizenship
- If you are offered this position, you will be required to provide extensive personal information to obtain and maintain a suitability or determination of eligibility for a Confidential/Secret or Top Secret security clearance as a condition of your employment
*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc.
In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $72,800 to $130,000 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
OptumCare is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
OptumCare is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
Forward Deployment Engineer, Generative AI
Tiger Analytics
AI & Analytics for today's business challenges.
- The Forward Deployment Engineer (FDE) drives the on-site deployment, integration, and scaling of our enterprise Generative AI solutions.
- This role embeds directly within customer engineering teams to operationalize Large Language Models (LLMs) and retrieval systems across multi-cloud environments (AWS, Azure, GCP).
- You will bridge the gap between AI research and production-grade cloud infrastructure.
- You will collaborate with cross-functional teams and business partners and will have the opportunity to drive current and future strategy by leveraging your analytical skills as you ensure business value and communicate the results.
DevOps Engineer - Financial/Banking Sector
Devsu
Devsu is a technology agency that provides software development services, IT augmentation and staffing.
- Build pipelines for creating and provisioning development, test, and production environments in cloud/on-prem.
- Support the agile cells/development teams in environment provisioning through training, answering questions, and resolving incidents.
- Guide development teams in using all components previously deployed in cloud/on-prem and approved for deploying initiatives/projects (e.g., configurations to use such as virtual machines, operating systems, available versions, container services, serverless, everything deployed in the bank's environments).
- Provide recommendations on deployment tools, their use, best practices, and any specific configurations that may be required, in line with security requirements.
- Support the resolution of incidents and/or problems in pipeline development, in line with the practices/processes implemented at the bank (e.g., erroneous service calls, unavailable services, lack of access to services, among others).
- Support change processes, which must be aligned with the bank's ITIL practices, when modifications are required to the environments or services integrated into the development pipelines.
- Build and continuously update the documentation related to environment provisioning and deployment in cloud/on-prem (e.g., pipelines created, code written, landing zones, users with access).




