Intermediate Site Reliability Engineer, Environment Automation at GitLab

This listing is no longer active.

GitLab

Build software faster. The One DevOps Platform enables your entire org to collaborate around your code. We're hiring.

DevOps Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2014 | H1B No Sponsor

Location

India

Posted

84 days ago

Seniority

Senior

Bachelor Degree, English, Ansible, Distributed Systems, Kubernetes, Terraform

Job Description

• Contribute to automating operational tasks across many GitLab environments, from initial provisioning and configuration updates to upgrades and routine maintenance, helping reduce manual work and improve reliability at scale under the guidance of senior team members.
• Help build and refine the observability stack for multi-tenant GitLab environments so we monitor the right signals across Kubernetes, cloud services, and GitLab applications, supporting early issue detection and basic capacity tracking.
• Assist in responding to platform alerts and incidents, collaborating with Environment Automation SREs and engineering teams to troubleshoot production issues across multiple tenants and document findings.
• Support planning and implementation of infrastructure changes, capacity expansions, and new service rollouts for Dedicated and other managed GitLab environments, contributing to efforts that improve resource efficiency and environment isolation.
• Develop and maintain scripts, automation tools, and infrastructure-as-code workflows that manage parts of the GitLab environment lifecycle, enabling more repeatable, self-service operations over time.
• Apply and help implement best practices for running GitLab on Kubernetes and cloud platforms, focusing on day-to-day reliability, performance, and security while learning how to keep environments consistent.
• Participate in the on-call rotation for production GitLab environments with appropriate support, helping triage and mitigate incidents across clusters and cloud providers and contributing to post-incident reviews.
• Document operational tasks, runbooks, and lessons learned so they become clear, repeatable processes and can be candidates for future automation, improving shared knowledge and reducing manual toil across the team.

Job Requirements

Experience working as an SRE or in a similar role operating production infrastructure, with an interest in automating the lifecycle of many environments or tenants in parallel, even if you have not yet done so at large scale.
Hands-on experience with Golang (required) and the ability to read, understand, and modify infrastructure tools written in Go.
Hands-on experience running Kubernetes-based workloads in production, including basic understanding of deployments, rollouts, and debugging common issues like crash loops, failed health checks, and scheduling problems.
Familiarity with infrastructure automation and configuration management tools such as Terraform and Ansible, including experience working with modules, variables, and managing state safely for multiple environments.
Solid understanding of Git-based workflows and infrastructure-as-code practices, with the ability to contribute to reusable modules, templates, and pipelines that make automation safer and more consistent.
Experience working in distributed systems or cloud-based production environments, ideally in SaaS or managed service settings, with comfort participating in incident response and on-call rotations under guidance from more senior team members.
A proactive mindset focused on automation and documentation—you look for opportunities to remove manual steps, improve runbooks, and turn repetitive tasks into reliable, self-service tools.
Comfort working asynchronously across distributed teams and a desire to contribute to GitLab's values of collaboration, transparency, and iteration.

Benefits

Benefits to support your health, finances, and well-being
Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan
Growth and Development Fund
Parental leave
Home office support

More DevOps Engineer Jobs

DevSecOps Engineer

AP MAX INC

Brello is a wellness-first brand that makes access to science-backed compounded medications feel effortless — never clinical or confusing. We connect individuals to licensed providers through Telegra, with prescriptions fulfilled by trusted 503A pharmacies. Our mission is to simplify, humanize, and demystify wellness solutions for longevity and weight management, with an authentic voice that is friendly, empowering, and transparent.

DevOps Engineer | 84 days ago

Full Time | Remote

Role Description

The DevSecOps Engineer plays a critical role in enabling secure, scalable software delivery across Allia Health Group’s cloud infrastructure. This role operates at the intersection of DevOps and security, embedding security controls directly into CI/CD pipelines and engineering workflows. This is a mission-critical hire supporting the organization’s SOC 2 compliance timeline. The ideal candidate brings a balanced skill set across infrastructure, automation, and security tooling, and is comfortable working in a fast-paced, evolving environment.

• Operate as a bridge between DevOps and Security to integrate security into the software development lifecycle
• Implement CI/CD security controls including SAST, DAST, SCA, and container scanning
• Implement controls aligned with SOC 2 change management and vulnerability management requirements
• Manage secrets lifecycle using cloud-native tools
• Build and maintain infrastructure security controls using Terraform
• Generate audit-ready change management evidence
• Integrate vulnerability scanning into compliance workflows
• Enforce secure development practices and pipeline protections
• Collaborate with GRC teams to align technical controls with compliance requirements

Qualifications

• Minimum 3+ years of experience in DevOps, DevSecOps, or platform engineering
• Experience with cloud platforms such as Google Cloud Platform (GCP)
• Strong experience with CI/CD tools such as GitHub Actions
• Hands-on experience with Terraform and infrastructure as code
• Knowledge of application security and container security tools
• Familiarity with SOC 2 or similar compliance frameworks

Requirements

• Experience with compliance platforms such as Drata or similar tools
• Knowledge of HIPAA technical safeguards
• Experience with policy-as-code tools
• Relevant cloud or security certifications

Benefits

• Full benefits package including medical, vision, dental, 401(k) with company match, PTO, Flex days, holidays, and more
• Working in Madeira in a shared office space, remote in Portugal, or remote in a Portuguese time zone-friendly location
• Opportunity to build security-first infrastructure and systems
• High-impact role within a growing technology organization
• Benefits package designed to meet local market standards and legal requirements

Portugal

€55K - €70K / year

Job Closed

Senior DevOps Engineer – SRE

easybill GmbH

More than just invoicing software

DevOps Engineer | 84 days ago

Full Time | Remote | Team 11-50 | Since 2007 | H1B No Sponsor

• Co-responsibility for system availability: You actively contribute to the availability, reliability, and efficiency of our complex system architecture, which consists of around 70 servers hosted at Hetzner.
• Maintenance and automation: You support the maintenance and automation of our existing infrastructure based on technologies such as Ubuntu, Percona MySQL Cluster, MinIO, Elasticsearch, Redis, NGINX, HAProxy, TiDB, ClickHouse and Kubernetes.
• Monitoring and analysis: You improve our monitoring strategies and perform comprehensive incident and fault analyses.
• High availability: You are prepared, in exceptional cases, to be available outside normal hours, including at night, to ensure our systems run smoothly.
• Software development: Several years of experience in one or more programming languages (e.g., Rust, Java, Go, TypeScript) are required.

Elasticsearch, HAProxy, Java, Kubernetes, MySQL, Nginx, Redis, Rust, TypeScript

Germany

Job Closed

DevOps Specialist

Coinscrap Finance

We lead the future of transactional data analysis

DevOps Engineer | 84 days ago

Full Time | Remote | Team 11-50 | Since 2016 | H1B No Sponsor

• Administer and maintain the company's infrastructure on Google Cloud Platform (GCP).
• Implement and maintain infrastructure-as-code solutions with Terraform and Terragrunt.
• Manage the structure of projects, folders, VPCs, IAM, and resources in GCP.
• Build and maintain containers with Docker, Helm charts, and orchestration with Kubernetes (GKE).
• Manage artifacts in GCP Artifact Registry.
• Manage database schemas and migrations with Atlas.
• Manage code versioning with GitLab and Git.
• Implement and maintain CI/CD processes using GitLab CI/CD.
• Monitor and optimize the infrastructure with Cloud Monitoring, Cloud Logging, and Cloud Trace.
• Work closely with the development team on application implementation and deployment.

AWS, Docker, GCP, Kubernetes, Linux, Terraform

Spain

€28K / year

Senior SRE – Site Reliability Engineer

Raiô Benefícios

A complete ecosystem of corporate benefits.

DevOps Engineer | 84 days ago

Contract | Remote | Team 51-200 | H1B No Sponsor

• Ensure the reliability, stability, performance and cost-efficiency of Raiô's platforms, working closely with engineering and product teams.
• Define, monitor and evolve reliability and performance indicators (SLIs/SLOs), establishing effective alerts and continuous improvement routines.
• Manage production incidents, conducting root cause analysis and implementing corrective and preventive plans, with a focus on learning and system evolution.
• Design, deploy and evolve observability practices (logs, metrics and tracing), improving predictability and reducing time to resolution for failures.
• Develop and maintain automations and infrastructure as code, ensuring consistent, secure and reproducible environments.
• Structure and evolve operational practices: routines, playbooks/runbooks, change management, metrics review and capacity planning.
• Lead, together with engineering teams, technical and architectural decisions aimed at resilience, scalability and cost optimization.
• Drive continuous improvements in operational processes, security, availability and cost control in cloud environments.
• Promote best practices in reliability, operations and engineering, raising the overall technical level of the team.

AWS, GCP, Terraform

Brazil

Job Closed
