Veeam Software

Your Single Backup and Data Management Platform for Cloud, Virtual and Physical

Senior Site Reliability Engineer – Government, Sovereign Cloud

DevOps Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2006 | H1B Sponsor

Location

California

Posted

84 days ago

Salary

$138.9K - $231.4K / year

Seniority

Senior

Job Description

  • Get up to speed on the full platform: all VDC workloads, dependencies, and risk areas. Much of this will happen through code, docs, and conversations rather than direct environment access.
  • Work with SMEs across the org to fill knowledge gaps and build onboarding material for the team.
  • Write and maintain runbooks, architecture docs, and operational guides.
  • Design infrastructure for high availability and fault tolerance on Azure (including Azure Government).
  • Define SLIs, SLOs, and error budgets where none exist today.
  • Run incident response and blameless postmortems. Turn incidents into improvements.
  • Identify reliability risks across modern and legacy workloads and build practical remediation plans that work within compliance constraints.
  • Close observability gaps: define instrumentation requirements and drive implementation.
  • Set alerting, telemetry, and monitoring standards with partner teams.
  • Build automation to reduce toil and support fleet management.
  • Participate in on-call rotations.
  • Work with IaC, CI/CD, deployment automation, and config management, including in air-gapped or compliance-restricted environments.
  • Build and maintain testing, canary deployment, and release validation pipelines.
  • Integrate chaos engineering and monitoring tools, adapting choices to meet regulatory requirements.
  • Work across product, platform, security, legal, compliance, and operations teams.
  • Own problems end-to-end: identify gaps, drive solutions, don't wait for direction.
  • Mentor other engineers and help spread SRE practices across the org.

Job Requirements

  • 7+ years in Software Engineering, with 3+ years in SRE, Platform Engineering, or similar — across multi-service platforms, not just single-service environments.
  • Experience with Government or Sovereign Cloud (e.g., Azure Government, AWS GovCloud).
  • Experience in regulated compliance environments — government (FedRAMP, CMMC, IL2/IL4/IL5), financial (PCI-DSS, SOX), or healthcare (HIPAA, HITRUST). You understand how compliance shapes architecture and operations.
  • Strong experience building and running production services on cloud infrastructure (Azure preferred, including Azure Government).
  • Able to learn large, complex platforms quickly with limited guidance — comfortable building understanding from code, docs, and architecture artifacts when direct environment access is restricted.
  • Can investigate systems independently and produce clear docs, risk assessments, and improvement plans.
  • Comfortable working across teams — engineering, product, security, compliance, operations.
  • Programming skills in one or more of: TypeScript/JS, Go, Java, C#, or similar.
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack).
  • Experience with IaC (Terraform, Terragrunt, Pulumi) and container orchestration (Kubernetes).
  • Experience with CI/CD and GitOps tooling — GitHub Actions, Azure DevOps, GitLab CI, ArgoCD, FluxCD, or Dagger.
  • Solid grasp of distributed systems, networking, and cloud-native architecture.
  • Clear written and verbal communication skills.

Benefits

  • Unlimited paid time off, 12 paid holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
  • Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
  • Medical, dental, and vision coverage starting on your first day
  • Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
  • 401(k) retirement plan with company matching contributions
  • Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
  • AirVet: 24/7 virtual veterinary care at no cost
  • Legal services, identity protection, and supplemental health insurance options
  • Tax-advantaged spending accounts for healthcare, dependent care, and commuting
  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning

More DevOps Engineer Jobs

We Met at KubeCon!

vCluster Labs

vCluster Labs is a venture-backed tech startup headquartered in San Francisco, California, with a distributed, remote-first team spanning eight time zones. Founded following a Seri

DevOps Engineer | 85 days ago

We aren't just shipping software; we are defining the state-of-the-art of tomorrow. Whether we chatted at the booth, grabbed a coffee, or geeked out over the latest vCluster release, we want to stay connected.

Who You Are (Our Ideal Talent Profile)

We hire based on Intangibles, Skills, and Experience. You might be a fit if you resonate with:

  • Ownership: You connect your daily actions to the broader success of our customers and the company.
  • The Wow: You measure success by the experience you generate, both for teammates and for our users.
  • AI Literacy: You naturally integrate AI tools to accelerate your delivery and brainstorm at scale.
  • Idea Meritocracy: You value the quality of an idea over the status of the author and view feedback as an "upgrade path".
  • Stars on the Rise: You have a consistent track record of rapid career progression and quantifiable achievements.

What We Offer

  • Remote-First Culture: Work from where you are most productive with a globally distributed team.
  • Impact: Act as "Customer Zero" or a "Supportive Elevator". Everyone here plays a pivotal role in our aggressive ARR growth.
  • Collaboration: Regular global offsites to mix serious business with intentional connection.

About vCluster Labs

We are a venture-backed tech startup striving to be the leading force in enabling platform engineers. We raised $30M+ from top-tier VCs such as Khosla Ventures (first investor in OpenAI, GitLab, Stripe, DoorDash) and are in a hyper-growth phase looking for motivated people to complement our team. Our headquarters are in San Francisco (Salesforce Tower), but our team is distributed around the globe and we have a remote-first work culture. We're the company behind vCluster, an open-source technology for virtualizing Kubernetes (10k+ GitHub stars). Open source is part of our DNA.

The adoption of our commercial product based on vCluster has grown extremely fast (multi-million dollar revenue) and our customer base includes some of the biggest companies in the world, including 6 Global Fortune 500 companies as well as some of the fastest-growing tech unicorns.

Benefits

We offer the following benefits:

  • Competitive Salary: We offer a competitive compensation package, including equity.
  • Platinum-Level Insurance: Health, dental, vision, and life insurance, including plans for you and eligible dependents (benefits vary depending on country).
  • Flexible Working Schedule: You have a doctor’s appointment or need to head to the supermarket to get groceries at 2pm? We won’t have an issue with that. To us, results matter more than clocking in and out at the same time every day.
  • Workplace Flexibility: We’re very flexible about where you work. We know things can change in life and we’re happy to adjust the work environment for you along the way.

Culture & Values

At vCluster Labs, we value and stand for:

  • Open Source, Open Mind: We are actively contributing to and maintaining open-source projects. Internally, we foster meritocracy: the strongest ideas win, no matter who or where they come from.
  • Build Tomorrow’s Standards, Intentionally: We don't just ship software; we define the state-of-the-art of tomorrow. We are fearless in tearing down old approaches to build something better, but we are disciplined in how we do it because we know our users rely on our technology to run mission-critical infrastructure platforms.
  • Create Wow: We measure success by the experience we generate, both inside and outside the company. For our customers, this means impressive speed and intuitive experiences. For our team, this means going the extra mile to support one another and to continuously drive each other to new heights.
  • Own the Outcome: We understand that our responsibility doesn't end when a task is checked off; it ends when the value is delivered. We connect our daily individual actions to the broader success of the company and our customers.

United States
Job Closed

Role Description

We are passionate about innovation, constant challenges, and working with the best talent in the market. You will be the technical lead and key strategist for securing our clients' cloud infrastructure and Artificial Intelligence solutions. We are looking for a DevOps Tech Lead to lead cloud transformation projects focused on modern, container-based identity (IAM) platforms.

You will be responsible for:

  • Providing technical direction to DevOps and development teams, ensuring cloud solutions are implemented correctly under security, automation, and high-availability standards.
  • Defining and validating cloud architectures on AWS, ensuring secure, scalable, and resilient solutions aligned with industry best practices and standards.
  • Leading technical decision-making, overseeing the quality of implemented solutions and ensuring they integrate correctly with existing environments, including on-premise systems.
  • Coordinating and leading multidisciplinary teams (DevOps and backend): assigning tasks, tracking execution, and promoting continuous improvement of the team.
  • Acting as the technical point of reference for the client, supporting strategic decisions and ensuring clear and effective communication.
  • Designing, implementing, and governing CI/CD pipelines, and defining standards for Infrastructure as Code (IaC) and GitOps practices.
  • Overseeing deployments on AWS (EKS, networking, databases), ensuring operational efficiency and stability in production environments.
  • Defining and implementing monitoring, logging, and alerting strategies, taking an active part in resolving critical incidents and ensuring SLA compliance.
  • Leading the implementation of identity solutions (such as Keycloak) and applying modern authentication standards (OAuth2, OIDC), ensuring compliance with security best practices.
  • Training internal and client teams, promoting the adoption of best practices, and documenting solutions and operational processes in a clear and structured way.

Qualifications

  • 5+ years of experience in DevOps / Cloud roles, including at least 2 years leading technical teams.
  • Solid experience in AWS cloud environments, including services such as EKS (Kubernetes), VPC, networking, security, and databases (RDS / Aurora), as well as hybrid architectures (cloud + on-premise).
  • Command of containers and orchestration, with advanced Kubernetes skills and experience using Docker to build and manage containerized applications.
  • Experience with DevOps practices, including implementing CI/CD pipelines, along with strong knowledge of Infrastructure as Code (IaC) using Terraform (required), and experience with or familiarity with GitOps practices (desirable).
  • Knowledge of security and identity, including modern authentication protocols (OAuth 2.0, OIDC) and experience with IAM solutions (ideally Keycloak), as well as integration with LDAP or Active Directory.
  • Development and integration experience (desirable), with knowledge of Java, consuming and designing REST APIs, and integration with legacy systems.
  • Experience in observability, including defining and implementing monitoring, logging, and alerting with tools such as Prometheus, Grafana, and CloudWatch.
  • Proficiency with DevOps and Cloud ecosystem tools such as AWS (EKS, EC2, IAM, VPC, ALB, RDS/Aurora, CloudWatch), Terraform, Git (GitHub, GitLab, or Bitbucket), ArgoCD, Jenkins, GitHub Actions or GitLab CI, Kubernetes (kubectl, Helm), and secrets management tools (such as AWS Secrets Manager).
  • Excellent communication skills for working with technical teams and stakeholders, analytical thinking, a problem-solving orientation, and the ability to perform effectively in high-pressure environments.

Requirements

  • Experience in IAM / Identity projects.
  • AWS certifications.
  • Experience in telco environments.
  • Knowledge of SRE.

Benefits

  • Time for you: you get Nubiral days to enjoy however you like.
  • Birthday day off.
  • Pet adoption leave.
  • Continuous learning: relevant certifications covered.
  • Support at important moments for your family: a gift for a new birth, and a day off for your children's birthdays.

United States + 1 more
All locations: United States | Canada
Job Closed

Senior Site Reliability Engineer

DriveWealth

Innovative Investing Experiences

DevOps Engineer | 85 days ago
Full Time | Remote | Team 201-500 | Since 2012 | H1B Sponsor

  • Engineering & Automation: Design and develop internal tools and SRE platforms to eliminate repetitive tasks (toil) and improve developer velocity.
  • Infrastructure as Code: Architect and maintain modular, reusable IaC using Terraform and manage GitOps workflows via ArgoCD.
  • Observability & Reliability: Implement OpenTelemetry standards and the Grafana stack (Alloy, Loki, Tempo, Mimir) to provide deep insights into system health. Define and manage SLIs, SLOs, and Error Budgets.
  • Platform Governance: Review software architecture and Kubernetes metrics to ensure high availability, capacity planning, and cost-optimization across AWS regions.
  • Incident Engineering: Lead incident response, perform complex root-cause analysis (RCA), and champion a blameless post-mortem culture.
  • Collaboration: Partner with engineering teams to foster the adoption of new tools, security standards, and reliability best practices.

United States
$150K - $170K / year
Job Closed

DevOps, Kubernetes Consultant

Fairsource

Fairness & Transparency in IT Projects

DevOps Engineer | 85 days ago
Full Time | Remote | Team 11-50 | H1B No Sponsor

  • Deployment and configuration of applications on Kubernetes
  • Installation and testing of infrastructure components
  • Automation of installation processes
  • Preparation of accompanying documentation

Germany
€110K - €140K / year