BillingPlatform

One Platform, Infinite Possibilities

Senior Site Reliability Engineer

DevOps Engineer | Contract | Remote | Senior | Team 201-500 | Since 2012 | H1B Sponsor

Location

Serbia

Posted

74 days ago

Salary

Not disclosed

Seniority

Senior

Job Description

  • Own and improve on-call processes, incident response playbooks, and post-mortem culture
  • Define, track, and manage SLOs, SLIs, and error budgets for critical services
  • Lead blameless post-mortems and drive systematic reliability improvements
  • Respond to production incidents and coordinate cross-functional resolution
  • Design, build, and maintain scalable AWS infrastructure using IaC (Terraform, Pulumi)
  • Manage Kubernetes clusters and containerized workloads in production
  • Build and maintain CI/CD pipelines to improve deployment speed and reliability
  • Evaluate and implement tooling to enhance developer productivity and system stability
  • Implement monitoring, alerting, and distributed tracing (Prometheus, Grafana, Datadog, Jaeger)
  • Identify and resolve performance bottlenecks across services, networks, and databases
  • Build dashboards and runbooks for self-service operational insights
  • Partner with engineering teams to embed reliability practices (load testing, capacity planning, chaos engineering)
  • Conduct architecture reviews with a focus on reliability and operability

Job Requirements

  • 5+ years of experience in SRE, DevOps, or infrastructure engineering
  • Deep expertise with AWS and cloud-native architectures
  • Strong experience with Kubernetes and container orchestration at scale
  • Hands-on experience with infrastructure-as-code tools (Terraform or Pulumi)
  • Proficiency in Python, Go, or Bash
  • Experience with observability tools (Prometheus, Grafana, Datadog, or similar)
  • Strong understanding of SLOs, SLIs, and error budgets
  • Experience with service mesh technologies (Istio, Linkerd)
  • Familiarity with chaos engineering tools (Chaos Monkey, Gremlin, LitmusChaos)
  • Background in Oracle database reliability and administration
  • Contributions to open-source infrastructure projects
  • Experience in a high-growth SaaS or product-led environment
  • Excellent English communication skills (written and spoken)

Benefits

  • A high-impact role at a growing SaaS company that values personal growth, accountability, and teamwork
  • A culture of open collaboration and problem-solving
  • 100% remote
  • Competitive pay

More DevOps Engineer Jobs

Role Description

  • Support the implementation, administration, and evolution of AWS cloud environments, ensuring stability and high availability
  • Operate and manage production environments, handling monitoring, troubleshooting, and continuous improvement
  • Implement and maintain infrastructure as code (IaC) and automation for provisioning and configuring resources
  • Support application modernization and cloud migration initiatives
  • Work with architecture, engineering, security, and development teams to ensure technical alignment
  • Ensure environments are secure, scalable, resilient, and cost-efficient
  • Take part in evolving the cloud architecture and platform, proposing improvements and best practices
  • Implement and maintain CI/CD pipelines, contributing to the automation of deployments and processes
  • Monitor environments and services, working on incident and performance analysis
  • Support the implementation of cloud governance, security, and compliance practices
  • Help spread DevOps best practices and cloud culture within the team

Qualifications

  • Hands-on experience with Amazon Web Services (AWS)
  • Experience administering production cloud environments
  • Knowledge of Linux operating systems
  • Experience with distributed environments and cloud architecture
  • Experience with infrastructure automation
  • Knowledge of DevOps practices (CI/CD, automation, version control)
  • Ability to troubleshoot and analyze incidents
  • Good communication and collaboration with multidisciplinary teams
  • Analytical, organized profile oriented toward continuous improvement

Remote

Worldwide
Job Closed

DevSecOps Administrator

VSolvit

Where Opportunity...Meets Solution

DevOps Engineer | 74 days ago
Full Time | Remote | Team 201-500 | Since 2006 | H1B Sponsor

  • Design, implement, and maintain Continuous Integration/Continuous Delivery (CI/CD) pipelines to automate software builds, testing, and deployments
  • Integrate security tools and practices directly into CI/CD pipelines to ensure secure code delivery
  • Develop and manage Infrastructure as Code (IaC) scripts using tools such as Terraform, Ansible, or CloudFormation to automate infrastructure provisioning
  • Implement security measures throughout the software development lifecycle, including static code analysis, dynamic application security testing (DAST), and vulnerability scanning
  • Utilize and manage a modern security stack including GitLab Premium, Invicti, Trivy, AWS ECR managed signing, AWS GuardDuty, and DefectDojo
  • Manage AWS GovCloud environments and containerized applications using Docker and Kubernetes
  • Ensure secure configurations for all cloud resources and container orchestration platforms
  • Implement monitoring tools to track system performance, security, and availability
  • Respond to incidents promptly, conduct root cause analysis, and implement corrective actions
  • Maintain detailed documentation of DevSecOps processes, configurations, and security controls
  • Work closely with development, operations, and security teams to align practices with organizational goals
  • Utilize ticketing and project management software including ServiceNow and Jira

Virginia
$125K - $160K / year
Job Closed

System Reliability Engineer III – ISE Apps and Services

Veeam Software

Your Single Backup and Data Management Platform for Cloud, Virtual and Physical

DevOps Engineer | 74 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2006 | H1B Sponsor

  • Administer and support the entire suite of Microsoft 365 services across the organization (Exchange Online, Teams, SharePoint, Power Automate, etc.)
  • Manage and maintain a hybrid mail environment, including on-premises Exchange and Exchange Online
  • Administer Single Sign-On (SSO) and Identity Management based on Microsoft Entra ID
  • Configure and maintain Conditional Access, MFA, Enterprise Applications, and related Entra ID services
  • Serve as an escalation point and provide Level 3 technical support for services in responsible area
  • Participate in IT infrastructure and system implementation projects
  • Monitor and respond to infrastructure and service alerts, managing incidents from detection to resolution
  • Create, update, and maintain technical documentation, procedures, and knowledge base articles

Romania
Job Closed

DevOps Engineer

Suralink

The leading request list management solution for audit, accounting, and professional services firms

DevOps Engineer | 74 days ago
Full Time | Remote | Team 51-200 | Since 2014 | H1B No Sponsor

  • Implement a DevOps culture, fostering collaboration between development and operations teams
  • Design monitoring solutions around cloud infrastructure management
  • Bring a passion for working with Engineering teams to ship products and maintain high SLAs for mission-critical SaaS applications
  • Optimize system reliability, scalability, and performance through DevOps methodologies
  • Scale platforms on AWS with a thorough understanding of reference architectures, designing and implementing cloud solutions that align with industry best practices
  • Work across multiple tech stacks deployed in AWS, including but not limited to EKS, Terraform, React, Node.js, TypeScript, PHP, RDS, SNS, SQS

United States
Job Closed