One Platform, Infinite Possibilities
Senior Site Reliability Engineer
Location
Serbia
Posted
74 days ago
Salary
Not disclosed
Seniority
Senior
Job Description
BillingPlatform
- Own and improve on-call processes, incident response playbooks, and post-mortem culture
- Define, track, and manage SLOs, SLIs, and error budgets for critical services
- Lead blameless post-mortems and drive systematic reliability improvements
- Respond to production incidents and coordinate cross-functional resolution
- Design, build, and maintain scalable AWS infrastructure using IaC (Terraform, Pulumi)
- Manage Kubernetes clusters and containerized workloads in production
- Build and maintain CI/CD pipelines to improve deployment speed and reliability
- Evaluate and implement tooling to enhance developer productivity and system stability
- Implement monitoring, alerting, and distributed tracing (Prometheus, Grafana, Datadog, Jaeger)
- Identify and resolve performance bottlenecks across services, networks, and databases
- Build dashboards and runbooks for self-service operational insights
- Partner with engineering teams to embed reliability practices (load testing, capacity planning, chaos engineering)
- Conduct architecture reviews with a focus on reliability and operability
Job Requirements
- 5+ years of experience in SRE, DevOps, or infrastructure engineering
- Deep expertise with AWS and cloud-native architectures
- Strong experience with Kubernetes and container orchestration at scale
- Hands-on experience with infrastructure-as-code tools (Terraform or Pulumi)
- Proficiency in Python, Go, or Bash
- Experience with observability tools (Prometheus, Grafana, Datadog, or similar)
- Strong understanding of SLOs, SLIs, and error budgets
- Experience with service mesh technologies (Istio, Linkerd)
- Familiarity with chaos engineering tools (Chaos Monkey, Gremlin, LitmusChaos)
- Background in Oracle database reliability and administration
- Contributions to open-source infrastructure projects
- Experience in a high-growth SaaS or product-led environment
- Excellent English communication skills (written and spoken)
Benefits
- A high-impact role at a growing SaaS company that values personal growth, accountability, and teamwork
- A culture of open collaboration and problem-solving
- 100% remote
- Competitive pay
More DevOps Engineer Jobs
Role Description
- Support the implementation, administration, and evolution of AWS cloud environments, ensuring stability and high availability
- Operate and manage production environments, performing monitoring, troubleshooting, and continuous improvement
- Implement and maintain infrastructure as code (IaC) and automation for provisioning and configuring resources
- Support application modernization and cloud migration initiatives
- Work with architecture, engineering, security, and development teams to ensure technical alignment
- Ensure environments are secure, scalable, resilient, and cost-efficient
- Participate in the evolution of the cloud architecture and platform, proposing improvements and best practices
- Implement and maintain CI/CD pipelines, contributing to the automation of deployments and processes
- Monitor environments and services, working on incident and performance analysis
- Support the implementation of cloud governance, security, and compliance practices
- Help spread DevOps best practices and cloud culture within the team
Qualifications
- Hands-on experience with Amazon Web Services (AWS)
- Experience administering cloud environments in production
- Knowledge of Linux operating systems
- Experience with distributed environments and cloud architecture
- Experience with infrastructure automation
- Knowledge of DevOps practices (CI/CD, automation, version control)
- Ability to troubleshoot and analyze incidents
- Good communication and collaboration with multidisciplinary teams
- An analytical, organized profile oriented toward continuous improvement
#remote
- Design, implement, and maintain Continuous Integration/Continuous Delivery (CI/CD) pipelines to automate software builds, testing, and deployments
- Integrate security tools and practices directly into CI/CD pipelines to ensure secure code delivery
- Develop and manage Infrastructure as Code (IaC) scripts using tools such as Terraform, Ansible, or CloudFormation to automate infrastructure provisioning
- Implement security measures throughout the software development lifecycle, including static code analysis, dynamic application security testing (DAST), and vulnerability scanning
- Utilize and manage a modern security stack including GitLab Premium, Invicti, Trivy, AWS ECR managed signing, AWS GuardDuty, and DefectDojo
- Manage AWS GovCloud environments and containerized applications using Docker and Kubernetes
- Ensure secure configurations for all cloud resources and container orchestration platforms
- Implement monitoring tools to track system performance, security, and availability
- Respond to incidents promptly, conduct root cause analysis, and implement corrective actions
- Maintain detailed documentation of DevSecOps processes, configurations, and security controls
- Work closely with development, operations, and security teams to align practices with organizational goals
- Utilize ticketing and project management software including ServiceNow and Jira
System Reliability Engineer III – ISE Apps and Services
Veeam Software
Your Single Backup and Data Management Platform for Cloud, Virtual and Physical
- Administer and support the entire suite of Microsoft 365 services across the organization (Exchange Online, Teams, SharePoint, Power Automate, etc.)
- Manage and maintain a hybrid mail environment, including on-premises Exchange and Exchange Online
- Administer Single Sign-On (SSO) and Identity Management based on Microsoft Entra ID
- Configure and maintain Conditional Access, MFA, Enterprise Applications, and related Entra ID services
- Serve as an escalation point and provide Level 3 technical support for services in your area of responsibility
- Participate in IT infrastructure and system implementation projects
- Monitor and respond to infrastructure and service alerts, managing incidents from detection to resolution
- Create, update, and maintain technical documentation, procedures, and knowledge base articles
DevOps Engineer
Suralink
The leading request list management solution for audit, accounting, and professional services firms
- Implement a DevOps culture, fostering collaboration between development and operations teams
- Design monitoring solutions around cloud infrastructure management
- Work with Engineering teams to ship products and maintain high SLAs for mission-critical SaaS applications
- Optimize system reliability, scalability, and performance through DevOps methodologies
- Scale platforms on AWS with a thorough understanding of reference architectures, designing and implementing cloud solutions that align with industry best practices
- Work across multiple tech stacks deployed in AWS, including but not limited to EKS, Terraform, React, Node.js, TypeScript, PHP, RDS, SNS, and SQS



