Mastering Cloud. Accelerating Business.
Senior Site Reliability Engineer – m/f/x
Location
Germany
Posted
64 days ago
Seniority
Senior
Job Description
SysEleven GmbH
- Ensure the reliability, availability, and performance of our Database- and Observability-as-a-Service products
- Manage container-based applications in Kubernetes with a strong focus on security and resilience
- Lead incident response, root cause analysis, and sustainable remediation efforts
- Apply GitOps principles using Helm and Argo CD
- Develop API services and tooling in Go to deliver stable SaaS products
- Build and optimize CI/CD pipelines to improve deployment safety and system stability
- Design and manage scalable infrastructure using IaC tools (e.g., Terraform) in cloud environments
Job Requirements
- Several years of experience operating highly available systems in Linux and Kubernetes environments
- Strong understanding of observability concepts (monitoring, logging, tracing)
- Practical development experience in Go (knowledge of Python or Rust is a plus)
- Experience with Infrastructure-as-Code tools such as Terraform or OpenTofu
- Hands-on experience in incident management and structured root cause analysis
- Familiarity with CI systems, especially GitLab CI
- Strong problem-solving skills and good communication skills in German and English (minimum B2 level)
Benefits
- Blameless culture
- Open communication
- Knowledge sharing
- Autonomy to drive initiatives