Job Closed
This listing is no longer active.
High quality consulting. On demand. Delivered by top professionals.
Senior DevOps Engineer, mk8s
Location
Germany
Posted
5 days ago
Seniority
Senior
Job Description
Interval Group
- Lead the validation of deployment artifacts from an operational standpoint
- Monitor system health, performance metrics, and service availability
- Direct root cause analyses and implement corrective and preventive actions
- Reduce operational toil by automating recurring remedial processes
- Implement comprehensive logging and monitoring strategies
Job Requirements
- At least 5 years of dedicated operational experience with self-managed Kubernetes clusters
- Profound knowledge and implementation experience with CI/CD processes
- Deep structural understanding of networking concepts
- Fundamental comprehension of ITSM & SRE principles
- Hands-on experience with logging and monitoring tools
- Proven experience documenting technical procedures and enforcing actionable runbooks
- Professional proficiency in both spoken and written English and German
Benefits
- Flexible working hours
- Freedom to choose your own projects
- Competitive pay
- Access to exciting projects in various industries
- Dedicated team support
More DevOps Engineer Jobs
- Design, develop, and implement Confidential Computing at the hypervisor level.
- Design and maintain Exoscale’s operating systems and hypervisor fleets, from kernels to the network stacks, including security filtering layers.
- Contribute to the implementation of the routing and security automation systems.
- Help shape our operating system image strategy, from bare metal to virtual machines.
- Maintain and improve the system container runtimes.
- Improve our provisioning and deployment systems.
- Help improve bare metal and hypervisor system performance.
- Contribute to the overall design and architecture of the Exoscale platform systems.
- Contribute to internal tooling development.
- Improve our systems and processes to be scalable and highly available, helping achieve outstanding SLAs.
- Participate in code and change reviews.
- Take part in the on-call rotation after a training period.

- Design and implement end-to-end observability solutions across applications, infrastructure, and cloud environments.
- Develop dashboards, alerts, and telemetry frameworks to provide real-time visibility into system health and performance.
- Build automation solutions to eliminate repetitive operational tasks and improve efficiency.
- Enable runbook automation, self-healing capabilities, and automated incident triage workflows.
- Define and implement SLIs, SLOs, and alerting strategies to improve service reliability.
- Drive improvements in MTTD and MTTR through actionable alerts and telemetry-driven insights.
- Implement proactive monitoring, anomaly detection, and predictive alerting to identify issues before customer impact.
- Leverage AIOps capabilities for alert correlation and intelligent incident response.
- Integrate observability platforms with CI/CD pipelines, cloud services, and ITSM tools such as ServiceNow.
- Collaborate with engineering, product, and operations teams to establish observability standards and operational readiness practices.

- Design, deploy, and manage enterprise-grade infrastructure on the Azure platform with multi-region, high-availability architecture
- Build and maintain Kubernetes clusters, including deployment, scaling, monitoring, and troubleshooting of containerized applications
- Create and manage containerized workloads from image build through production deployment
- Develop and implement CI/CD pipelines using Azure DevOps to automate software delivery and deployment processes
- Create Infrastructure-as-Code (IaC) solutions using Terraform/OpenTofu, ARM templates, or Bicep for reproducible infrastructure
- Lead vulnerability assessments and implement remediation strategies across cloud infrastructure and container environments
- Establish and enforce security policies, compliance standards, and best practices for cloud operations
- Implement security scanning and container registry scanning tools to identify and manage vulnerabilities in images and dependencies
- Develop automation scripts and tools to reduce manual operational tasks and improve system reliability
- Monitor infrastructure performance, optimize resource utilization, and implement cost management strategies
- Collaborate with development teams, security teams, and stakeholders to understand requirements and implement solutions

Senior DevOps Engineer – AWS, AI Infrastructure
Software Mind
Software House focused on results since 1999
- Provision and configure a dedicated VPC and segmented cloud environment on AWS
- Build the baseline CI/CD pipeline and maintain and evolve it across all delivery phases
- Configure and manage the vector store infrastructure (OpenSearch/Pinecone on AWS)
- Set up and manage the observability stack: CloudWatch, X-Ray, alerting thresholds, and LLM-specific monitoring
- Implement infrastructure-as-code for all environments (dev, staging, production) using Terraform or CDK
- Manage secrets, KMS encryption key configuration, and tenant-scoped access controls
- Configure LLM provider connectivity (OpenAI / Anthropic / Amazon Bedrock enterprise tier, zero-data-retention)
- Define and implement environment promotion strategy aligned with the 2-week sprint cadence
- Support incremental ingestion pipeline infrastructure requirements and nightly scheduling



