DevOps / Platform Engineer
Location
Germany
Posted
21 hours ago
Seniority
Mid Level
Job Description
DevOps / Platform Engineer
Pont Connects e.K.
Role Description
For a long-standing client of mine, I am looking for a DevOps / Platform Engineer (m/f/d) for a REMOTE, permanent position.
- Develop and implement Infrastructure as Code (IaC) solutions to automate the provisioning and management of cloud resources.
- Monitor and optimize system performance to ensure high availability and scalability of the platform.
- Collaborate with development teams to integrate CI/CD pipelines and promote best practices in software delivery.
- Identify and resolve security issues within the infrastructure to protect sensitive data.
- Create and maintain documentation and guides for using and maintaining the platform.
Qualifications
- Very good knowledge of GitLab CI
- Solid experience with Docker and containerization
- Hands-on experience with build and release automation
- Good knowledge of Kubernetes (deployments, debugging, Helm)
- Understanding of automated testing in a CI/CD context (integration desirable)
- Very good German and English skills
- Experience with deployments on Debian or Ubuntu
- Knowledge of Infrastructure as Code (e.g. Ansible, Terraform)
- Python and/or Bash skills for scripting and tooling
- Experience modernizing legacy systems
- Operations understanding (operation at customer sites, updates, monitoring, stability)
More DevOps Engineer Jobs
Deployment Engineer
ng-voice
The Hyperscaling IMS Solution: Infrastructure-agnostic, cost-efficient, automated.
• Working in the Client Deployment Team, guaranteeing smooth delivery of solutions, including high-/low-level design with final technical deployment and integration of the solutions.
• Planning, preparing, and setting up lab, test, and production environments for client and partnership deployments.
• Installing and configuring cloud-native solutions using Kubernetes.
• Automation, automation, automation: focusing on technical deployment and identifying ways to speed up these processes.
• Maintaining, monitoring, administering, and updating current deployments, ensuring the stability of solutions.
• Helping with incident management, providing 2nd-level support, error analysis, and classification.
• Coordinating customer requests and disseminating client information.
• Unlocking operational efficiency in deployment and support procedures.
• Work as part of an agile software development DevOps team to build and maintain critical infrastructure.
• Participate in planning and design.
• Complete sprint work collaboratively and punctually.
• Assist in the development and management of CI/CD processes for efficient code deployment and system updates.
• Support the operation and maintenance of cloud infrastructure and services to ensure performance, security, and cost-efficiency.
• Contribute to the management of microservices architecture and orchestration using container technologies.
• Collaborate with software development teams to troubleshoot and resolve system issues and outages.
• Aid in the implementation of best practices for system hardening and configuration management.
• Ensure security compliance and keep vulnerability remediation within SLAs.
• Participate in on-call rotations to support system operations.
• Help document processes and monitor system logs and activity for potential issues.
Role Description
We currently have several large-scale projects and are expanding our infrastructure team. Our product is an advanced platform for creating and managing AI agents. It can be deployed directly inside a customer’s infrastructure and delivered as an enterprise solution, while also being available as a SaaS version. Under the hood, there is real-time voice and telephony, GPU and LLM inference, streaming analytics, and all of this runs both in the cloud and on-prem, including in banking environments. There is a lot of infrastructure; it is complex, interesting, and sometimes at the edge of what is possible. That is why we are looking for a strong SRE who, like us, cares about making systems transparent, reliable, and built the right way.
This is a role for a strong, independent engineer. A Senior SRE with real influence and a voice in how things are built and operated. You will also handle DevOps tasks for the team, but your main focus and area of expertise should be SRE: reliability, observability, incident management, and performance under load.
Qualifications
- 5+ years in SRE/DevOps.
- Deep, practical understanding of Docker and Kubernetes.
- Mature understanding of metrics and alerts.
- Practical experience with Prometheus, Alertmanager, and Grafana.
- Experience with SLIs/SLOs, reliability management, incident investigation, and postmortems.
- Experience with load testing and basic capacity planning.
- Python programming skills.
- Cloud experience with GCP and/or AWS.
- DevOps fundamentals: CI/CD and infrastructure as code.
- Ownership mindset.
- Strong communication with developers.
- Willingness and ability to mentor, teach, and share knowledge.
- Analytical mindset.
- Proactivity.
- Strong attention to detail and reliability.
Requirements
- Experience using AI agents for routine and recurring tasks (nice to have).
- Real-time telephony: SIP, FreeSWITCH, RTP, WebRTC (nice to have).
- GPU/ML serving: Triton, vLLM, RunPod, Nebius, Lambda, run:ai, DCGM (nice to have).
- Streaming data and analytics: Kafka, ClickHouse (nice to have).
- Deep experience with IaC and GitOps (nice to have).
- Experience working in isolated and highly secure environments (nice to have).
- Experience preparing systems for significant growth in load (nice to have).
Responsibilities
- Responsible for the reliability of our services: SLIs/SLOs, availability, and identifying and eliminating bottlenecks.
- Set up monitoring for services, metrics, alerts, and dashboards.
- Build and maintain Grafana dashboards.
- Run load testing, analyze results, and provide recommendations.
- Investigate incidents, participate in on-call rotations, and write postmortems.
- Work closely with developers to communicate and defend your position.
- Develop and support Kubernetes-based infrastructure across clouds.
- Take part in delivering and supporting the platform for customers.
- Mentor colleagues and help raise the engineering bar across the team.
Benefits
- The team has built award-winning AI products for tech corporations.
- Cutting-edge tech stack: Speech Technologies, NLP, Generative AI.
- High engineering bar and real ownership.
- Fast career progression.
- Startup pace with enterprise stability.
- Fully remote across Europe.
- 21 vacation days + public holidays + 5 sick days.
- Private English lessons via Preply.
Senior DevOps Engineer
Kyriba
Transform how you use liquidity as a dynamic vehicle for growth and value creation.
• Contribute to infrastructure automation across the AWS platform using Terraform Enterprise, Ansible, Harness, Kubernetes (including operators, CRDs, and cluster lifecycle management), and GitOps practices
• Support and evolve continuous delivery pipelines, ensuring reliable and repeatable deployments across environments (PRE, SBX, PRD)
• Build and maintain self-service capabilities so that developers and engineering teams can autonomously claim and consume infrastructure resources (storage, compute, databases, messaging) through Kubernetes-native APIs and GitOps workflows, without manual intervention
• Leverage AIOps practices to improve platform reliability: intelligent alerting, anomaly detection, AI-assisted incident triage, and automated remediation
• Write, maintain, and improve production runbooks to ensure operational procedures are automated, documented, and accessible to the team
• Provide L2 support on automation systems and pipelines already in place, troubleshooting issues across the platform ecosystem
• Participate in on-call rotations and contribute to incident response and post-mortem processes
• Collaborate actively with Production Ops, DBA, and application engineering teams on platform improvements and migrations
• Contribute to FinOps by identifying and implementing AWS cost optimization opportunities
• Document architecture decisions, operational procedures, and contribute to team knowledge sharing
• Own, automate and improve Kyriba's storage platform (NetApp, S3, EBS), ensuring reliability, performance, and DR readiness
• Design and implement storage architectures on AWS with high availability and fault tolerance
• Provide L2 support on storage incidents, leading investigations on performance degradation, availability issues, and data integrity events




