Job Closed

This listing is no longer active.

Senior Software Engineer — KVM Virtualization (Cloud Hypervisor/QEMU/Libvirt) & Gardener Kubernetes Platform

Location

All locations: CET (UTC+1) | UTC-5 to UTC-3 | GMT (UTC+0) | EAT (UTC+3) | IST (UTC+5:30) | JST (UTC+9) | ACT (UTC+9:30) | AST (UTC-9) | PST (UTC-8) | CST (UTC-6)

Posted

37 days ago

Salary

0

Seniority

Senior

Job Description

Nemensis Ag

Role Description

We are seeking Senior Software Engineers with strong Linux OS, KVM virtualization, and Gardener Kubernetes expertise to provide expert-level support and consulting. You will troubleshoot complex production issues, optimize performance, harden systems, and strengthen integration across:

- Gardenlinux
- KVM-based virtualization (Cloud Hypervisor/QEMU/Libvirt)
- Gardener-managed Kubernetes clusters

Key Responsibilities

- Gardenlinux (Debian-based OS)
  - Configure, deploy, and maintain Gardenlinux-based environments
  - Troubleshoot OS-level issues (kernel, boot, packages, system services)
  - Debug custom image builds and runtime behavior
  - Provide recommendations for performance tuning and security hardening
- KVM Virtualization Stack (Cloud Hypervisor, QEMU, Libvirt)
  - Configure and integrate KVM-based virtualization environments
  - Analyze and resolve hypervisor/VM-level incidents
  - Optimize performance across compute, networking, and storage
  - Debug and tune Cloud Hypervisor and Libvirt configurations
- Gardener Kubernetes Platform
  - Troubleshoot incidents in Gardener control plane and shoot clusters
  - Perform root cause analysis for provisioning, scaling, and upgrade failures
  - Review configurations and propose improvements
  - Support integration between Gardener, Gardenlinux, and KVM-based nodes

Qualifications

- Strong hands-on expertise with Debian-based Linux
- Kernel configuration and low-level troubleshooting (boot/perf/kernel)
- Experience with OS image customization and build pipelines
- Strong knowledge of systemd, package management, and OS hardening
- Familiarity integrating OS image build/deployment into CI/CD
- Advanced understanding of KVM, QEMU, and Libvirt architectures
- Experience configuring/troubleshooting Cloud Hypervisor
- Deep knowledge of virtualization: Networking (bridges, VLANs, SDN concepts), Storage (NFS)
- Hardware virtualization fundamentals, including NUMA alignment
- Automation/scripting skills in Go, Python, and/or Bash
- Host performance tuning and low-level debugging experience
- Strong Kubernetes internals knowledge (control plane, networking, scheduling)
- Hands-on experience with Gardener (seed/shoot operations and troubleshooting)
- Cluster lifecycle management: upgrades, node troubleshooting, scaling
- Observability proficiency: Prometheus (and ideally Perses)
- Ability to lead/author incident RCA and contribute to post-mortems

More Platform Engineer Jobs

Full Time | Remote | Team 501-1,000 | Since 2005 | H1B No Sponsor

Reddit is a community of communities. It’s built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 121 million daily active unique visitors, Reddit is one of the internet’s largest sources of information. For more information, visit www.redditinc.com.

Team

The ML Indexing & Retrieval Platform team at Reddit is responsible for building and scaling the core infrastructure that powers machine learning driven recommendations. We design and maintain systems for ML data ingestion, low-latency retrieval services, and end-to-end lifecycle management of data. With a focus on performance, reliability, and scalability, we enable real-time access to high-quality data that supports a wide range of applications, including Content Understanding, semantic and lexical retrieval, and GenAI applications.

How You'll Have Impact

You’ll lead the development of next-generation ML Indexing & Retrieval systems, owning the full lifecycle from ideation to production and going beyond incremental improvements to reimagine core platform capabilities. As part of a high-impact, cross-functional team, you’ll solve complex technical challenges to build scalable, reliable platforms that empower developers to efficiently ship critical ML features.

- Languages: Go, Java, Python, or any object-oriented programming language
- Frameworks: Flink, Airflow, Spark for large-scale batch & stream processing
- Databases: Familiarity with Vector, Lexical & Key-Value Databases
- Tools: Kubernetes, Docker, AWS, GCP

What You’ll Do

- Lead the technical strategy, architecture, and implementation of Reddit’s next-generation ML Indexing & Retrieval engine, integrating capabilities across lexical and vector indexing, low-latency retrieval, and emerging GenAI applications.
- Partner closely with product engineers across Content Understanding, Search, Feeds, Ads, Growth, and Safety to deliver high-quality experiences.
- Define best practices for observability, reliability, and operational excellence in large-scale distributed systems.
- Mentor and guide engineers in designing scalable infrastructure and adopting robust DevOps and SRE principles.
- Collaborate with infrastructure and ML teams to ensure the platform evolves to meet the needs of Reddit’s growing user base and diverse content ecosystem.

Who You Might Be

- 10+ years of experience in software engineering, specializing in Indexing and Retrieval systems.
- 3+ years in technical leadership, architecting and scaling distributed systems in production environments.
- Deep expertise in large-scale data platforms, including batch indexing and stream processing.
- Proven experience designing and operating large-scale, low-latency retrieval services.
- Expertise in lexical and vector search retrieval technologies, such as Milvus, Vespa, or Elasticsearch.
- Skilled in designing cloud-native architectures and managing containerized workloads using Kubernetes and AWS/GCP.
- Adept at translating complex technical challenges into clear, actionable strategies.
- Strong communicator and mentor who leads through collaboration, influence, and technical excellence.

Benefits

- Comprehensive Healthcare Benefits and Income Replacement Programs
- 401k with Employer Match
- Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
- Family Planning Support
- Gender-Affirming Care
- Mental Health & Coaching Benefits
- Flexible Vacation & Paid Volunteer Time Off
- Generous Paid Parental Leave

#LI-Remote

Pay Transparency

This job posting may span more than one career level. In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Additionally, Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave. To learn more, please visit https://www.redditinc.com/careers/.

To provide greater transparency to candidates, we share base salary ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar stage growth companies. Final offer amounts are determined by multiple factors, including skills, depth of work experience, and relevant licenses/credentials, and may vary from the amounts listed below.

The base salary range for this position is: $266,000–$372,400 USD

In select roles and locations, the interviews will be recorded, transcribed and summarized by artificial intelligence (AI). You will have the opportunity to opt out of recording, transcription and summarization prior to any scheduled interviews. During the interview, we will collect the following categories of personal information: Identifiers, Professional and Employment-Related Information, Sensory Information (audio/video recording), and any other categories of personal information you choose to share with us. We will use this information to evaluate your application for employment or an independent contractor role, as applicable. We will not sell your personal information or disclose it to any third party for their marketing purposes. We will delete any recording of your interview promptly after making a hiring decision. For more information about how we will handle your personal information, including our retention of it, please refer to our Candidate Privacy Policy for Potential Employees and Contractors.

Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.

United States
$266K - $372K / year

AI Engineers + Platform Architect

EY

Building a #BetterWorkingWorld by providing trust through assurance and helping organizations grow, transform & operate.

Full Time | Remote | Team 10,001+ | Since 1989 | H1B Sponsor

Role Description

The Senior AI Engineer designs, builds, and ships enterprise-grade AI/ML and LLM-based solutions. This role focuses on hands-on engineering, high-quality delivery, and strong collaboration with cross-functional teams.

Key Responsibilities

- Design, build, and deploy AI/ML and LLM-based solutions in enterprise environments.
- Collaborate with cross-functional teams (Data Engineering, Cloud, Product) to deliver scalable AI systems.
- Ensure high engineering standards, maintainability, and best practices.
- Participate in code reviews, architecture discussions, and solution design.
- Support continuous improvement of AI delivery processes and tooling.

Qualifications

- Advanced Python (3–6 years)
- FastAPI
- scikit-learn
- API design
- Clean code
- Preferred: intermediate SQL, design patterns (clean architecture/hexagonal), microservices, advanced testing, Docker

Requirements

- Code quality; API design; troubleshooting; software architecture discipline; applied SQL
- LLMs, RAG & Agents:
  - End-to-end RAG; LangChain/LangGraph
  - Vector search (FAISS or similar)
  - Fine-tuning (LoRA/QLoRA)
  - Advanced evaluation (RAGAS/TruLens/DeepEval)
  - Agent design
  - AutoGen
  - Preferred: LlamaIndex; custom retrievers
- Cloud (Azure or Databricks):
  - Azure OpenAI; Azure AI Search; Azure ML; service integration; AKS/Container Apps; API Management
  - Advanced MLflow (registry/tracking/serving); Delta Lake; Unity Catalog; Feature Store; Vector Search
  - Preferred: Workflows/DLT
- MLOps & Delivery:
  - CI/CD (GitHub Actions/Azure DevOps)
  - Docker
  - AKS/Kubernetes
  - End-to-end ML pipelines
  - Basic monitoring (latency, cost, failures)
  - Preferred: AI observability (tracing/telemetry); advanced Bicep/Terraform
- ML Fundamentals:
  - Classic models
  - Advanced metrics & trade-offs
  - When to use classic ML vs. LLMs
  - Preferred: Advanced/ensemble models
- English: Fluent B2+ technical communication
  - Autonomy in English, technical clarity
  - Proactive
  - Good at gathering and handling requests
  - Proactive communication

Latin America (LATAM)
Job Closed

Five9 CCaaS Platform Engineer

Crükus Virtual Staffing

We provide virtual staffing solutions for insurance agents, real estate agents, and small to medium sized businesses.

Contract | Remote | Team 11-50 | H1B No Sponsor

• Implement and configure Five9 contact center environments
• Build and maintain IVRs, ACD routing, queues, skills, and agent configurations
• Configure inbound and outbound campaigns and dialing logic
• Manage users, roles, permissions, and tenant settings
• Support CRM and third party integrations, including testing and troubleshooting
• Participate in go live activities and post deployment stabilization
• Provide operational and managed services support for active client environments
• Troubleshoot platform issues and escalate when needed
• Maintain technical documentation and configuration records

India
$2K - $3K / month
Job Closed

Backend Software Developer / Platform (Kubernetes, Cloud)

netgo group GmbH

At netgo, we believe in real teamwork, openness, and continuous development. Our values form the foundation of our daily work: Our diversity. Our strength. nahdran (close by): closeness builds trust. machen (doing): we take responsibility and get things done. TechmitHerz (tech with heart): because we love what we do and stay up to date. zusammenwachsen (growing together): we develop together and learn from one another. We understand leadership as responsibility: we communicate openly, create clarity, foster development, and model what we say and expect. Our principles provide orientation and guidance and are meant to empower, not control. Want to help shape things rather than just do a job? Then you are in the right place with us. #bepartofnetgo, because our hearts beat for IT & software! Interested? If you have any questions, feel free to contact us at the email address given. We look forward to hearing from you! Email: jobs@netgo.de

Role Description

Want to do more than develop interfaces and actively help shape a platform? Then join us in building the foundation for modern DMS and integration solutions: scalable, future-proof, and close to real use cases.

Your Tasks

- Backend development: You develop high-performance services and APIs for our middleware and platform solutions.
- Cloud & Kubernetes: You work on our Kubernetes platform and develop scalable, cloud-native services.
- Interfaces & integration: You connect DMS systems (e.g. d.velop, ELO) with ERP, finance, M365, and other system landscapes.
- App & service development: You develop your own applications and services for our customer solutions.
- Architecture & performance: You think in clean architectures and build solutions that scale sustainably.

Why This Role Is Exciting

- You work on a real platform, not just on individual projects.
- You operate at the intersection of cloud, integration, and DMS.
- You have direct influence on architecture decisions and product development.
- You work closely with consulting, development, and customer projects.

Qualifications

- Experience in backend development (e.g. with .NET, C#, Node.js, or similar technologies).
- Experience working with APIs, REST, and integrations.
- Ideally, knowledge of Docker and/or Kubernetes.
- Understanding of cloud architectures.
- Interest in the DMS/ECM field (not a must; you will grow into it with us).

Benefits

- Work-life balance & flexibility: Plan your day flexibly with trust-based working hours.
- Enjoy 30 days of vacation per year.
- Working from another EU country is possible for up to 3 months per year.
- Fitness & health: Free training at the netgo fitness club and a subsidized Urban Sports Club membership.
- Regular health programs in cooperation with Techniker Krankenkasse.
- Training & certifications: Continuing education offers and vendor certifications to stay up to date.
- Certifications possible at the in-house Pearson VUE test center.
- Company car & mobility: Exclusive JobRad offers for bicycles.
- Workwear: Budget for workwear, conveniently ordered in the online shop.
- Security & provision: Subsidies for retirement provision and support with wealth building.
- Discounts & goodies: Benefits at more than 1,500 providers and access to Ticketsprinter.
- Family-friendly: Childcare subsidies.

All locations: United States | United Kingdom | Canada | Germany | France | Brazil | Australia | Estonia | Japan | Ecuador