Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Mid Level | Team 51-200

Location

Germany

Posted

50 days ago

Seniority

Mid Level

Job Description

Site Reliability Engineer

deepset

TL;DR

We're hiring a Site Reliability Engineer to own and evolve deepset's cloud and customer infrastructure end to end. You'll work across SaaS, private cloud, and on-prem environments to make our self-hosted platform production-ready, drive CI/CD and GitOps maturity, and reduce complexity at scale. Your work will directly shape how deepset's AI platform is built, deployed, and scaled for our own cloud and for customers running it in their own environments.

Why deepset

At deepset, we’re on a mission to make custom AI solutions accessible to every organization. With Haystack, thousands of developers build advanced LLM applications every day, while our enterprise-ready AI Platform helps companies turn large language models into business value. We’re remote-first, flexible, and built on a culture of trust and ownership. You’ll collaborate with top-tier tech talent, tackle meaningful challenges, and help transform complex AI into solutions that are simple, powerful, and ready for the real world.

What you will do

You won’t just “keep things running” - you’ll help define how our platform is built, deployed, and scaled across cloud and customer environments.

- Build and operate real-world infrastructure: Design, configure, and evolve infrastructure that runs both in our cloud and inside customer environments (SaaS, private cloud, on-prem).
- Make self-hosted production-ready: Help us deliver a production-grade, self-hosted platform that can be deployed on any Kubernetes setup in weeks - not months.
- Drive automation & platform maturity: Improve CI/CD pipelines, GitHub workflows, and GitOps setups so teams can ship faster with confidence.
- Reduce complexity and cost: Continuously simplify systems and optimize infrastructure spend without compromising performance or reliability.
- Shape how we build: Champion best practices in reliability, scalability, and security across the organization, not as rules, but as working systems.

Requirements

- 2-5 years of experience working with large-scale production infrastructure
- Fluent German language skills
- Experience with distributed or service-oriented architectures
- Hands-on expertise with:
  - AWS
  - Kubernetes
  - CI/CD and GitOps (e.g. ArgoCD)
- Working knowledge of Infrastructure as Code (Terraform preferred)
- Solid troubleshooting skills - you can debug across systems, not just within one layer
- A pragmatic mindset: you balance speed, simplicity, and reliability
- Ownership and accountability - you take responsibility for systems end-to-end
- Ability to work independently while staying aligned with the team’s goals

Nice to have

- Familiarity with observability stacks (e.g. Datadog, Prometheus)
- Experience optimizing cloud costs at scale
- Interest or experience in Machine Learning / LLM systems
- Experience improving developer experience and platform tooling using AI agents
- Contributions to SRE practices like postmortems, SLIs/SLOs, and reliability engineering culture

Benefits

- Remote-first setup with flexible hours & tech of your choice
- 30 days vacation + extra days for family sick leave
- Competitive salary & stock options for every team member
- Monthly sports & mental health support allowance with Oliva
- Annual learning & development budget
- Monthly team socials & in-person meetups
- Dog-friendly Berlin HQ

More DevOps Engineer Jobs

Senior DevOps Engineer

Abacum

Abacum is the leading business planning platform that empowers Finance teams to drive performance.

DevOps Engineer | 50 days ago
Full Time | Remote | Team 51-200 | Since 2020 | H1B No Sponsor

• Design and implement our systems to be efficient, scalable, accountable, and secure
• Team up with other Engineers to perform experiments and test new ideas
• Build a strong DevOps culture and tooling that enable our delivery teams to be autonomous while providing best practices (security, observability, scalability, performance, etc.)
• Deploy and manage our infrastructure provisioning
• Develop and drive real time observability solutions that provide visibility into system health
• Provide technical guidance and educate team members and coworkers on operations and cloud best practices
• Continuously improve development delivery CI/CD
• Ability to develop and implement security measures related to the development processes and operational needs driven by our security and compliance team
• Build and scale our Kubernetes clusters and workloads
• Manage and scale our cloud databases
• Participate in a 24x7 on-call rotation

Europe
Job Closed

Role Description

We are looking for a skilled and collaborative DevOps Engineer to join our Cloud Security and Infrastructure team. As a DevOps Engineer, you will be responsible for building, maintaining, and optimizing cloud-based infrastructure. You will work alongside engineering teams to solve complex operational problems, design solutions, and implement practices that drive efficiency at scale. Reporting into the Engineering Manager, DevOps & Infrastructure, your contributions will be crucial in promoting collaboration between engineering teams to ensure the delivery and management of robust services.

- Work closely with Engineering stakeholders to design and maintain a reliable, scalable, and secure platform.
- Optimize and enhance deployment tooling and infrastructure, including creating and maintaining new CI/CD pipelines, improving infrastructure for cost-effectiveness and performance.
- Collaborate with the Engineering team to identify areas for improvement and implement innovative solutions.
- Provide expert DevOps guidance for existing projects and new initiatives.
- Set up and maintain test environments for both manual and automated testing.
- Conduct vulnerability scans and penetration tests, analyzing results and acting on findings.
- Plan and execute ongoing routine application maintenance tasks.
- Participate in on-call rotations for monitoring, alerting, and incident response.
- Identify cost-saving opportunities and efficiency improvements within our infrastructure.

Qualifications

- 3+ years of experience in DevOps, Site Reliability Engineering, or automation engineering.
- Strong understanding of cloud technologies, particularly AWS.
- Hands-on experience managing systems and services on a cloud platform (AWS).
- Solid understanding of database technologies such as MongoDB.
- Knowledge of Docker, Kubernetes, and Linux Systems.
- Proficiency with Terraform and Packer for infrastructure as code.
- Experience deploying and managing cloud services, monitoring and alerting systems, and handling critical issues.
- Agile development experience and familiarity with CI/CD practices.
- Strong attention to detail, problem-solving, and troubleshooting skills.
- Exceptional organizational skills and the ability to manage multiple projects concurrently.
- Ability to work independently and thrive in an autonomous environment with minimal supervision.
- Excellent communication and collaboration skills.
- A passion for continuous learning and technical curiosity.

Requirements

- AWS DevOps Engineer certification.
- Experience with cloud-based security protocols.

Benefits

- We’re a remote-first company built on trust, autonomy, and accountability.
- Minimum of 3 weeks vacation, 5 sick days, and 6 personal/flex days, plus a company-wide winter holiday shutdown.
- Health, dental, and vision, long-term disability, and a Health Spending Account (HSA).
- Flexible parental leave benefits, including top-ups.
- A dedicated work-from-home allowance to get you set up for success.

United States + 1 more
All locations: United States | Canada
C$95K - C$130K / year

Breaking and Trending News Reporter

Advance Media

Catalyst IQ is a digital marketing and technology leader formed by uniting Advance Automotive’s top brands—Adpearance, Fox Dealer, Search Optics, and ZeroSum. We empower automotive dealers and manufacturers to grow with precision and profitability through smarter, faster, and more comprehensive solutions. As a part of Advance Local and built on a foundation of over a decade of proprietary technology development, 23 billion data points, 22 OEM certifications, and a national sales force, Catalyst IQ combines cutting-edge innovation with human expertise to deliver real-time insights and actionable intelligence that accelerate sales.

DevOps Engineer | 50 days ago

Role Description

Strengthening and empowering all of the communities we serve.

NJ Advance Media has an exciting opportunity for a Breaking and Trending News Reporter on our growing AI content team. We're seeking a candidate who embraces emerging technology, thrives in a fast-paced environment, and reports with immediacy and accuracy. This new segment of our news team covers a wide range of topics including:

- Business
- Real estate
- Weather
- Public safety

You’ll be producing a high volume of stories while working on cutting-edge projects to harness the efficiency offered by AI. You'll be coordinating closely with the newsroom's largest team of reporters and working side-by-side with a friendly, dedicated, and ambitious group of reporters and editors with decades of experience covering New Jersey from every angle. The base salary range is $53,000 - $55,000 per year.

Qualifications

- A degree in journalism or communications or equivalent education and work experience with a proven ability in journalism reporting and writing
- A minimum of one year in journalism with a proven ability in reporting and writing required
- Experience in breaking news and headline optimization
- An interest and enthusiasm to forge new AI-driven workflows to improve newsroom-wide efficiency
- An ability to work independently under deadline pressure and prioritize tasks appropriately
- A sound understanding of news writing, journalistic ethics, and story structure
- Experience building or working with AI agents is a plus, but not required
- Experience using social media to find stories and promote content
- Ability to work remotely
- Availability to work weekends, some holidays, and occasional irregular hours to meet the needs of the breaking news staffing

Requirements

- This job requires reliable transportation to meet with sources and/or cover events in New Jersey during occasional traditional breaking news shifts.

United States
$53K - $55K / year

Site Navigator - Start Up & Contract - Portugal

Fortrea

Fortrea is a contract research organization (CRO) that provides advanced laboratory-focused services that help change lives. On a mission to deliver “life-cha

DevOps Engineer | 50 days ago

The Site Navigator II plays a key role in coordinating and managing site-level activities during the study start-up phase through maintenance and closeout. Acting as the primary liaison between investigative sites, sponsors, and internal teams, the role ensures regulatory compliance, efficient site activation, and consistent progress toward project milestones.

- Coordinate and oversee activities from feasibility through activation extending through the maintenance phase of the study and closeout, ensuring compliance with ICH/GCP, local regulations, SOPs, and project timelines.
- Act as the primary point of contact for investigative sites, managing site engagement, feasibility activities, and ongoing site support.
- Identify, assess, and select suitable research sites, including conducting remote pre-study visits and managing supporting documentation.
- Manage the collection, quality review, tracking, and maintenance of essential regulatory documents, ensuring ongoing site compliance.
- Liaise with IRB/IEC, Regulatory Authorities, and third bodies as applicable, in collaboration with global regulatory teams.
- Perform initial contract and budget negotiations with the sites, as well as amendments where applicable.
- Support site initiation activities by coordinating with CRAs, vendors, and supply teams, and assisting with SIV preparation.
- Perform remote visits as required by the monitoring plan including remote monitoring that may require SDV/SDR.
- Track progress, proactively identify and escalate risks or issues, and ensure TMF accuracy and audit readiness at all times.

Experience Requested

The ideal candidate brings hands-on experience in clinical study start-up and regulatory processes, with the ability to manage multiple stakeholders and timelines in a regulated clinical research environment.

- Minimum of 2+ years of experience in clinical development, study start-up, or regulatory processes.
- Strong working knowledge of ICH/GCP, Regulatory Authority and IRB/IEC requirements, and investigator start-up documentation.
- Demonstrated ability to manage multiple priorities and deadlines while ensuring regulatory and operational compliance.
- Excellent communication, problem-solving, and negotiation skills, including experience supporting site contracts and budgets.

Learn more about our EEO & Accommodations request here.

Portugal