We are shaping the future of online video and TV advertising
Site Reliability Engineer – Observability, Internal Tools
Location
Germany
Posted
12 days ago
Seniority
Senior
Job Description
smartclip
- Take full ownership of smartclip’s internal utility and platform tooling
- Focus your energy on the intersection of observability, automation, and developer infrastructure
- Operate and advance our observability stack (including Prometheus, Grafana, and Forgejo)
- Embed security engineering into the delivery process
- Navigate Linux systems and distributed tooling
Job Requirements
- Be motivated by systems thinking and deep technical curiosity
- Apply an Observability Mindset: Implement a clear strategy for metrics, logs, and traces
- Embrace Ownership: Live the "you build it, you run it" philosophy
- Nice-to-haves: Bring experience with GKE or EKS and Jenkins, Ansible, or Terraform
- Design and evolve production-grade setups on GCP or AWS
- Show us your contributions to open-source projects
- Turn your passion for root-cause analysis into blameless post-mortems
- Understand our systems end-to-end, maintain total flexibility, and contribute back to the open-source ecosystem
Benefits
- 30 days of vacation + Dec 24 & 31 off
- Smart Fridays (4-day week possible)
- Mobility (Germany ticket & JobRad)
- Sports & health offerings
- Mental health support
- Corporate benefits
- RTL+ access
More DevOps Engineer Jobs
Role Description
We are hiring a DevOps Engineer who is passionate about problem solving, scalable cloud engineering, automation and microservices, data architectures, complex data transformations, and handling large data streams on the fly, and who wants to see their work translated into real-life applications.
Where you will have impact
- Automate the deployment, operation, and monitoring of our applications in a reproducible and scalable way using GitOps principles.
- Enhance the productivity of over 150 engineers by rethinking and improving CI/CD pipelines and developer workflows.
- Take ownership of our cloud infrastructure, including setup, maintenance, scaling, and disaster recovery strategies.
- Drive the evolution of our tech stack by researching, experimenting with, and implementing new technologies to improve our platform.
- Collaborate closely with the security team to ensure all services and infrastructure adhere to strict security guidelines.
- Build out and maintain robust monitoring, alerting, and tooling to ensure platform reliability and performance.
- Provide expert support and guidance to your peers and other engineering teams in our open and collaborative environment.
- Drive engineering velocity by applying AI to speed up development and by using AI and automation to streamline complex debugging, root-cause investigations, and recurring tasks.
Qualifications
- Experience in a DevOps role, with a proven track record of being part of the design process.
- Familiarity with cloud platforms in general (like GCP, Azure, or AWS).
- Knowledge of Kubernetes, including what it is and how its components interact.
- You are a forward-thinking builder who views AI as a core component of modern architecture.
- A proactive mindset with a passion for getting things done and a high bar for quality.
- Fluent in English with strong communication skills.
Requirements
- Experience with other cloud platforms like AWS or Azure.
Technologies you will work with
- Google Cloud Platform
- Kubernetes
- Terraform
- Istio
- ArgoCD
- Prometheus
- Python
- Golang
- cdk8s
- Helm
- FoundationDB
- Mimir
- Grafana
- TypeScript
- Claude
- Gemini
Benefits
- Flexible time off: Autonomy to manage your work-life balance.
- Alan Flex benefits: 160€/month for food or nursery.
- Flexible compensation: Optional benefits through tax-free payroll deductions for food, transportation and/or nursery.
- Wellbeing support: Subsidized ClassPass subscription.
- Comprehensive health insurance: 100% Alan coverage for you, your spouse, and dependents.
- Impactful work: Shape products relied on by 85,000+ users worldwide.
- Referral bonuses: Earn rewards for bringing in new talent.
Diversity, equity, inclusion, and belonging
Thank you for considering a career with Lighthouse. We are committed to fostering a diverse and inclusive workplace that values equal opportunity for all. We welcome candidates from all backgrounds, regardless of age, gender, race, religion, sexual orientation, and disability. Our commitment to equality is part of our culture. If you require reasonable accommodation at any point during the application or interview process, please notify your recruiter.
Not ticking every box? No problem! We value diverse backgrounds and unique skill sets, and we encourage individuals from all walks of life to apply. If your experience looks a little different from what we've described, but you're passionate about what we do and are a quick learner, we'd love to hear from you!
- Drive our cloud infrastructure, CI/CD pipelines, and operational excellence across a growing portfolio of products and services
- Design scalable, secure, and resilient systems while driving automation, reliability, and developer velocity across the organization
- Work directly with senior leadership and systems architects
- Partner with Engineering, Product, Security, and IT teams across the globe
- Modernize infrastructure, sharpen observability, and build systems that scale with INNERGY’s growth
- Own design, provisioning, and evolution of INNERGY’s Azure cloud infrastructure
- Drive infrastructure-as-code standards across all environments
- Optimize for cost, performance, and fault tolerance
- Build and maintain CI/CD pipelines for fast, reliable software delivery
- Standardize deployment patterns across INNERGY’s product suite
- Support containerization and orchestration initiatives using Docker and Kubernetes
- Reduce toil through automation and self-service tooling
- Define and own SLO/SLA targets for production systems
- Establish observability standards: metrics, logging, tracing, and alerting
- Own disaster recovery strategy and ensure systems meet recovery objectives
- Lead incident response process, including runbooks, postmortems, on-call structure
- Coordinate with global counterparts for follow-the-sun coverage and round-the-clock reliability
- Harden infrastructure and automate security controls
- Support audit and compliance requirements
- Act as a key contributor in migrating workloads to Kubernetes (EKS)
- Standardize APIs currently running on OKE for the AWS stack
- Ensure compliance and best practices in multi-account environments using AWS Landing Zone Accelerator (LZA) and AWS Organizations
- Implement and maintain centralized authentication (SSO) integrated with AWS
- Provision and manage resources using Terraform
- Create and maintain CI/CD pipelines using GitHub Actions
DevOps Engineer, EU
Coralogix
Full-stack observability for logs, metrics, traces and security events with built-in cost optimization.
- Work in high-scale environments: the Coralogix data pipeline processes 55Tb of data each day
- Adopt cutting-edge technologies with end-to-end responsibility
- Build internal tools to expand our platform capabilities
- Collaborate with R&D to improve the stability and reliability of the system
- Lead the product roadmap: our product is designed for engineers, so our engineers promote, enhance, and play a crucial part in shaping it



