Job Closed
This listing is no longer active.
The AI Chatbot Generator that helps you create conversational experiences that turn into revenue.
Senior Site Reliability Engineer – Platform Engineering
Location
Spain
Posted
178 days ago
Salary
Not specified
Seniority
Senior
Job Description
Landbot
- Build and maintain the Internal Developer Platform.
- Design and implement core platform services (CI/CD pipelines, infrastructure provisioning, and observability systems).
- Design and implement developer-facing tools, APIs, and automation that enable application teams to deploy, scale, and operate services independently.
- Manage and optimize cloud resources, Kubernetes clusters, databases, and networking for reliability, scalability, and cost optimization.
- Establish SLIs, SLOs, and error budgets to balance reliability with feature velocity.
- Design and maintain observability solutions for real-time visibility and proactive issue detection.
- Implement alerting strategies that reduce noise and focus on actionable signals.
- Lead incident response, conduct blameless postmortems, and drive continuous improvement.
- Partner with application teams (platform customers) to understand their workflows and pain points, gather feedback, and prioritize improvements aligned with business objectives.
- Create and maintain documentation, runbooks, and knowledge bases that reduce knowledge silos and enable self-service.
- Drive decisions through written formats (RFCs, ADRs) that document architectural choices.
- Measure platform success through developer productivity metrics, adoption rates, and toil reduction.
Job Requirements
- 3-5 years of experience in Site Reliability Engineering, Platform Engineering, Infrastructure Engineering, or DevOps roles, or as a full-time freelancer in similar roles.
- Experience reducing operational toil through automation and self-service tooling.
- Experience building internal platforms or developer tooling, or enabling platform capabilities from application teams, with a platform-as-product mindset focused on developer experience.
- Experience managing production infrastructure and establishing reliability practices (SLIs/SLOs, observability, incident response).
- Strong working knowledge of Kubernetes and the container ecosystem.
- Experience with cloud platforms (GCP, AWS, Azure).
- Proficiency with Infrastructure as Code tools.
- Knowledge of Kubernetes manifest management tools and GitOps practices.
- Experience with observability platforms.
- Knowledge of OpenTelemetry is a plus.
- Strong shell scripting skills.
- Experience with Python or Go is a plus.
- Experience with Linux, database management, networking, and distributed systems.
- Solid knowledge of CI/CD pipelines.
- Ability to work effectively in paired/mob programming and asynchronous work environments.
Benefits
- Hybrid work model: flexibility to work remotely, from our Barcelona office 🏙️, or a combination of both.
- Collaborative work environment.
- Flexible working hours.
- Paid time off and flexible holidays: 26 paid days per year (23 regular days + December 24th & 31st) 🎉, plus one additional day off on your birthday.
- Annual budget for training and professional development 📚.
- Transportation ticket 🚋.
- English and Spanish lessons 🇬🇧 🇪🇸.
- Flexible compensation plan through Cobee.
- Team-building activities.
- Referral program with bonuses for bringing in talented professionals.