Filigran

Uncover Threats. Take Action. Home of OpenCTI, OpenBAS and more.

Senior Platform Engineer – SRE

DevOps Engineer | Full Time | Remote | Senior | Team 201-500 | Since 2022 | H1B No Sponsor

Location

France

Posted

8 days ago

Salary

Not specified

Seniority

Senior

Job Description

  • Design, build, and operate production-grade Kubernetes clusters on bare metal and cloud
  • Industrialize, automate, improve observability & monitoring
  • Continue to create a culture of service delivery excellence
  • Participate in on-call rotation, incident management, and post-incident reviews
  • Design and drive projects around DevSecOps practices in the company

Job Requirements

  • 6+ years of experience in SRE/Platform/Cloud engineering
  • Comfortable working in a remote, async-first, English-speaking environment
  • Strong technical skills:
      • Kubernetes
      • Linux
      • Ansible/Terraform
      • Observability stack (Grafana, Datadog, Prometheus)
      • Elasticsearch, RabbitMQ, Redis, MinIO/S3
      • Proficiency in at least one major cloud provider (AWS / Azure / GCP)
      • Proficiency in a development language: Python, Go, Java…
      • Practical experience with infrastructure security (network security, TLS, secrets, IAM/RBAC, vulnerability management)
  • Soft skills:
      • Passion for the open source world and culture
      • Team player who wants to share knowledge and mentor newcomers
      • Ability to take ownership of topics, combined with a strong collaborative mindset
      • Adaptable, able to navigate a scaling environment

Benefits

  • Competitive pay + equity - everyone shares in our success
  • Remote-first, flexible, and balanced - work that fits your life
  • Your setup, your choice - pick the gear that works for you
  • Twice-a-year gatherings - we meet in person for regional and global offsites to connect, collaborate, and strengthen our culture beyond the screen

More DevOps Engineer Jobs

DevOps Engineer

RemotePro.ph

We are a US-based IT services firm with a consistently growing and fully remote PH team.

DevOps Engineer | 8 days ago | Full Time | Remote | Team 51-200 | Since 2013 | H1B No Sponsor

In this role, your main responsibility is to own our Linux systems, handle deployments, and support the team that maintains and monitors the applications running on them. These include custom applications as well as more standard open source and proprietary ones. Our ideal candidate can also support at least basic Windows DevOps needs.

Responsibilities

  • Plan, design, and build our Linux and Windows applications and the systems that run them, and define their monitoring.
  • Implement client-requested integrations.
  • Design and plan updates and fixes, and support the team in deploying them.
  • Conduct root cause analysis when issues occur.
  • Investigate and resolve technical issues, and create or provide resources to help the team prevent them.
  • Develop scripts to automate processes and updates.
  • Provide technical support and design procedures for system troubleshooting and maintenance.

Philippines

Senior Service Reliability Engineer

Thoughtworks

Thoughtworks is a dynamic and inclusive community of bright and supportive colleagues who are revolutionizing tech. As a leading technology consultancy, we’re pushing boundaries through our purposeful and impactful work. Over 30 years of delivering extraordinary impact with clients. Helping clients solve complex business problems with technology as the differentiator.

DevOps Engineer | 8 days ago | Full Time | Remote | Team 10,001

Role Description

As a Senior Service Reliability Engineer (SRE) you will take a multifaceted approach to ensure technical excellence and operational efficiency within the infrastructure domain. Specializing in reliability, resilience and system performance, you take a lead role in championing the principles of Site Reliability Engineering. By strategically integrating automation, monitoring and incident response, you facilitate the evolution from traditional operations to a more customer-focused and agile approach. Emphasizing shared responsibility and a commitment to continuous improvement, you cultivate a collaborative culture, enabling organizations to meet and exceed their reliability and business objectives.

  • You will improve site reliability by building mechanisms/architectures that enable fault tolerance and faster median time to respond and median time to detect.
  • You will drive the integration of observability automation into the CI/CD pipeline.
  • You will handle production incidents, manage incident communication with clients and draft root cause analysis documents.
  • You will monitor performance of production systems and improve their scaling to ensure business goals are met within expected SLA and SLO metrics.
  • You will work closely with application development teams as an advisor on improving system reliability and assist in implementing reliability improvements.
  • You will improve system observability across multiple facets such as logging and metrics, reducing false alarms to eliminate unnecessary toil and improving process efficiency.
  • You will implement chaos engineering practices as necessary to test system reliability, setting up processes for such testing to be done regularly.
  • You have a clear understanding of client goals and business needs and set the direction for site reliability accordingly, e.g. achieving application availability with minimal or no disruption (99.999%) where the business requires it.

Qualifications

  • You have hands-on experience in programming and scripting languages such as Python, Go or Bash.
  • You have a good understanding of at least one public cloud, e.g. AWS, Azure or GCP.
  • You have had exposure to observability tools such as Grafana, Datadog, New Relic, ELK Stack, Dynatrace or equivalent and you are proficient in using data from these tools to dissect and identify root causes of system and infrastructure issues.
  • You are familiar with DevOps and GitOps practices.
  • You have a good knowledge of container-based architecture and orchestration tools such as Kubernetes, AWS EKS, Docker Swarm, Nomad, etc.
  • You understand technical architecture and modern design patterns, including microservices, serverless functions, NoSQL and RESTful APIs, with experience in fixing bugs, analyzing logs, building metrics and operational dashboards.
  • You are familiar with creating infrastructure resources that improve system reliability and follow the cloud provider's Well-Architected Framework principles: reliability, security, cost optimization, performance efficiency and operational excellence.

Requirements

  • You have strong communication and articulation skills, and are proficient in English.
  • You have good people skills with an emphasis on negotiation and close collaboration with multiple cross-functional teams from the client side and/or Thoughtworks.
  • You solve challenging problems and difficult-to-debug issues with a never-give-up attitude.
  • You have the ability to work under pressure and with composure during production incidents.
  • You can confidently recommend improvements backed by strong technical arguments to client stakeholders or application development teams.
  • You are able to understand requirements provided by the client on both technical and business aspects and break them down for successful implementation.
  • You have a strong drive and ownership mentality, with a willingness to sign up for and deliver work when called upon, without being too concerned about role boundaries.
  • You're willing to be part of a rotation- and need-based team that is available 24x7.

Benefits

  • There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you.
  • Your career is supported by interactive tools, numerous development programs and teammates who want to help you grow.
  • We see value in helping each other be our best and that extends to empowering our employees in their career journeys.

Singapore
Full Time | Remote | Team 51-200 | Since 2011 | H1B No Sponsor

• In this role, reporting to the Head of DevOps, you will be mainly responsible for CI/CD, infrastructure settings and security, and maintenance of systems, virtual machines, Kubernetes clusters, and cloud applications.

Spain

Senior Site Reliability Engineer, Linux

Akamai Technologies

Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences helping billions of people live, work, and play every day. With the world's most distributed compute platform from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.

DevOps Engineer | 8 days ago | Full Time | Remote | Team 5,001-10,000 | H1B Sponsor

  • Collaborating with our support, operations and engineering teams to investigate and troubleshoot complex problems.
  • Developing processes, plans, and infrastructure to deploy new software components and updates safely and efficiently at scale.
  • Participating in on-call rotations, guiding restoration and repair of service-impacting issues.
  • Improving our system monitoring and analysis platform to speed error detection and remediation, enhancing performance and reliability.

India