Job Closed

This listing is no longer active.

Webflow

Webflow is the way to design, build, and launch powerful websites visually — without coding.

Senior Site Reliability Engineer, Observability

DevOps Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 2013 | H1B Sponsor

Location

Argentina

Posted

76 days ago

Seniority

Senior

Bachelor Degree | 5 yrs exp | English

AWS, Cloud, Distributed Systems, Docker, ElasticSearch, Google Cloud Platform, Grafana, JavaScript, Kubernetes, MongoDB, Node.js, PostgreSQL, Prometheus, React, Terraform

Job Description

• Improve reliability and stability of Webflow’s customer-facing, production infrastructure.
• Ensure platform security and scalability for users as projects are launched.
• Help define and implement observability practices, enabling engineers to confidently ship and operate services in production.
• Build and maintain AI-powered agents and automation that help engineers surface insights faster, reduce alert fatigue, and accelerate incident resolution.
• Participate in and improve on-call and incident response processes.

Job Requirements

BS / BA college degree or relevant experience.
Business-level fluency in reading, writing, and speaking English.
5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment that allows for little to no downtime.
Hands-on experience with observability platforms and tooling such as Datadog, Grafana, Prometheus, ElasticSearch or similar.
Experience with OpenTelemetry or similar instrumentation frameworks for collecting metrics, traces, profiles and logs across distributed services.
Experience defining and operationalizing SLOs/SLIs at scale.
Experience navigating and scaling multi-tier cloud environments on either AWS or GCP.
Experience with container-centric architectures built with tools like Docker and Kubernetes (EKS, GKE, AKS, etc.), or ECS.
Experience with infrastructure-as-code tools like Terraform or Pulumi.
Experience contributing to full-stack applications built using software like React, Node.js, and MongoDB or PostgreSQL.

Benefits

Ownership in what you help build.
Health coverage that actually covers you.
Support for every stage of family life.
Time off that’s actually off.
Wellness for the whole you.
Invest in your future.
Monthly stipends that flex with your life.
Bonus for building together.

More DevOps Engineer Jobs

Salesforce Release Engineer

Volvo Cars

For a better future. We want to provide you with the freedom to move in a personal, sustainable, and safe way.

DevOps Engineer | 76 days ago

Full Time | Remote | Team 10,001+ | Since 1927 | H1B No Sponsor

• Manage end-to-end Salesforce release lifecycle across Dev, QA, UAT, and Production environments
• Configure and maintain CI/CD pipelines using Gearset
• Perform metadata comparisons, validations, and deployments using Gearset
• Troubleshoot deployment failures and resolve metadata dependencies, test failures, and conflicts
• Integrate and manage source control using GitHub (branching, pull requests, merges)
• Collaborate with developers, QA, and business stakeholders to coordinate releases
• Maintain deployment best practices, governance, and audit compliance
• Monitor deployments, backups, and rollback strategies using Gearset
• Drive improvements in release automation, quality, and speed

India

Senior DevOps Engineer – AWS, Azure

Jalasoft

We provide the best software engineering solutions by investing in our people first.

DevOps Engineer | 76 days ago

Full Time | Remote | Team 1,001-5,000 | Since 2003 | H1B No Sponsor

• Responsible for designing, implementing, and managing scalable cloud infrastructure
• Automating deployment processes and ensuring system reliability and security
• Collaborate with development and operations teams to streamline CI/CD pipelines and enhance operational efficiency

AWS, Azure, Cloud, ElasticSearch, Grafana, Java, JavaScript, Jenkins, Kubernetes, Microservices, MySQL, Node.js, NoSQL, PostgreSQL, Python, React, Redis, Terraform, Go, .NET

Colombia

Staff SRE Engineer

Stellar Cyber

Empowering lean security operations teams of any skill to successfully secure their environments. WE ARE HIRING!

DevOps Engineer | 76 days ago

Full Time | Remote | Team 51-200 | H1B Sponsor

Role Description

We are seeking a highly skilled Staff Site Reliability Engineer (SRE) to join our team and drive reliability, scalability, and efficiency across our production systems. The ideal candidate will have deep expertise in cloud infrastructure, Kubernetes administration, observability, and incident management, with a proven track record of building and maintaining highly available and resilient platforms. As a senior member of the SRE team, you will not only operate complex distributed systems but also influence architecture, tooling, and best practices to ensure operational excellence.

- Administer and maintain container orchestration platforms and containerized workloads.
- Monitor and troubleshoot production systems, participating in on-call rotations to ensure reliability.
- Drive observability improvements by enhancing monitoring, logging, and alerting capabilities across systems and data platforms.
- Administer and optimize cloud-based environments across multiple providers.
- Manage and support distributed data platforms and real-time processing systems.
- Develop and maintain continuous integration and delivery pipelines for efficient and reliable deployments.
- Own and implement Infrastructure as Code (IaC) practices to ensure consistency and scalability.
- Automate and orchestrate infrastructure using programming and scripting languages.
- Perform system administration and networking tasks to support internal and external environments.
- Collaborate effectively with engineers and stakeholders across different time zones.

Qualifications

- 5+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering roles.
- Proven success leading large-scale production systems in cloud environments (AWS, GCP, Azure, or OCI).
- Demonstrated leadership in driving incident response, on-call best practices, and reliability-focused culture.
- Strong experience with production on-call operations and incident management.
- Advanced proficiency in Kubernetes administration and troubleshooting.
- Hands-on experience with observability tools: Prometheus, Grafana, Loki, and Alertmanager.
- Knowledge of chat-based operations interfaces and/or auto-remediation controllers using an AI agentic framework.
- Understanding of AI agents for auto-triaging alerts, correlating signals, and suggesting root-cause hypotheses.
- Expertise in operating data platforms (Elasticsearch, MongoDB, Spark, Kafka, Redis).
- Proficiency with public cloud services (AWS, Azure, GCP, or OCI).
- Strong programming and automation skills in Python and Bash.
- Deep understanding of Infrastructure as Code (Terraform, Helm).
- Experience with CI/CD pipelines (GitHub Actions, Bitbucket, ArgoCD).
- Strong technical background in distributed systems, databases, networking, and Linux administration.
- Excellent problem-solving, communication, and leadership abilities.
- Bachelor's degree in Computer Science, Engineering, or a related technical field.
- Certifications in AWS, GCP, Observability, Linux or Kubernetes are a plus.

Hungary

AppSec, DevSecOps Engineer – Mid

Compass

DevOps Engineer | 76 days ago

Full Time | Remote | Team 10,001+ | H1B Sponsor

• Serve as a technical reference for AppSec and DevSecOps, embedding security into all stages of projects.
• Integrate security into the software development lifecycle (Secure SDLC / Shift Left).
• Design, standardize, and maintain secure, reusable, automated, and versioned CI/CD pipelines.
• Implement DevSecOps practices and controls in continuous delivery workflows.
• Conduct risk analyses, threat modeling, and security assessments for applications and architectures.
• Perform triage, analysis, and vulnerability management, supporting developers in fixing issues.
• Operate and manage SAST, DAST, SCA, container security, and Infrastructure as Code (IaC) tools.
• Perform security-focused code reviews, especially for .NET Core and Node.js applications.
• Work in Cloud environments, evaluating architectures and security controls.
• Ensure adherence to governance and compliance standards and frameworks such as ISO 27001, SOC 2, and PCI DSS.
• Create scripts and automations for security controls and SIEM/SOC integration.
• Promote evangelism, mentoring, and training of technology teams in secure development.

AWS, Azure, Cloud, Docker, Google Cloud Platform, JavaScript, Jenkins, Kubernetes, Node.js, SDLC, Terraform, .NET

Brazil
