Job Closed

This listing is no longer active.

Fable

Fable is a leading accessibility platform powered by people with disabilities.

Senior Site Reliability Engineer, SRE

DevOps Engineer | Full Time | Remote | Senior | Team 11-50 | H1B No Sponsor

Location

Canada

Posted

49 days ago

Salary

$130K - $150K / year

Seniority

Senior

5 yrs exp | English

AWS, Azure, Cloud, Google Cloud Platform, Grafana, Java, JavaScript, Node.js, Prometheus, Python, Terraform, Go

Job Description

• Design, build, and maintain reliable, scalable, and secure infrastructure for Fable’s product services
• Improve system observability, monitoring, and alerting to ensure high availability and fast incident response
• Contribute to and evolve SRE practices, including SLIs/SLOs, incident management, and postmortems
• Support and improve CI/CD pipelines and deployment processes
• Identify and reduce operational complexity across systems and tooling
• Work across infrastructure and application layers to diagnose and resolve reliability and performance issues, including making targeted improvements to application code when needed
• Support infrastructure and platform capabilities required for AI/ML-powered features, including scaling, performance, and reliability considerations
• Monitor and optimize infrastructure costs across cloud environments
• Contribute to capacity planning and cost forecasting for infrastructure and services
• Identify opportunities to improve performance and efficiency at the system level
• Evaluate and optimize the cost and performance of compute-intensive workloads (e.g., AI/ML services), ensuring efficient resource usage and scalability
• Work with third-party vendors and tools that support Fable’s infrastructure and operations
• Help evaluate, select, and manage tools and services to support platform reliability and scalability
• Support vendor-related troubleshooting and ongoing service improvements
• Partner with Engineering teams to improve reliability, performance, and operational readiness of new features
• Partner with application engineering teams to improve service architecture, performance, and observability, and help define best practices for building reliable, scalable systems
• Act as a point of support and escalation for production issues
• Collaborate across teams to manage dependencies and ensure smooth system operations
• Contribute to building strong SRE and operational practices across the organization
• Share knowledge through documentation, pairing, and technical discussions
• Help onboard and support more junior team members as the team grows
• Contribute to improving ways of working within the team and across Engineering

Job Requirements

5–8+ years of experience in Site Reliability Engineering, DevOps, Infrastructure Engineering, or Platform Engineering
Strong experience with cloud infrastructure (AWS, GCP, or Azure)
Experience building internal platforms, tooling, or shared services that improve developer productivity and system reliability
Experience designing systems that bridge infrastructure and application layers
Ability to work across the stack: comfortable reading, debugging, and making changes to application code (e.g., backend services, APIs) when needed to improve reliability, performance, or observability
Experience with at least one backend programming language (e.g., Node.js, Python, Go, Java)
Strong experience with monitoring, observability, and alerting tools (e.g., Datadog, Prometheus, Grafana)
Solid understanding of CI/CD systems and modern deployment practices
Experience managing infrastructure as code (e.g., Terraform, CloudFormation)
Experience optimizing system performance and infrastructure costs
Familiarity with security and compliance considerations in cloud environments
Experience working with third-party vendors and infrastructure tools
Familiarity with infrastructure considerations for AI/ML workloads (e.g., high-compute services, data pipelines, or third-party AI platforms) is a strong asset
Curiosity about emerging technologies and their impact on infrastructure, reliability, and cost at scale
Strong problem-solving skills and ability to navigate complex systems
Excellent collaboration and communication skills.

Benefits

Stock options
Career growth opportunities
Professional development support
Health and dental coverage


More DevOps Engineer Jobs

DEVSECOPS ARCHITECT - CLOUD – ENGLISH - REMOTE

IRIUM - Spain

DevOps Engineer | 49 days ago

Full Time | Remote | Team 51-200

🚀 At IRIUM we care that you never stop pursuing your dreams. Get ready to achieve your goals, and always remember to enjoy the journey. We are looking for a DEVSECOPS - CLOUD ARCHITECT with a very high level of English for a fully remote project in the energy sector, focused on solution design.

🔍 What are we looking for?

ESSENTIAL REQUIREMENTS:
- GitHub administration
- Solid hands-on experience with Amazon AWS, in particular services on EKS
- Kubernetes knowledge:
  - Deploying applications on Kubernetes
  - Developing processes in Argo CD
  - Designing and implementing Helm charts to deploy those complex applications
- Implementing Landing Zone models in Azure
- Experience building cloud models with Terraform on both AWS and Azure
- English: conversational level

DESIRABLE REQUIREMENTS:
- CI/CD MLOps, Ansible, Databricks... Jira administration, SonarQube, Linux and Windows administration, REST API administration, Docker, Ansible Tower, OpenShift

⭐ What do we offer?
• Workplace: REMOTE – RESIDENCE IN SPAIN REQUIRED
• Hours: 7:30 to 15:30 / 8:00 to 15:00 in summer
• On-call duty
• Occasional travel
• Permanent contract with IRIUM
• Flexible compensation ✌
• Salary band: based on skills and experience (30–38K)
• 23 days of vacation 🏕️
• Good working environment 🙍‍♀️🙍‍♂️
• Unlimited, open-access training in cutting-edge technology 📚
• Employee benefits club with direct discounts and thousands of offers on brands, hotels, travel agencies, cinemas, clothing… 💰

✨ You will join a great team of people who are always willing to help you. IRIUM is a company made up of curious, dynamic, and resourceful professionals. Our values are responsibility and commitment to work well done; this is the spirit we look for at IRIUM, whatever your age. If you recognize yourself in this, this is your company! We can build the future together. Shall we talk? 🟢🔵🟣

Spain

€30K - €38K / year

Senior DevOps Engineer

Prometeo Talent

Empowering startups to scale by connecting you with top 1% global talent. Since 2010. www.prometeotalent.com/

DevOps Engineer | 49 days ago

Full Time | Remote | Team 11-50 | Since 2010 | H1B No Sponsor

• Take ownership of scalable, secure, and highly available infrastructure across multi-cloud environments
• Design systems that are reliable by default, automated end-to-end, and trusted by engineering teams
• Manage full lifecycle of Kubernetes clusters
• Design and maintain robust CI/CD pipelines
• Build full-stack observability systems
• Implement security best practices and support compliance frameworks
• Apply FinOps practices for cost optimization
• Use AI tools to accelerate IaC, automation, and documentation

AWS, Cloud, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Terraform

Colombia

Job Closed

Senior DevOps/SRE

VExpenses

Hassle-free expense reimbursement!

DevOps Engineer | 49 days ago

Full Time | Remote | Team 51-200 | Since 2016 | H1B No Sponsor

• Design solutions following automation best practices and cloud computing principles, taking into account the context of a fast-growing fintech
• Diagnose, monitor, and document incidents to help build higher-performing solutions
• Fully automate the deployment of our applications, from code to production (Continuous Deployment)
• Provide rapid feedback on code changes at scale while maintaining high security and quality standards (Continuous Integration)
• Architect and implement new environments together with our Technology team
• Ensure quality and scalability for our platform

AWS, Cloud, Terraform

Brazil

Job Closed

SRE / DevOps Engineer

Verity Group

We are Human. We are Digital. We are Verity!

DevOps Engineer | 49 days ago

Full Time | Remote | Team 51-200 | Since 2010 | H1B No Sponsor

• Design, implement, and evolve CI/CD pipelines
• Provision and maintain infrastructure on GCP using Terraform and Ansible
• Operate and scale Kubernetes environments (GKE)
• Define, implement, and monitor SLIs, SLOs, and Error Budgets
• Build observability, alerts, and APM (Dynatrace experience is a plus)
• Work closely with squads, promoting platform engineering and reliability best practices

Ansible, Cloud, Docker, Google Cloud Platform, Jenkins, Kubernetes, Linux, Terraform

Brazil

Job Closed
