Job Closed

This listing is no longer active.

Sherweb

More than a cloud distributor

Principal Site Reliability Specialist, IT Operations

DevOps Engineer | Full Time | Remote | Lead | Team 1,001-5,000 | Since 1998 | H1B No Sponsor

Location

Canada

Posted

71 days ago

Salary

$102.0K - $145.7K / year

Seniority

Lead

Bachelor Degree | 10 yrs exp | English | Distributed Systems

Job Description

• Implement a proactive, resilient, and scalable approach to site reliability across all Sherweb platforms.
• Shape how reliability is designed, governed, and sustained across systems.
• Elevate reliability from reactive operations to an engineered discipline ensuring platforms operate predictably.
• Define and evolve reliability standards across platforms and services, including service level objectives (SLOs) and service level indicators (SLIs).
• Establish a shared reliability language and expectations across IT Operations Teams.
• Drive consistency in monitoring and operational practices across services, systems and platforms.
• Influence system and operational design to improve reliability, availability and resilience.
• Drive the reduction of operational toil through automation, AI, platform capabilities, and repeatable operational patterns.
• Improve end-to-end observability and system understanding.
• Enable teams to take end-to-end ownership of platform reliability.
• Partner closely with infrastructure and platform teams to ensure access, tooling, and visibility support full operational ownership.
• Act as a reliability advocate and technical advisor during operational reviews, incident learning, and platform evolution.
• Partner closely with DevOps teams to implement reliability and observability as code, ensuring integration with CI/CD pipelines and platform tooling.

Job Requirements

Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field, or equivalent practical experience.
10+ years of experience in Site Reliability Engineering, operating and improving large-scale production environments.
Demonstrated experience improving the reliability, availability, and scalability of production systems, platforms and services.
Hands-on experience operating distributed systems in business-critical and customer-facing environments.
Proven experience reducing manual operational work through automation and standardization.
Experience defining and applying reliability standards (e.g., SLOs, error budgets) across multiple services or platforms.
Demonstrated ability to influence technical direction across multiple teams without direct authority.
Strong understanding of distributed systems, failure modes, and operational resilience.
Solid experience with observability practices (metrics, logs, traces) and system diagnostics.
Ability to analyze complex systems end to end across infrastructure, platform, and application layers.
Strong systems thinking with a track record of addressing reliability issues through design rather than reactive intervention.
Experience acting as a trusted technical advisor to senior engineers and leaders.
Ability to clearly communicate complex reliability concepts to both technical and non-technical stakeholders.

Benefits

A fast-paced work environment that adapts to you
A friendly and diverse work culture with inclusion and equality at the heart of our actions
State-of-the-art technology and tools
A results-oriented culture where talent, action, and thinking outside the box are given due recognition
Annual salary review based on performance
Generous and caring colleagues of various professional and cultural backgrounds
A flexible total compensation offer
Vacation time that considers your previous experience
Advanced paid hours to recharge your batteries (holidays and mobile days)
Flexible benefits plan that adapts to your needs
Flexible savings fund option
A monthly home internet allowance
Considerable growth opportunities
A career path with opportunities to learn and grow
Proximity to your direct manager and open, honest communication to support your development
Multiple initial and on-the-job training opportunities and tools to track your progress and help you scale up in your career
"Sherweblife" - a rich calendar of activities that allow us to gather virtually and face-to-face throughout the year

More DevOps Engineer Jobs

Senior DevOps Engineer

Cority

Global enterprise EHS software provider empowering those who transform the way the world works.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 201-500 | Since 1988 | H1B No Sponsor

• Architect, develop and deliver fully automated multi-account global cloud infrastructure on AWS using repeatable methods and patterns
• Support continuous delivery pipelines for the deployment to all layers of our SDLC
• Conduct Infrastructure as Code reviews ensuring high programming standards
• Monitor and tune the performance of the infrastructure
• Identify and correct bottlenecks in the system, while working with Engineering on optimizations and best practices
• Support integration of log management and APM solution to provide continuous monitoring capabilities, track all aspects of the system, infrastructure, performance, application errors and roll up metrics
• Provide mentorship and training to other team members on technologies and processes, drive education and knowledge transfer of design patterns, technical practices, and relevant technologies and tools
• Troubleshoot and analyze the root cause of issues in the platform
• Fully participate in the ownership of your services and components, including on-call duties
• Support Data Center and Cloud Operations: System Provisioning, Monitoring/Alerting, Network Configuration, Administration of Linux, Windows, VMWare/vSphere and Cloud Computing

Ansible, AWS, Azure, Cloud, Distributed Systems, Docker, Jenkins, Linux, Python, SDLC, VMware

Canada

DevOps Manager

Cority

Global enterprise EHS software provider empowering those who transform the way the world works.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 201-500 | Since 1988 | H1B No Sponsor

• Take responsibility for service delivery and for the reliability and scalability of cloud and web systems
• Architect continuous integration and infrastructure-as-code environments
• Provide technical expertise for projects
• Build processes for consistent and repeatable CI/CD
• Mentor and guide the professional and technical development of team members

AWS, Cloud

Canada

Senior DevOps Engineer

Cority

Global enterprise EHS software provider empowering those who transform the way the world works.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 201-500 | Since 1988 | H1B No Sponsor

Ansible, AWS, Azure, Distributed Systems, Docker, Jenkins, Linux, Python, SDLC, VMware

Texas

Senior DevOps Engineer, OpenStack

Mirantis

Strategic open source infrastructure for containers and virtual machines.

DevOps Engineer | 71 days ago

Full Time | Remote | Team 501-1,000 | H1B Sponsor

• Help design, validate, and evolve a cloud-native OpenStack platform.
• Contribute to architectural decisions, support complex customer scenarios, and ensure stable product releases.
• Actively influence the MOSK roadmap and backlog by providing technical input on feature feasibility, architectural scalability, and long-term maintainability.
• Design, implement, and document lifecycle management for OpenStack on Kubernetes.
• Drive validation of enabled features by defining test scenarios and integrating feature support into MOSK’s lifecycle management.
• Lead design discussions, planning sessions, and retrospectives to uphold standards for software quality.
• Design, implement, and maintain core frameworks, libraries, and tools for MOSK development.
• Build and evolve automation for deploying and managing complex OpenStack-on-Kubernetes environments.
• Debug and resolve complex issues in customer production environments.
• Act as an authoritative technical source for MOSK and OpenStack behavior to ensure correctness of documentation.

Ansible, AWS, Azure, Chef, Docker, Firewalls, GCP, Groovy, Jenkins, Kubernetes, Linux, OpenStack, Puppet, Python, SaltStack, Terraform

Poland
