Job Closed

This listing is no longer active.

Work Truck Solutions

Helping dealers sell more Work Trucks.

Cloud Operations Engineer

DevOps Engineer, Other, Remote, Senior, Team 51-200, H1B No Sponsor

Location

California + 2 more

Posted

117 days ago

Salary

$110K - $150K / year

Seniority

Senior

Bachelor Degree, English, AWS, Azure, GCP, Jenkins

Job Description

• Oversee all cloud infrastructure and resources, including provisioning, performing regular patch management, and proactive capacity planning
• Establish comprehensive system observability and maintain alerting infrastructure; serve as the escalation point for major incidents, drive resolution, and champion thorough Root Cause Analysis (RCA)
• Define and maintain a robust security posture by enforcing Identity & Access Management (IAM), completing security audits, ensuring data encryption, and managing audit logs for regulatory compliance
• Actively track cloud spend against budgets, direct the team in performing right-sizing and waste elimination, and optimize rates through reserved instances and savings plans (FinOps strategy)
• Direct the implementation and regular testing of comprehensive disaster recovery and business continuity plans, including backup management and maintaining a High Availability (HA) architecture across multiple zones

Job Requirements

Proven experience managing infrastructure on major cloud platforms (AWS, Azure, or GCP)
Strong understanding of network security, IAM, and compliance frameworks
Demonstrated ability to reduce cloud costs through FinOps principles
Experience in designing and testing Disaster Recovery and High Availability architectures
Proficiency in scripting languages for operational automation
Familiarity with tools like CloudWatch, Datadog, Jenkins, or similar systems
A focus on system availability as the primary metric (target uptime: 99.99%)

Benefits

Competitive salary
Fully remote Monday-Friday work week
Comprehensive medical, dental, and 401k benefits, with complimentary life insurance
Paid Time Off (PTO) and holidays
Flexible scheduling, subject to manager’s approval
Opportunity to work with a supportive and innovative team

More DevOps Engineer Jobs

Staff Database Reliability Engineer

AIRINC

LISTEN | PARTNER | DELIVER

DevOps Engineer, 117 days ago

Other, Remote, Team 51-200, H1B No Sponsor

• Own the health, performance, and availability of Air's PostgreSQL Aurora infrastructure.
• Proactively optimize database parameters, indexes, and query patterns to maintain sub-100ms p95 response times.
• Uplevel migration practices and tooling to ensure zero-downtime schema changes as the platform scales.
• Establish and maintain comprehensive backup, recovery, and disaster recovery procedures with documented RTO/RPO targets.
• Partner with backend engineers to implement database best practices in application code (connection pooling, query optimization, caching strategies).
• Develop multi-quarter roadmap to scale Air's database infrastructure to support 10x growth in asset volume and user activity.
• Collaborate with backend engineers and product leadership to model data growth patterns and anticipate scaling inflection points.
• Evaluate and implement horizontal scaling strategies (read replicas, sharding, partitioning) aligned with business needs.
• Continuously assess AWS Aurora capabilities, PostgreSQL ecosystem innovations, and emerging database technologies for strategic advantage.
• Design and implement database architecture that supports Air's AI-powered features and real-time creative workflows.
• Create comprehensive monitoring, alerting, and reporting systems to maintain database reliability and inform data-driven infrastructure decisions.
• Implement detailed instrumentation for database performance metrics (query latency, connection pool utilization, replication lag, disk I/O).
• Build automated alerting for anomalies in query performance, connection patterns, and resource utilization.
• Create executive-level dashboards showing database health trends, capacity utilization, and cost efficiency.
• Develop regular database health review cadence with engineering leadership to surface insights and drive continuous improvement.

AWS PostgreSQL

United States

$160K - $240K / year

Job Closed

Staff Site Reliability Engineer

ScalePad

DevOps Engineer, 117 days ago

Full Time, Remote, Team 201-500, H1B No Sponsor

• Own production infrastructure across AWS and Azure, including networking, IAM, and cost.
• Build and operate Terraform modules and state at scale, keeping our infrastructure as code clean and reviewable.
• Run Kubernetes in production: upgrades, scaling, troubleshooting, and platform improvements.
• Operate and improve CI/CD pipelines that the entire engineering org depends on.
• Operationalize SLO/SLI frameworks and observability practices alongside the SRE team.
• Own incident response practice, on-call tooling, and incident review follow-through.
• Reduce operational toil through automation across secret rotation, access management, and environment provisioning.
• Execute on capacity planning, disaster recovery, and resilience work across critical systems.
• Build and maintain internal developer tooling that removes friction across engineering.
• Lead rollouts of AI-native tooling for code review, testing, and engineering productivity, e.g., CodeRabbit, Copilot-class assistants, and internal AI workflows.
• Own migrations and consolidation of internal platforms such as Jira, Confluence, ticketing, and documentation systems.
• Partner with engineering and product leadership to identify and remove the biggest DX bottlenecks, and align infrastructure and reliability investments with business goals.
• Mentor engineers and technical leads, fostering growth and knowledge-sharing within the organization.
• Lead post-mortems and continuous improvement initiatives to strengthen reliability practices.
• Evaluate and introduce new technologies, tools, and approaches to improve scalability and efficiency.
• Drive standardization and modernization efforts across infrastructure and operational practices.
• Lead proof-of-concept and experimentation initiatives to validate new reliability solutions.

AWS Azure Cloud Distributed Systems Kubernetes Terraform

Canada

$150K - $175K / year

Azure DevOps Engineer

Uvation

DevOps Engineer, 117 days ago

Part Time, Remote, Team 11-50, H1B No Sponsor

• Design, implement, and manage Azure infrastructure
• Automate cloud deployments and manage resources
• Create, maintain, and enhance CI/CD pipelines
• Manage and maintain Linux servers
• Implement and enforce security best practices

Azure Linux

United Arab Emirates

Job Closed

Principal Site Reliability Engineering Lead

StarCompliance

We are Reputation Guardians, on a mission to make compliance simple and easy.

DevOps Engineer, 117 days ago

Other, Remote, Team 201-500, H1B No Sponsor

• Act as a senior custodian of the production promotion process across the software platform estate.
• Work closely with Technical Leads and QA to define and evolve promotion practices that emphasise quality, performance, and operational readiness.
• Define and evolve observability standards across metrics, logging, tracing, and alerting.
• Ensure systems are instrumented to support rapid diagnosis, learning, and recovery.
• Drive continuous improvement in platform reliability, performance, and release confidence.
• Partner with engineering, architecture, and platform teams to embed operability and resilience into system design.
• Lead and participate in on-call and rota-based operational support for production systems.
• Coordinate and continuously improve incident management practices, including post-incident reviews and preventative actions.
• Act as a senior technical authority for production readiness, operational risk, and release confidence.
• Mentor SREs and senior engineers, raising reliability and operational standards across teams.
• Influence architectural and platform decisions with a strong operational and delivery lens while remaining hands-on.

New York
