Job Closed

This listing is no longer active.

AlphaHire

Accelerating growth for Health-focused companies across North America.

Senior DevOps, Infrastructure Engineer

DevOps Engineer | Other Remote | Senior | Team 51-200 | Since 2020 | H1B No Sponsor

Location

United States

Posted

105 days ago

Salary

Seniority

Senior

English | AWS | Azure | Distributed Systems | Docker | GCP | Grafana | Kubernetes | Prometheus | Python | Terraform

Job Description

• Architect and operate multi-region, multi-cloud deployments across AWS, GCP, or Azure
• Design and maintain high-throughput telemetry ingestion pipelines
• Build event-driven architectures supporting real-time observability
• Implement autoscaling, failover strategies, and fault-tolerant system design
• Own production observability using Prometheus, Grafana, distributed tracing, and alerting frameworks
• Define and manage Production SLOs, incident response, and reliability engineering practices
• Develop and maintain CI/CD pipelines, GitOps workflows, and deployment automation
• Collaborate with backend engineering on API performance and infrastructure reliability
• Harden infrastructure for security, compliance, and tenant isolation
• Drive the long-term infrastructure roadmap and architectural direction
• Manage Infrastructure-as-Code (Terraform or similar) and full environment lifecycle

Job Requirements

Deep expertise in Kubernetes, Docker, and container orchestration
Strong background in distributed systems and multi-region architectures
Experience with high-ingest, streaming, or event-driven systems
Hands-on experience with Prometheus, Grafana, and tracing/alerting frameworks
Proficiency with Terraform or similar Infrastructure-as-Code tools
Experience building and maintaining CI/CD pipelines
Strong working knowledge of AWS, GCP, or Azure
Proficiency in Python or Go for automation and tooling
Experience operating high-availability, production-critical systems

Benefits

Expense reimbursement
Professional training and certification support
Advancement and leadership growth opportunities
Meaningful equity participation
Significant ownership over core infrastructure decisions


More DevOps Engineer Jobs

Principal AI Operations Engineer

Microsoft

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to any characteristic protected by applicable local laws, regulations, and ordinances.

DevOps Engineer | 105 days ago

Other Remote | Team 10,001+ | H1B Sponsor

We are seeking a Principal AI Operations Engineer to define the technical direction for the AI Operations group. In this role, you will:
• Design and architect operational systems
• Establish standards for branch health, CI/CD pipelines, production deployments, and on-call processes
• Drive reliability initiatives and maintain production health and uptime
• Ensure the platform meets its SLOs
• Be the escalation point for complex incidents
• Work closely with the Platform team to ensure services are operationally ready

United States

Job Closed

Senior Systems Administrator

Arizona Department of Administration

The Attorney General's Office offers a comprehensive benefits package. For a complete list of benefits provided by The State of Arizona, please visit our benefits page.

DevOps Engineer | 105 days ago

Other Remote | Team 1,001-5,000

The Senior Systems Administrator is responsible for ensuring the secure, reliable, and efficient operation of enterprise IT systems, networks, and infrastructure. This role proactively solves complex technical issues, implements and maintains security controls, and supports agency operations through system administration, network management, and customer-focused technical support. Own day-to-day technical decision making, escalated incident response, and enforcement of technical standards across systems and infrastructure. Collaborate closely with business partners to protect data, mitigate security risks, and support ongoing technology improvements. Work with minimal supervision, contribute to IT planning and resource management, and help drive operational efficiency through automation, documentation, and continuous improvement initiatives. Configure, monitor, and troubleshoot servers, workstations, networks, cloud platforms, and enterprise applications. Administer user access and security controls. Respond to incidents and suspicious activity. Support IT projects, upgrades, and migrations.

United States

Job Closed

Senior Site Reliability Engineer

Vanco

We serve those who enrich our communities.

DevOps Engineer | 105 days ago

Other Remote | Team 51-200 | H1B Sponsor

• Work collaboratively with software and systems engineering to deploy and manage systems within AWS Cloud.
• Lead the automation and streamlining of operations and processes.
• Design, build, set up, and maintain tools for deployment, monitoring, and infrastructure provisioning.
• Administer all systems related to R&D projects, including user creation, system provisioning, troubleshooting, monitoring, etc.
• Create the vision and design the automation strategy across the platform.
• Troubleshoot site-down issues and respond to emergency outages.
• Scale infrastructure to meet demand and continuously monitor and improve the quality of infrastructure.
• Participate in on-call rotation as needed.

AWS | DNS | Amazon EC2 | Firewalls | Kubernetes | Linux | Python | Ruby | Terraform | Unix

Alabama + 36 more

$85K - $120K / year

Job Closed

Site Reliability Engineer – AI Infrastructure

Andromeda

Where technology meets empathy – pioneering the future of human-robot interaction.

DevOps Engineer | 105 days ago

Other Remote | Team 11-50 | H1B Sponsor

• Provision, configure, and operate Kubernetes-based clusters for customers across multiple providers
• Build automation and tooling to streamline cluster deployments and integrations
• Debug customer issues across networking, storage, scheduling, and system layers
• Improve reliability and scalability of both training and inference infrastructure
• Design and implement monitoring, alerting, and observability for critical systems
• Collaborate with engineering and product teams to plan and deliver infrastructure for new services
• Participate in on-call and incident response, leading postmortems and reliability improvements

Ansible | Grafana | Kubernetes | Linux | Prometheus | Python | Terraform

California
