Job Closed

This listing is no longer active.

Provision IAM

Take charge of access management. Tools you need. Security you deserve.

Senior Site Reliability Engineer

DevOps Engineer, Other, Remote, Senior, Team 11-50, H1B No Sponsor

Location

Maryland

Posted

114 days ago

Salary

$115K - $140K / year

Seniority

Senior

Bachelor Degree, English, AWS, Azure, Flux, GCP, Grafana, Kubernetes, Linux, Prometheus, Python, Terraform, HashiCorp Vault

Job Description

• Own and execute infrastructure projects, including migrations, automation, and tooling improvements
• Manage and troubleshoot Kubernetes clusters across multiple environments
• Maintain and improve GitOps deployment pipelines
• Build and maintain CI/CD pipelines
• Manage Google Cloud Platform infrastructure (GKE, IAM, networking, storage)
• Implement and maintain secrets and configuration management systems
• Write and maintain automation (infrastructure as code, configuration management, scripting)
• Participate in an on-call rotation supporting production infrastructure as needed
• Communicate with internal teams and occasionally with clients when infrastructure matters impact delivery
• Collaborate with developers on deployment, reliability, and performance
• Use AI tools appropriately to enhance engineering productivity and workflow

Job Requirements

Authorized to work in the United States
Hands-on Kubernetes production experience
Experience with GitOps workflows (ArgoCD, Flux, or similar)
Strong cloud infrastructure experience (Google Cloud preferred; AWS/Azure transferable)
CI/CD pipeline design and maintenance (GitLab CI/CD or equivalent)
Infrastructure as Code (Terraform, OpenTofu, Pulumi, or similar)
Enterprise secrets management tools (HashiCorp Vault or equivalent)
Advanced Linux command-line and system administration
Monitoring and observability tools (Prometheus, Grafana, Datadog, etc.)
Understanding of SLIs/SLOs and incident response practices
Automation and scripting (Bash, Python, or similar)

Benefits

Company-paid health insurance (employee and family coverage)
Generous paid time off
SIMPLE IRA retirement plan (IRS-compliant eligibility and company participation)
Fully remote work environment
Meaningful technical ownership and growth opportunities

Related Categories

DevOps Engineer

Related Job Pages

DevOps Engineer Jobs in Maryland
Remote Python Jobs (US)
More Remote Jobs

More DevOps Engineer Jobs

Senior DevOps Engineer

NetVendor

We help property managers save time, reduce vendor risk, and optimize maintenance operations.

DevOps Engineer, posted 114 days ago

Other, Remote, Team 51-200, H1B No Sponsor

• You'll build and own the foundation that our engineering team ships on every day.
• Design, deploy, and manage AWS infrastructure: EC2, ECS/Fargate, RDS, DynamoDB, ElastiCache, S3, CloudFront, and more.
• Implement and evolve our Infrastructure as Code practices.
• Build and maintain CI/CD pipelines using GitHub Actions.
• Configure IAM roles, policies, and least-privilege access.
• Enforce tagging, cost controls, and guardrails across environments.
• Design for resilience: redundancy, backups, and multi-AZ or multi-region strategies where appropriate.
• Set up CloudWatch and Datadog metrics, dashboards, and monitors/alarms.
• Establish backup, recovery, and disaster recovery strategies.
• Work with the Head of Security to ensure the appropriate controls and tests (automated via Vanta) are in place to meet the goals of the security program.
• Architect and automate well-separated environments for Dev, QA/Test, Staging, and Production.

AWS, DynamoDB, Amazon EC2, Terraform

United States

DevOps Engineer

Thrill

The Future of Gaming

DevOps Engineer, posted 114 days ago

Full Time, Remote, Team 11-50, H1B No Sponsor

• Build and maintain production infrastructure in AWS.
• Manage Linux servers.
• Operate Kubernetes clusters.
• Administer and optimize PostgreSQL databases.
• Operate monitoring & observability.
• Be part of the on-call rotation for the infrastructure components.
• Ownership of the CI/CD process.
• Work on improving infrastructure and application security.
• Manage CloudFlare, WAF, and DDoS protection solutions to improve our stance in this area.

AWS, Kubernetes, Linux, PostgreSQL, Python

Europe

Customer Reliability Engineer

Supabase

Build in a weekend. Scale to millions.

DevOps Engineer, posted 114 days ago

Other, Remote, Team 51-200, Since 2020, H1B No Sponsor

• Apply SRE principles to Customer Success
• Detect issues commonly occurring in the platform
• Proactively find improvements in the platform
• Work on escalations and longer-running, more complex technical cases
• Assist those using the Supabase platform with complex and/or long-running issues
• Deliver on synchronous and asynchronous engagements with Supabase customers
• Serve as an internal champion for the platform and how customers use it.

JavaScript, MySQL, Node.js, PostgreSQL, Python, React, Svelte, TypeScript, Vue.js

United States

Site Reliability Engineer – AI & ML Infrastructure, Kubernetes, Terraform

Deepgram

Building foundational AI for speech transcription and understanding.

DevOps Engineer, posted 114 days ago

Other, Remote, Team 51-200, Since 2015, H1B Sponsor

• Architect and maintain our core computing platform using Kubernetes on AWS and on-premise, providing a stable, scalable environment for all applications and services.
• Develop and manage our entire infrastructure using Infrastructure-as-Code (IaC) principles with Terraform, ensuring our environments are reproducible, versioned, and automated.
• Design, build, and optimize our AI/ML job scheduling and orchestration systems, integrating Slurm with our Kubernetes clusters to efficiently manage GPU resources.
• Provision, manage, and maintain our on-premise bare metal server infrastructure for high-performance GPU computing.
• Implement and manage the platform's networking (CNI, service mesh) and storage (CSI, S3) solutions to support high-throughput, low-latency workloads across hybrid environments.
• Develop a comprehensive observability stack (monitoring, logging, tracing) to ensure platform health, and create automation for operational tasks, incident response, and performance tuning.
• Collaborate with AI researchers and ML engineers to understand their infrastructure needs and build the tools and workflows that accelerate their development cycle.
• Automate the life cycle of single-tenant, managed deployments.

AWS, Kubernetes, Python, Terraform

United States

$160K - $220K / year
