Job Closed

This listing is no longer active.

DevOps/SRE Team Lead

DevOps Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 1998 | H1B Sponsor

Location

United States

Posted

33 days ago

Salary

0

Seniority

Senior

Job Description

DevOps/SRE Team Lead

Telestream

  • Design, deploy, and administer production Kubernetes clusters, including workload scheduling, namespace management, RBAC, network policies, and cluster upgrades
  • Design and maintain continuous integration/deployment pipelines to automate testing and deployment, including Kubernetes-native delivery workflows using Helm and ArgoCD or equivalent
  • Track software performance, fix errors, troubleshoot systems, and implement preventative measures to ensure smooth workflows
  • Implement and manage infrastructure
  • Utilize Terraform or CloudFormation for IaC management
  • Optimize cloud resources by implementing cost-effective solutions
  • Collaborate with various teams to ensure smooth deployment
  • Monitor and create new processes based on performance analysis
  • Implement security best practices, including automated compliance checks and secure code deployment
  • Manage the technical roadmap and architecture while mentoring SRE and DevOps Engineers (player/coach)
  • Hire, coach, and manage a team of DevOps engineers and Site Reliability Engineers
  • Define DevOps/Platform roadmap aligned with business goals (e.g., cloud cost optimization, automation maturity)

Job Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent
  • 5-8+ years of experience in DevOps/SRE, with 2-3+ years in a leadership role
  • Hands-on experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or equivalent) with direct integration into Kubernetes deployment workflows
  • Production-level experience with infrastructure as code (Terraform required; CloudFormation or Pulumi a plus), including managing cloud-hosted Kubernetes clusters (EKS, GKE, or AKS)
  • Experience with monitoring, logging, and observability tooling in Kubernetes environments (Prometheus, Grafana, Datadog, ELK/EFK stack, or equivalent); ability to build dashboards and alerts from scratch, not just consume existing ones
  • Demonstrated, hands-on Kubernetes experience in production environments: cluster administration, Helm chart authoring and management, RBAC configuration, persistent storage, horizontal/vertical pod autoscaling, and diagnosing and resolving real production failures (CrashLoopBackOff, OOMKilled, networking issues, etc.)
  • Strong troubleshooting skills with the ability to diagnose infrastructure and application issues live, under pressure, without reference materials—this is evaluated directly in our interview process
  • Proficiency in scripting languages (Python, Go, Bash, or PowerShell); ability to write and own automation scripts, not just modify existing ones

Benefits

  • Day-one medical, dental & vision coverage
  • 100% company-paid life + disability insurance
  • 401(k) with a sweet company match (up to 8%)
  • Quarterly HSA boosts & flexible spending accounts
  • Flexible time off (salaried) or PTO (hourly) + generous paid holidays
  • Pet insurance (yes, your dog gets benefits too)
  • Legal plan + extras like accident & critical illness coverage

Full Time | Remote | Team 501-1,000 | Since 1998 | H1B Sponsor

Role Description

We are seeking a DevOps/SRE Team Lead with proven, hands-on Kubernetes expertise to drive the reliability and scalability of our video processing infrastructure and oversee a small team of SREs and DevOps Engineers. This is a deeply technical lead role, requiring real-world experience administering production Kubernetes clusters, not theoretical familiarity. You will own CI/CD pipelines, infrastructure automation, and cloud platform operations in a fully remote environment where independent execution is essential.

You will spend 70-80% of your day being hands-on in the following areas:

  • Design, deploy, and administer production Kubernetes clusters, including workload scheduling, namespace management, RBAC, network policies, and cluster upgrades
  • Design and maintain continuous integration/deployment pipelines to automate testing and deployment, including Kubernetes-native delivery workflows using Helm and ArgoCD or equivalent
  • Track software performance, fix errors, troubleshoot systems, and implement preventative measures to ensure smooth workflows
  • Implement and manage infrastructure
  • Utilize Terraform or CloudFormation for IaC management
  • Optimize cloud resources by implementing cost-effective solutions
  • Collaborate with various teams to ensure smooth deployment
  • Monitor and create new processes based on performance analysis
  • Implement security best practices, including automated compliance checks and secure code deployment

You will spend 20-30% of your time managing the following areas:

  • Manage the technical roadmap and architecture while mentoring SRE and DevOps Engineers (player/coach)
  • Hire, coach, and manage a team of DevOps engineers and Site Reliability Engineers
  • Strong communication, conflict resolution, and the ability to influence without authority
  • Define DevOps/Platform roadmap aligned with business goals (e.g., cloud cost optimization, automation maturity)
  • Excellent communication and collaboration skills
