Job Closed

This listing is no longer active.

JUUL Labs

An electronic cigarette company, JUUL Labs is the creator of the JUUL e-cigarette, which uses nicotine salts found in leaf-based tobacco. Founded to improve the

Senior Site Reliability Engineer

Location

United States

Posted

141 days ago

Salary

$150K - $184K / year

Seniority

Senior

Bachelor Degree · 8 yrs exp · English · AWS · GCP · Kubernetes · Python · TCP/IP · Terraform

Job Description

  • A Senior Site Reliability Engineer (SRE) is expected to own the operational stability and performance of JUUL’s hybrid cloud infrastructure (Nutanix, AWS/GCP).
  • This involves leading automation efforts, architecting for reliability, and acting as the final escalation point for critical incidents to ensure the platform is scalable and efficient.
  • Design, deploy, and maintain enterprise-scale Nutanix AHV clusters and Prism Central for multi-cluster management.
  • Expert-level proficiency with Nutanix CLI (nCLI and acli) for advanced operations, troubleshooting, and automation.
  • Develop automation scripts using Nutanix REST APIs, Python SDK, PowerShell, and Terraform for infrastructure-as-code.
  • Design disaster recovery solutions using Leap, Protection Domains, cross-cluster replication, and metro clustering.
  • Lead L3 troubleshooting using advanced diagnostics, log analysis (CVM, Genesis), NCC health checks, and cluster service resolution.

Job Requirements

  • 8-12+ years of infrastructure experience, with 8+ years in Nutanix HCI and enterprise cloud (AWS/GCP)
  • Expert-level skills in Python, PowerShell, Bash scripting, infrastructure-as-code (Terraform/CloudFormation), and container orchestration (Kubernetes, EKS/GKE)
  • Proven experience managing enterprise-scale environments, hybrid cloud migrations, disaster recovery, and L3 critical incident management
  • Strong networking knowledge (TCP/IP, VLANs, routing, VPN), security hardening, and compliance frameworks (ITIL)
  • Strategic thinker with exceptional analytical and troubleshooting abilities for complex multi-layer infrastructure issues
  • Excellent communication skills to translate technical concepts to executives and non-technical stakeholders
  • Calm under pressure during critical outages with meticulous attention to security, compliance, and configuration management
  • Self-motivated continuous learner committed to staying current with evolving cloud technologies and automation opportunities
  • Available for on-call rotations with strong documentation skills and customer service orientation
  • Certifications (plus): Nutanix NCP/NCAP, AWS Solutions Architect Professional, AWS DevOps Professional, GCP Professional Cloud Architect, Terraform

Benefits

  • People. Work with talented, committed and supportive teammates
  • Equity and performance bonuses. Every employee is a stakeholder in our success
  • Cell phone subsidy, commuter benefits and discounts on JUUL products
  • Excellent medical, dental and vision, disability, and life insurance, plus family support, wellness, legal, and employee assistance program benefits
  • 401(k) plan with company matching
  • Plus biannual discretionary performance bonuses
