Senior Site Reliability Engineer
Location
United States
Posted
26 days ago
Salary
$141K - $227K / year
Seniority
Senior
Job Description
Senior Site Reliability Engineer
Juul Labs
Role Description
A Senior Site Reliability Engineer (SRE) is expected to own the operational stability and performance of Juul’s hybrid cloud infrastructure (Nutanix, AWS/GCP). This involves leading automation efforts, architecting for reliability, and acting as the final escalation point for critical incidents to ensure the platform is scalable and efficient.
Nutanix Platform Management
- Design, deploy, and maintain enterprise-scale Nutanix AHV clusters and Prism Central for multi-cluster management.
- Expert-level proficiency with Nutanix CLI (nCLI and acli) for advanced operations, troubleshooting, and automation.
- Develop automation scripts using Nutanix REST APIs, Python SDK, PowerShell, and Terraform for infrastructure-as-code.
- Create and manage VM templates, golden images, and standardized deployment catalogs for consistent provisioning.
- Design disaster recovery solutions using Leap, Protection Domains, cross-cluster replication, and metro clustering.
- Implement network micro-segmentation using Nutanix Flow and configure RBAC, encryption, and security hardening.
- Lead L3 troubleshooting using advanced diagnostics, log analysis (CVM, Genesis), NCC health checks, and cluster service resolution.
- Configure high availability, VM affinity rules, QoS policies, and optimize performance for mission-critical workloads.
- Manage AHV networking with OVS bridges, VLANs, bonds, LACP and implement resource reservations and workload balance.
- Design, deploy, and maintain hybrid cloud infrastructure across Nutanix HCI, AWS, and GCP platforms.
- Architect and implement multi-cloud solutions ensuring high availability, scalability, and disaster recovery.
Cloud Platform Engineering
- Architect and deploy enterprise-scale, highly available multi-cloud solutions across AWS and GCP with multi-region/multi-account strategies.
- Expert-level proficiency with AWS CLI, GCP CLI, SDK, boto3, and Python for advanced automation and infrastructure orchestration.
- Design AWS Organizations and GCP Organization hierarchies with consolidated billing, IAM policies, and centralized governance.
- Configure and manage AWS Systems Manager (SSM) including Session Manager, Run Command, State Manager, and Automation for centralized fleet operations.
- Implement centralized logging using CloudWatch/CloudTrail and GCP Cloud Logging with S3/Cloud Storage aggregation.
- Integrate AWS and GCP with Splunk using HEC, CloudWatch subscriptions, Pub/Sub, Dataflow, and cloud-specific add-ons for SIEM correlation.
- Design and deploy advanced load balancing solutions with AWS ALB/NLB/ELB and GCP Cloud Load Balancing including SSL termination and auto-scaling.
- Develop infrastructure-as-code using Terraform, CloudFormation, CDK for repeatable multi-cloud deployments and CI/CD pipelines.
- Configure AWS SSO, cross-account IAM roles, GCP Workload Identity, and federated access for centralized identity management.
- Design VPC architectures with AWS Transit Gateway/PrivateLink and GCP Shared VPC/VPC peering for hybrid connectivity.
- Manage containerized workloads using EKS, GKE, ECS, Cloud Run with service mesh, observability, and security best practices.
- Implement disaster recovery using AWS Backup, Cross-Region Replication, GCP snapshots, and multi-region failover strategies.
- Lead L3 troubleshooting using CloudWatch Insights, GCP Cloud Trace, VPC Flow Logs, X-Ray, and vendor support escalation.
- Perform cost optimization through Reserved Instances, Committed Use Discounts, rightsizing, and automated resource lifecycle management.
System Administration
- Administer and support Windows Server and Unix/Linux environments in production and non-production settings.
- Perform OS-level hardening, patch management, and security compliance across heterogeneous systems.
- Automate routine administrative tasks using PowerShell, Bash, Python, or similar scripting languages.
- Manage GitHub organization settings, user permissions, repository access controls, and monitor GitHub Actions workflows and repository health across multiple teams.
- Configure Splunk forwarders, heavy forwarders and other integrations for data ingestion from cloud and on-premises sources.
Qualifications
- 8-12+ years infrastructure experience with 8+ years in Nutanix HCI and enterprise cloud (AWS/GCP).
- Expert-level skills in Python, PowerShell, Bash scripting, infrastructure-as-code (Terraform/CloudFormation), and container orchestration (Kubernetes, EKS/GKE).
- Proven experience managing enterprise-scale environments, hybrid cloud migrations, disaster recovery, and L3 critical incident management.
- Strong networking knowledge (TCP/IP, VLANs, routing, VPN), security hardening, and compliance frameworks (ITIL).
- Strategic thinker with exceptional analytical and troubleshooting abilities for complex multi-layer infrastructure issues.
- Excellent communication skills to translate technical concepts to executives and non-technical stakeholders.
- Calm under pressure during critical outages with meticulous attention to security, compliance, and configuration management.
- Self-motivated continuous learner committed to staying current with evolving cloud technologies and automation opportunities.
- Available for on-call rotations with strong documentation skills and customer service orientation.
Requirements
- Certifications (plus): Nutanix NCP/NCAP, AWS Solutions Architect Professional, AWS DevOps Professional, GCP Professional Cloud Architect, Terraform.
Benefits
- People. Work with talented, committed and supportive teammates.
- Equity and performance bonuses. Every employee is a stakeholder in our success.
- Cell phone subsidy, commuter benefits and discounts on JUUL products.
- Excellent medical, dental and vision, disability, and life insurance, plus family support, wellness, legal, and employee assistance program benefits.
- 401(k) plan with company matching.
- Plus biannual discretionary performance bonuses.
More DevOps Engineer Jobs
• Ensure the reliability, performance and availability of connectivity between systems by operating, diagnosing and evolving networks in distributed, hybrid and cloud environments, with a focus on traffic analysis, advanced troubleshooting and communication architecture between services.
**Strategic AWS environment management:**
• Operate and evolve complex, reproducible environments with high availability, performance and horizontal scalability, using services such as EC2, ECS, Lambda, RDS and S3.
**Reliability and observability:**
• Define, implement and evolve end-to-end observability practices (SLIs, SLOs, SLAs) with tools like New Relic, CloudWatch and custom dashboards;
• Proactively identify bottlenecks and incidents.
**Connectivity and network performance:**
• Diagnose and resolve latency, packet loss and throughput issues in distributed environments;
• Troubleshoot DNS, VPNs, firewalls and load balancers (L4/L7);
• Analyze communication flows between services (cloud, on-premises and external integrations);
• Support the definition and evolution of connectivity architecture between systems.
**Automation and CI/CD:**
• Design, maintain and optimize robust and secure pipelines (Jenkins, Bitbucket, GitOps) for continuous delivery of microservices and serverless workloads.
**SRE culture and continuous improvement:**
• Promote blameless post-mortems, chaos engineering and automation of operational tasks to reduce toil and increase team efficiency.
**Security and governance (DevSecOps):**
• Integrate best practices for IAM, secure networking, encryption, traffic control, vulnerability monitoring and compliance into the application lifecycle, including connectivity troubleshooting and communication analysis between services.
**Documentation and knowledge sharing:**
• Create and maintain clear documentation on architecture, automations, incidents and runbooks, driving autonomy and improving onboarding.
Vice President of Engineering – DevOps Engineering
GitLab
Build software faster. The One DevOps Platform enables your entire org to collaborate around your code. We're hiring.
• Define the engineering strategy for your functional area, aligning roadmap, investments, and organizational planning with GitLab's company direction.
• Lead a large, globally distributed engineering organization across multiple product domains, supporting Directors, Senior Managers, Engineering Managers, and their teams.
• Drive operational excellence through clear engineering metrics, strong incident management practices, and a disciplined approach to reliability, quality, and developer productivity.
• Champion AI-native engineering practices across workflows, platforms, and products, including developer tooling, code review, and CI/CD.
• Partner with Product Management to create a consistent, high-velocity working model between engineering and product teams.
• Collaborate with Finance and executive peers on headcount planning, budgeting, forecasting, and investment tradeoffs.
• Work with Sales, Marketing, Alliances, Customer Success, and security leaders to address enterprise needs and support strategic initiatives.
• Represent your engineering organization in executive discussions, board-level conversations, and external forums through clear communication and transparent documentation.
Role Description
For our client, an innovative, research-driven pharmaceutical company based in Ingelheim, we are looking for a full-time Regulatory Affairs Manager, Regulatory Affairs Engineer, Device Development Engineer, Senior Engineer, Verification & Validation Engineer, Medical Device Development Engineer, or Scientist (m/f/d).
- Planning, executing, and coordinating design verifications
- Complying with regulatory requirements and obtaining approval for new products
- Further developing regulatory requirements and integrating them into the development processes
- Representation in the device development team
- Responsibility for all relevant design verification work packages
- Conducting design verification studies and preparing submission-relevant plans and reports
- Taking on parts of design control and coordinating interface work
Qualifications
- Completed bachelor's or master's degree in a life sciences discipline, e.g. medical engineering, biotechnology, pharmacy, chemistry, biology, molecular biology, cell biology, or engineering with a focus on biomedicine
- Several years of professional experience in medical device development or in a regulated environment, ideally with experience in design verification or in implementing regulatory requirements for combination products
- Ideally, experience in planning, conducting, and documenting design verifications of combination products
Benefits
- Permanent employment contract
- Special payments such as vacation and Christmas bonuses
- From 25 up to 30 days of vacation per year
- Supplements above the collective agreement rate, plus a bonus
- 37.5 hours/week full-time position
- Pay according to the GVP collective agreement and based on the collective agreement of the chemical industry in Baden-Württemberg
- Protective and work clothing provided free of charge
- Free occupational health check-ups
- Discounted access to sports, leisure, and wellness facilities
- Special terms for purchasing concert tickets, sporting goods, electronic devices, travel, and much more
- Capital-forming benefits (vermögenswirksame Leistungen) after the probationary period
• Develop, improve, and migrate continuous integration and deployment pipelines from Jenkins to GitLab CI/CD.
• Collaborate on migrating on-premises automations to cloud environments on AWS.
• Provide technical support to the internal team on automation architecture and processes.
• Implement and manage infrastructure as code (IaC) to ensure replicable and scalable environments.
• Administer Windows and Linux operating systems for automation and scripting environments.



