We deliver better experiences for consumers and better results for your brand.
Principal DevOps Engineer
Location: United States
Posted: 41 days ago
Salary: $180K - $210K / year
Seniority: Lead
Job Description
Zeta Global
- Design, build, and operate production-grade CI/CD pipelines that let multiple developers on multiple teams deploy concurrently to production, multiple times daily, with zero-downtime guarantees.
- Implement and optimize advanced deployment strategies, including canary releases, blue/green deployments, rolling updates, incremental rollouts, and feature flag-gated releases via Statsig.
- Build self-service deployment tooling that empowers developers to own their release process while enforcing safety guardrails, automated rollback triggers, and automated compliance gates.
- Establish deployment observability with real-time canary analysis, automated health scoring, and progressive delivery metrics integrated with Grafana, Prometheus, and Honeycomb.
- Champion CI/CD workflows using GitLab CI/CD, Helm charts, and Terraform to ensure infrastructure and application deployments are version-controlled, auditable, and reproducible.
- Define and enforce SLOs/SLIs/SLAs across services, establishing error budgets that balance velocity with reliability.
- Lead incident response processes, including on-call rotations, runbook development, blameless postmortems, and incident command structure.
- Design and implement robust observability stacks leveraging Grafana, Prometheus, Loki, and Honeycomb for metrics, logging, tracing, and alerting at scale.
- Proactively identify and eliminate reliability risks through chaos engineering, load testing, capacity planning, and failure mode analysis.
- Reduce operational toil through automation, self-healing infrastructure patterns, and intelligent alerting to minimize mean time to detection (MTTD) and mean time to recovery (MTTR).
Job Requirements
- 10+ years of progressive experience in DevOps, SRE, Platform Engineering, or Infrastructure Engineering roles, with demonstrated impact at staff or principal level.
- Expert-level Kubernetes knowledge, including cluster administration, Helm chart authoring, custom controllers/operators, network policies, RBAC, and multi-cluster management on AWS EKS.
- Deep expertise in CI/CD pipeline architecture and advanced deployment strategies (canary, blue/green, progressive delivery, feature flag integration) at scale.
- Strong proficiency with Infrastructure as Code using Terraform, including module design, state management, and multi-environment orchestration.
- Expert knowledge of Docker containerization, including multi-stage builds, security hardening, image optimization, and container runtime management.
- Production experience with Apache Kafka, including cluster management, topic design, consumer group strategies, and operational monitoring for high-throughput streaming workloads.
- Strong networking fundamentals: DNS (Route 53, internal DNS), TCP/IP, routing, API Gateway, load balancing (ALB/NLB), service mesh, VPC peering, transit gateways, and network troubleshooting.
- Extensive AWS experience spanning EKS, EC2, SQS, DynamoDB, IAM, VPC, CloudWatch, and related services in production environments.
- Hands-on experience with observability platforms: Grafana (dashboards, alerting), Prometheus (metrics, PromQL), Loki (log aggregation), and Honeycomb (distributed tracing, BubbleUp analysis).
- Working familiarity with multiple language stacks including Node.js, React, Python, Java, and Ruby, sufficient to understand build systems, dependency management, and runtime characteristics.
- Experience operating within regulated environments, with practical knowledge of GDPR, CCPA, SOC 2, and compliance automation in MarTech or AdTech domains.
- Proven ability to influence engineering culture, drive adoption of new practices, and communicate complex technical strategies clearly to both technical and non-technical stakeholders.
- Demonstrated experience with GitLab CI/CD pipelines, including advanced pipeline features such as parent-child pipelines, dynamic environments, and security scanning integration.
Benefits
- Unlimited PTO
- Excellent medical, dental, and vision coverage
- Employee Equity
- Employee discounts, virtual wellness classes, pet insurance, and more



