Job Closed
This listing is no longer active.
Email, push notifications, text messages, in-app messages, webhooks: automated and powered by your data.
Site Reliability Engineering Manager
Location
United States
Posted
126 days ago
Salary
$175K - $195K / year
Seniority
Lead
Job Description
Customer.io
- Lead effective squad rituals and ensure production readiness through high-quality peer review, QA, documentation, deployment, logging, and monitoring practices
- Partner with engineers to ensure solutions are scalable, architecturally sound, flexible, and secure
- Create accountability for delivery timelines while fostering an inclusive, collaborative, and empathetic work environment
- Provide timely, specific coaching and development opportunities for your direct reports
- Hire, onboard, and grow the right people to accomplish business objectives within your squad
- Build deep understanding of Customer.io’s vision, products, and customers to drive meaningful engineering investigations and decisions
- Collaborate with other Engineering Managers and technical leaders to align on strategy and execution
Job Requirements
- 8+ years of engineering management experience, with at least part of that leading SRE or infrastructure teams in SaaS (B2B or B2C), ideally at early-to-mid-stage companies
- 3+ years of hands-on SRE experience, designing and operating reliable, scalable infrastructure
- Understands SaaS architecture, languages, technologies, and cloud infrastructure deeply enough to represent and advocate for their squad’s technical choices across the company
- Balances pragmatism and vision—capable of delivering near-term improvements while charting a long-term path forward
- Invests in technical depth: reviewing proposals, experimenting with new technologies, and leveling up engineers through feedback and mentorship
- Builds and nurtures high-performing, distributed teams
- Stays energized by solving customer-impacting problems, even under pressure
- Communicates clearly and directly, both verbally and in writing
Benefits
- 100% coverage of medical, dental, vision, mental health, and supplemental insurance premiums for you and your family
- 16 weeks paid parental leave
- Unlimited PTO
- Stipends for remote work and wellness
- A professional development budget
More DevOps Engineer Jobs
- Grow as the Subject Matter Expert (SME) for security best practices
- Promote a culture of security, automation, and continuous improvement
- Integrate and manage security controls and best practices
- Manage DAST, IAST, and SAST tools to identify and remediate vulnerabilities
- Automate security testing and compliance checks
- Develop and enforce policy as code for Kubernetes environments
- Implement and manage infrastructure as code (IaC) solutions
- Collaborate with development, operations, and security teams to address vulnerabilities
- Continuously evaluate and improve DevSecOps tools, processes, and standards
- Participate in technology-related projects such as assisting in moving repositories and workflows into GitHub cloud to set new standards for version control.
- Build and troubleshoot automation scripts using GitHub Actions and Jenkins (Groovy/Jenkinsfile).
- Help teams migrate applications from traditional servers to Google Kubernetes Engine (GKE) and Cloud Run.
- Use Terraform to provision and manage cloud resources programmatically.
- Collaborate with technical resources to perform root cause analysis and complete remediation of technology issues, specifically identifying and fixing security vulnerabilities (CVEs) within deployment pipelines and infrastructure.
- Collaborate on driving improvement activities by creating tools that help developers deploy their own code without manual intervention.
- Translate functional requirements to technical specifications to build Tableau or Cloud Native dashboards for monitoring event metrics and system health.
- Research and data gathering for key initiatives related to Cloud Native application architecture and automation methodologies.
DevOps, Cloud & Infrastructure Engineer
EnterpriseAlumni
Corporate Alumni Engagement & Management Platform For The Enterprise
- Design, build, and maintain infrastructure on AWS (ECS, EKS, RDS)
- Manage and scale Kubernetes clusters (EKS) using Helm
- Develop and maintain infrastructure as code using Terraform / Terragrunt
- Improve and maintain CI/CD pipelines (Jenkins)
- Automate operational tasks using Bash and Python
- Work with Docker to build and optimize containerized workloads
- Implement and maintain observability solutions (Prometheus, Grafana, OpenSearch)
- Ensure system reliability, scalability, and security (Linux hardening, OS-level tuning)
- Troubleshoot production issues across infrastructure, networking, and applications
- Collaborate with engineering teams and participate in architectural decisions
Site Reliability Engineer – SkillBridge Intern
Zscaler
We make it easy to secure your cloud transformation. Get fast, secure, and direct access to apps without appliances.
- Manage operational tasks for products in US Government classified environments, including deployments, on-call duties, and incident management.
- Develop scripts, containerized services, and monitoring mechanisms to automate operations tasks and ensure minimal service disruption.
- Create operations documentation and implement measures to prevent recurring incidents while contributing to DevOps best practices.
- Build and enhance Zscaler services within classified environments, ensuring 24x7 coverage including night and holiday shifts.




