Staff DevOps Engineer
Location
United States
Posted
49 days ago
Salary
$190K - $235K / year
Seniority
Lead
Job Description
Cast & Crew
About Us
At Cast & Crew, we’ve empowered creativity and supported the global entertainment industry for decades. Together with our family of brands (Backstage, CAPS, Checks & Balances, Final Draft, Media Services, Sargent-Disc, and The TEAM Companies), we operate as a combined entertainment technology and services provider offering industry-standard screenwriting and accounting software, digital payroll products, data & reporting, and a host of creative tools. The industry continues to move faster than ever, and the need for our expertise, our technology, and our people has never been greater. We are a production’s best ally every step of the way.
#OneCastOneCrew
We are looking for a Staff DevOps Engineer to serve as a technical anchor for our platform engineering practice. In this role you will own the design and evolution of our CI/CD pipelines, Kubernetes infrastructure on AWS EKS, and the developer experience tooling that hundreds of engineers depend on daily. Staff-level engineers at this organization are expected to operate with significant autonomy, identify and resolve systemic problems before they become incidents, and raise the technical bar across the teams they partner with.
What You’ll Do
Platform & Infrastructure
- Architect and continuously improve CI/CD pipelines in Azure DevOps, including pipeline-as-code standards, templating strategies, and artifact promotion workflows across environments.
- Own the health and evolution of our AWS EKS clusters: node lifecycle, autoscaling, networking (VPC/CNI), RBAC, and cluster upgrades with minimal service disruption.
- Design and enforce Infrastructure-as-Code practices using Terraform or equivalent tooling; champion GitOps patterns across engineering teams.
- Drive platform reliability improvements informed by observability data from New Relic, working closely with SRE to translate dashboards and alerts into actionable platform changes.
Developer Experience
- Define and maintain golden-path templates for containerized workloads: Dockerfile standards, Helm chart libraries, and local development parity with production.
- Partner with engineering teams to accelerate onboarding of new services onto the platform and reduce toil through automation.
Incident & Operational Excellence
- Act as an escalation point for complex infrastructure incidents coordinated through PagerDuty; participate in on-call rotation and lead post-incident reviews for platform-layer failures.
- Identify recurring failure modes and drive systemic fixes that reduce page volume and MTTR across the platform.
- Maintain and improve runbooks and platform documentation in Confluence, ensuring knowledge is accessible and current.
Technical Leadership
- Define and socialize DevOps standards (pipeline design, container hygiene, secret management, and deployment safety) across a multi-team engineering organization.
- Conduct architecture reviews and provide technical guidance on infrastructure-impacting decisions made by product engineering teams.
- Mentor senior and mid-level engineers; grow internal platform capability through pairing, code review, and structured knowledge sharing.
- Identify tooling gaps and build the business case for platform investments, working with engineering leadership to prioritize roadmap items.
What You Bring
- 8+ years of DevOps or platform engineering experience, with at least 2 years operating at a Staff or Principal level in an organization of 100+ engineers.
- Deep, hands-on expertise with Kubernetes (EKS specifically preferred), including troubleshooting workloads, networking, storage, and cluster operations at scale.
- Strong command of Azure DevOps Pipelines, including YAML pipeline authoring, library management, service connections, and environment promotion gates.
- Proven track record designing and maintaining CI/CD systems for microservice architectures with multiple independent teams as consumers.
- Experience operating observability platforms (New Relic, Datadog, or similar) to drive proactive reliability improvements, not just reactive alerting.
- Proficiency in at least one scripting language (Python, Bash, or Go) and Infrastructure-as-Code tooling (Terraform, Pulumi, or CDK).
- Familiarity with feature flag patterns and operational considerations around progressive delivery (Unleash or equivalent is a plus).
- Excellent written communication skills: you default to documentation and can translate complex infrastructure decisions into guidance engineers actually read.
Nice to Have
- Experience with data engineering or ML infrastructure workloads on Kubernetes (Spark on EKS, Argo Workflows, Airflow).
- Background contributing to or maintaining internal developer portals (Backstage or similar).
- Familiarity with FinOps practices and tooling for AWS cost attribution and optimization across shared Kubernetes clusters.
- Experience in SRE-adjacent roles; comfort with SLO/SLI definition and error budget policy.
Benefits
Cast & Crew provides a comprehensive package of employee benefits including: Medical, Dental, Vision, PTO, health and wellness programs, employee discounts, and more! Note: Cast & Crew benefits are subject to eligibility requirements.
Cast & Crew is an equal opportunity employer committed to hiring a diverse workforce and sustaining an inclusive culture. It is our policy to provide equal employment opportunities to all individuals based on job-related qualifications and ability to perform a job, without regard to age, gender, gender identity, sexual orientation, race, color, religion, creed, national origin, disability, genetic information, veteran status, citizenship or marital status, and to maintain a non-discriminatory environment free from intimidation, harassment or bias based upon these grounds.
CA residents: Your personal information may be collected in connection with certain services provided by Cast & Crew or its affiliated companies. A summary of your California privacy rights can be found at: https://www.castandcrew.com/privacy-policy/
Compensation is commensurate with various factors including, but not limited to, relevant experience, qualifications, skills, training, licensure, certifications, geographic cost of labor, and other business and organizational needs. Compensation range for candidates in other locations may differ based on the cost of labor in that location. The compensation range for this position is: $190,000.00 - $235,000.00 per year.
More DevOps Engineer Jobs
We are tech transformation specialists, uniting human expertise with AI to create scalable tech solutions. With over 8,000 CI&Ters around the world, we’ve built partnerships with more than 1,000 clients during our 30 years of history. Artificial Intelligence is our reality.
How you’ll make an impact:
Are you an experienced Senior/Specialist DevOps Engineer with a passion for platform architecture and solution design? Do you thrive in environments where you can drive innovation and shape the future of infrastructure? Join our team and play a key role in designing, building, and optimizing cutting-edge solutions that enhance our platform’s scalability, reliability, and efficiency. As a Senior DevOps Engineer, you'll work closely with cross-functional teams to deliver robust infrastructure designs, perform PoCs, and ensure our DevOps practices are aligned with the latest industry trends.
What will you be doing?
- Collaborate with software architects and development teams to define infrastructure requirements and design comprehensive platform solutions.
- Lead the design, implementation, and optimization of CI/CD pipelines to streamline software development, testing, and deployment processes.
- Architect and manage Infrastructure as Code (IaC) using tools such as Terraform or CloudFormation, enabling scalable and reproducible infrastructure management.
- Conduct PoCs to evaluate new tools, technologies, and methodologies, assessing their potential impact on the platform and operations.
- Monitor and enhance the performance, reliability, and scalability of systems, ensuring high availability across production and development environments.
- Troubleshoot and resolve complex issues across infrastructure, deployments, and applications, implementing robust solutions to improve system stability.
- Integrate security best practices into the architecture and deployment processes, ensuring compliance with industry standards and regulations.
- Mentor team members on advanced DevOps practices and contribute to establishing a culture of continuous improvement and operational excellence.
Requirements:
- Fluency in English for daily communication with our client (please submit your resume in English).
- Extensive experience as a DevOps Engineer with a focus on platform architecture and designing scalable infrastructure solutions.
- Proficiency in building and optimizing CI/CD pipelines using tools like Jenkins, Azure DevOps, or CircleCI, with an emphasis on automation and efficiency.
- Strong scripting and automation skills (Python, Bash, or similar), with the ability to create scalable solutions and streamline operations.
- Hands-on experience with containerization and orchestration tools (e.g., Docker, Kubernetes), including production-grade deployments.
- Deep knowledge of cloud platforms such as AWS, Azure, or Google Cloud, with expertise in infrastructure provisioning and management.
- Strong understanding of Infrastructure as Code (IaC) principles and experience with relevant tools (Terraform, CloudFormation).
- Experience in performing PoCs and assessing new tools and technologies to enhance infrastructure and operations.
- Security-focused mindset with a track record of implementing best practices for securing cloud-based and on-premise environments.
- Excellent communication skills, with the ability to clearly articulate technical concepts and collaborate effectively across teams.
You will stand out if you have:
- Relevant certifications such as Azure Certified DevOps Engineer, Certified Kubernetes Administrator (CKA), or other cloud and DevOps-related certifications.
- Previous experience in roles involving software development or Site Reliability Engineering (SRE).
- Proven experience with Azure, including designing and managing infrastructure, implementing DevOps practices, and leveraging Azure-specific services.
- A history of designing or contributing to platform-level architecture and infrastructure strategies.
If you like it, just apply and good luck!
Our benefits:
- Health and dental insurance
- Meal and food allowance
- Childcare assistance
- Extended paternity leave
- Partnership with gyms and health and wellness professionals via Wellhub (Gympass) and TotalPass
- Profit Sharing and Results Participation (PLR)
- Life insurance
- Continuous learning platform (CI&T University)
- Discount club
- Free online platform dedicated to physical, mental, and overall well-being
- Pregnancy and responsible parenting course
- Partnerships with online learning platforms
- Language learning platform
And many more! More details about our benefits here: https://ciandt.com/br/pt-br/carreiras
At CI&T, inclusion starts at the first contact. If you are a person with a disability, it is important to present your assessment during the selection process. See which data needs to be included in the report by clicking here. This way, we can ensure the support and accommodations that you deserve. If you do not yet have the assessment, don't worry: we can support you in obtaining it. We have a dedicated Health and Well-being team, inclusion specialists, and affinity groups who will be with you at every stage. Count on us to make this journey side by side.
• Design and operate a fleet of Kubernetes (EKS) clusters across production, staging, and ephemeral environments, ensuring reliability and high availability
• Evolve AWS infrastructure and network architecture (VPCs, subnets, IAM, account structure) to support scalable, multi-team workloads
• Build and maintain infrastructure-as-code and GitOps workflows using tools such as Terraform, CDK, and ArgoCD
• Improve platform reliability and performance by defining and driving SLOs, analyzing incidents, and implementing systemic fixes
• Participate in and help improve the on-call rotation, leading incident response and post-incident reviews to drive systemic platform improvements
• Partner with SRE, Delivery, InfoSec, and product/ML teams to land high-impact infrastructure changes and platform standards
• Drive improvements in developer experience by simplifying platform usage, reducing toil, and enabling faster product and ML development
• Contribute to cost efficiency initiatives by optimizing resource utilization across Kubernetes and cloud infrastructure
Overview
Some of the world’s most innovative global enterprise software companies struggle to find engineering partners capable of matching their rigorous standards. These teams need a partner that can co-own complex problems from within their own development environment. Enter EverOps, the premier Embedded Service Provider. We partner directly with customer engineering teams to assess and address mission-critical delivery and infrastructure challenges.
The Challenge
EverOps is looking for a Senior DevOps Engineer with a deep mastery of enterprise cloud infrastructure and the ability to drive technical projects autonomously. You will act as a technical anchor, leveraging advanced engineering skills to migrate high-traffic workloads and restructure AWS environments to meet enterprise standards.
The Mission
As a Senior DevOps Engineer, you will join our U.S.-Based Virtual Operating Center, working within a dynamic team to manage and evolve production cloud environments. Your primary mission will involve the strategic containerization of legacy workloads and the architectural separation of accounts to improve security and scalability. You will be expected to lead by example: architecting solutions with Terraform + Atmos, implementing EKS best practices, and mentoring peers to ensure collective success.
What You’ll Do
- Workload Migration: Lead the transition of acquired ad exchange workloads from EC2 into a modern, containerized Amazon EKS architecture.
- Account Restructuring: Execute AWS account separation, moving from shared environments into distinct, isolated accounts (Dev, Staging, Prod) with robust enterprise guardrails.
- Infrastructure as Code: Design and maintain a DRY, component-driven infrastructure using Terraform and Atmos.
- EKS Operations: Architect and operate multi-tenant Kubernetes platforms, focusing on namespace isolation, RBAC models, and cluster security.
- GitOps & CI/CD: Implement and optimize deployments using ArgoCD and GitHub Actions for a seamless, automated SDLC.
- Technical Mentorship: Act as a subject matter expert (SME) within your pod, guiding engineers on complex troubleshooting and architectural best practices.
- Cloud-Native Security: Manage secrets and encryption using AWS Secrets Manager and KMS, ensuring secure cross-account access and Kubernetes integration.
You Have
- Experience: 5+ years of professional experience in DevOps, CloudOps, or SRE, specializing in high-scale Amazon EKS environments.
- Infrastructure Frameworks: Advanced proficiency with Terraform, specifically utilizing Atmos (or similar wrappers) to manage hierarchical configurations across multi-account structures.
- Migration Mastery: Proven track record of migrating complex workloads from EC2 to EKS with minimal downtime.
- Multi-Account Governance: Deep experience with AWS Organizations and Landing Zone patterns to enforce environment isolation.
- Enterprise Guardrails: Ability to implement security guardrails using IAM Permission Boundaries and SCPs.
- Containerization: Expert knowledge of Docker and Kubernetes-native networking.
- Coding: Proficiency in Golang, Python, or Bash for building custom automation and migration tooling.
- AWS Security: Production experience with AWS Secrets Manager, KMS, and integration patterns like External Secrets Operator for EKS.
- Observability: Experience implementing enterprise monitoring suites like Datadog, Prometheus, or Grafana.
Extra Awesome
- Progressive Delivery: Production experience with Argo Rollouts for Canary and Blue/Green deployments.
- FinOps & Cost Governance: Experience with cost allocation, tagging strategies, and tools like KubeCost to manage spend during account migrations.
- Policy as Code: Experience implementing OPA (Open Policy Agent) or Kyverno to enforce compliance within EKS.
- Scale & Performance: Background in high-transaction industries (AdTech, Gaming, or Fintech) where platform stability is critical.
- Platform Engineering: A mindset toward building internal developer platforms (IDPs) that allow for self-service within the new EKS accounts.
- Certifications: AWS Certified Solutions Architect (Pro) or Certified Kubernetes Administrator (CKA).
Benefits
- 100% Remote Workplace: We’ve been remote since Day 1!
- Unlimited Paid Time Off.
- Equity: Become a true owner of the company.
- 401K with company contribution and sponsored healthcare.
- Professional Growth: Access to training and certification programs to accelerate your career.
DevOps Engineer
GTGT provides clients with offshore product teams from CEE, a product development studio & data science services.
• Take ownership of rebuilding and optimizing the DevOps setup in a complex, high-load environment
• Analyze existing Azure infrastructure, Terraform configurations, and Kubernetes environments
• Reverse-engineer or redesign CI/CD pipelines currently managed by an external vendor
• Define the best approach for rebuilding or replicating the existing setup
• Provide recommendations around ownership, architecture, and transition strategy
• Design and implement CI/CD pipelines (GitHub → Azure)
• Automate infrastructure and deployment processes using Terraform and scripting
• Ensure full ownership, documentation, and maintainability of pipelines
• Manage and optimize Azure services and Kubernetes clusters
• Ensure scalability, reliability, and performance of production systems
• Apply best practices around infrastructure, deployment, and system stability
• Work closely with backend engineers, system engineers, and integration teams
• Provide documentation and knowledge transfer to support long-term in-house ownership
• Support the transition towards a fully internal DevOps capability
• Collaborate with stakeholders to meet deadlines
• Participate in regular syncs and provide progress updates



