Job Closed

This listing is no longer active.

DraftKings Inc.

Defining what it means to build and deliver the most extraordinary sports & entertainment experiences. The Crown is Yours

Lead Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2012 | H1B No Sponsor

Location

United States

Posted

88 days ago

Salary

$148K - $185K / year

Seniority

Senior

Job Description

  • Lead SRE initiatives across multiple projects and products, collaborating with cross-functional teams to shape platform and infrastructure engineering efforts across the organization.
  • Drive technical excellence by mentoring and guiding engineers, fostering a culture of continuous learning and innovation.
  • Architect and automate self-healing, fault-tolerant infrastructure with declarative configurations, GitOps, and event-driven automation for scalable deployments across public clouds and on-premise.
  • Design, develop, and maintain software-driven infrastructure automation to build internal tools and eliminate repetitive operational tasks.
  • Own and drive decisions on product deployment, performance tuning, monitoring, and alerting to ensure high availability and system efficiency in production.
  • Define key metrics and SLAs around new web services being created to support our rapid traffic growth.
  • Design and implement monitoring and alerting strategies to enforce application SLAs.

Job Requirements

  • At least 6 years of experience managing distributed cloud environments (GCP, AWS, vSphere, Nutanix) and platform automation at scale.
  • Deep expertise in container orchestration (Kubernetes) and container runtimes (Docker, containerd), with the ability to design, scale, and troubleshoot complex workloads.
  • Expert-level understanding of networking and web concepts, with the ability to debug issues down to the packet level.
  • Strong experience developing software for automation and infrastructure tooling (Go, Python).
  • Strong understanding of Linux-based operating systems, including performance tuning, bootloaders, storage, partitioning, kernel debugging, and low-level system optimizations.
  • Experience with Infrastructure as Code (IaC) and configuration management tools (Terraform, Ansible, Chef, etc.), ensuring scalable and repeatable infrastructure provisioning.
  • Understanding of applications written in various programming languages (C#/.NET, Java, Elixir, Ruby, etc.).
  • Experience in AWS Greengrass IoT management and A/B booting.

Benefits

  • bonus
  • equity
  • benefits as applicable

More DevOps Engineer Jobs

Site Reliability Engineering Manager

Flywire

Delivering the most important & complex payments.

DevOps Engineer | 88 days ago
Other | Remote | Team 1,001-5,000 | Since 2011 | H1B Sponsor

  • Help drive reliability, automation and performance within cloud-based infrastructure
  • Coordinate and support daily activities for SREs on the team
  • Work on issues of limited scope and execute solutions to routine problems
  • Become embedded within an Engineering team advocating for best practices
  • Mentor team members and drive initiatives
  • Debug production issues across services and levels of the stack
  • Identify opportunities both in processes and tools to improve team productivity
  • Participate in an on-call shift along with other disciplines to respond to incidents
  • Lean into business domain and needs as well as company vision, mission and strategy

Massachusetts
$160K - $200K / year

Senior Site Reliability Engineer

Customer.io

Customer.io helps companies communicate with their customers in a more authentic and human way. Its versatile marketing automation platform helps “bring humanity to business comm

DevOps Engineer | 88 days ago

  • Build and scale infrastructure to support billions of messages per day and real-time events
  • Automate deployments, alerting, and incident response
  • Make our on-call better - clear alerts, solid documentation, and faster resolution
  • Tune MySQL and other datastore performance and improve reliability across distributed systems
  • Collaborate across teams to debug, ship, and support systems in production
  • Share knowledge and raise the bar through sharing your progress publicly with short videos, thoughtful writing, and mentorship
  • Leverage AI tools to prototype, move faster, and make better decisions

United States
$140K - $180K / year
Job Closed

Senior Site Reliability Engineer

Customer.io

Customer.io helps companies communicate with their customers in a more authentic and human way. Its versatile marketing automation platform helps “bring humanity to business comm

DevOps Engineer | 88 days ago

  • Build and scale infrastructure to support billions of messages per day and real-time events
  • Automate deployments, alerting, and incident response
  • Make our on-call better - clear alerts, solid documentation, and faster resolution
  • Tune MySQL and other datastore performance and improve reliability across distributed systems
  • Collaborate across teams to debug, ship, and support systems in production
  • Share knowledge and raise the bar through sharing your progress publicly with short videos, thoughtful writing, and mentorship
  • Leverage AI tools to prototype, move faster, and make better decisions

Europe
$140K - $180K / year
Job Closed

Senior DevSecOps Engineer, AI Enablement

CACI International Inc

Expertise and Technology for National Security

DevOps Engineer | 88 days ago
Other | Remote | Team 10,001+ | Since 1962 | H1B No Sponsor

  • Join CACI’s AI Enablement team as a Senior DevSecOps Engineer delivering rapid GenAI infrastructure and CI/CD capabilities through 1–2 month program engagements.
  • Deploy secure pipelines, containerized platforms, cloud environments, and managed AI services while coaching program teams to operate and evolve systems independently.
  • Enhance our solution catalog by refining IaC templates and contributing new infrastructure patterns from field experience.
  • Rapidly deploy GenAI infrastructure across AWS, Azure, and on‑prem using catalog templates.
  • Implement and operationalize containerized platforms; train teams on deployment and troubleshooting.
  • Establish production readiness standards including observability, reliability, and documentation.
  • Build and refine GitLab CI/CD pipelines with security scanning and deployment automation.
  • Configure identity and access management (Keycloak or similar) with OIDC/SAML.
  • Lead workshops, pair‑programming, and reviews to build program team capabilities.
  • Develop reusable Terraform modules and IaC patterns for networking, IAM, and GenAI infrastructure.
  • Document architecture decisions, lessons learned, and best practices.
  • Improve catalog templates and tooling based on recurring field challenges.

United States
$98.5K - $206.8K / year
Job Closed