Senior Site Reliability Engineer, Node Platform
Location
United States
Posted
74 days ago
Seniority
Senior
Job Description
Chainlink Labs
- You will design and build the infrastructure primitives that define how Chainlink Decentralized Oracle Networks (DONs) scale across internal systems and the decentralized ecosystem.
- You will help create the CRE (Kubernetes-based) control plane that enables:
  - Deterministic horizontal scaling of DONs
  - Safe and repeatable infrastructure expansion
  - Improved operational efficiency and scalability
- You will develop the core infrastructure components, including Kubernetes Operators and scaling automation, that Product teams will adopt and that might later be distributed to external node operators to improve decentralized scaling.
Job Requirements
- 6–9+ years in SRE / Platform / Infrastructure Engineering
- Proven experience scaling Kubernetes in high-throughput production environments
- Deep knowledge of:
  - Scheduler behavior
  - StatefulSets & persistent workloads
  - Autoscaling strategies (HPA, VPA, KEDA, custom scaling)
  - Resource management & performance tuning
  - Multi-cluster and multi-region architectures
- Experience diagnosing production failures at cluster scale
- Strong Terraform or Crossplane experience
- GitOps workflows (ArgoCD / Flux) experience
- CI/CD reliability experience
- Automation-first mindset
- AWS production experience
- Proficiency in Go (strongly preferred) or an equivalent systems language
Benefits
- All roles with Chainlink Labs are global and remote-based.
- We carefully review all applications and aim to provide a response to every candidate within two weeks after the job posting closes.
- Commitment to Equal Opportunity

