ServiceNow

ServiceNow provides cloud-based services that automate enterprise information technology operations. As an employer, ServiceNow offers a challenging, collaborat

Distinguished Engineer - Multi-Cloud - Control Plane - Kubernetes

Cloud Engineer | Full Time | Remote | Lead | Team 29,000 | Since 2004

Location

California

Posted

1 day ago

Salary

$254.5K - $445.4K / year

Seniority

Lead

English

Job Description

Company Description

It all started when engineer Fred Luddy wrote code that automated a tedious task for his coworker, Phyllis. She cried tears of joy. That moment inspired Fred to build a company that could do that for everyone, freeing people from busywork so they could focus on meaningful work. Today, ServiceNow is the AI control tower for business reinvention. Our ServiceNow AI platform brings together any AI, any data, and any workflow, helping 85% of the Fortune 500® work smarter, faster, and better. We're building an AI-native culture where technology and talent are unstoppable together. And we're just getting started. Join us to put AI to work for people.

Job Description

Position Location: This is a Flexible (Hybrid) position. Flexible positions require 2 days per week in a ServiceNow office. We have offices in several locations, including Santa Clara, CA, San Diego, CA, and Kirkland, WA.

Cloud Platform - Kubernetes, Hyperscalers & Distributed Systems

We are currently seeking a Distinguished Engineer. In this role, you will have org-wide technical authority, giving you the ability to set technical direction across multiple engineering organizations. You will own the hardest, most ambiguous problems in the platform domain.

What you get to do in this role:

- You will set the technical direction for our cloud-native platform across multiple engineering teams and organizations, defining the architecture and standards for how Kubernetes, distributed systems, and hyperscaler infrastructure are built and operated at scale.
- You will own the hardest, most ambiguous technical problems in the platform domain - multi-cloud topology, control-plane design, workload isolation, identity and trust fabric, and reliability at the scale of hundreds of clusters and dozens of product workloads.
- You will partner with engineering fellows, principal engineers, and other senior technical leaders to drive consistent architectural decisions and the adoption of best practices across the entire platform ecosystem.
- You will identify and mitigate the biggest technical risks in initiatives with C-suite visibility, and you will be the person leadership trusts to make the call on managed vs. self-managed tradeoffs, substrate portability, and multi-hyperscaler strategy.
- Where you see the need, you will personally design and build the critical components - control planes, operators, infrastructure abstractions, and the systems other teams build on top of.
- You will mentor staff and principal engineers and shape the next generation of the organization's technical leadership.

Qualifications

To be successful in this role you have:

- Experience leveraging or critically thinking about how to integrate AI into engineering and platform work - whether using AI-powered tooling, automating operational workflows, building agentic systems for fleet visibility and operations, or reasoning about AI's impact on how infrastructure is built and run.
- 12+ years of experience designing, building, and operating large-scale distributed systems in production, with deep expertise running Kubernetes at scale (multi-cluster, multi-region, multi-tenant).
- Hands-on, authoritative experience across one or more major hyperscalers (AWS, Azure, GCP), including their managed Kubernetes offerings (EKS, AKS, GKE) and the networking, IAM, and capacity tradeoffs that come with each.
- A proven track record of providing technical leadership across multiple engineering teams, influencing architecture and direction without relying on positional authority.
- Deep expertise in the core building blocks of a modern platform: the operator/controller pattern, infrastructure-as-code and control planes (e.g., Crossplane), GitOps-based delivery, container networking (CNI), and service mesh.
- Strong programming skills in Go and fluency across the cloud-native ecosystem.

It also helps if you have:

- Experience designing identity and trust fabrics for distributed systems - workload identity, mTLS, and standards such as SPIFFE/SPIRE.
- Experience with multi-tenant workload isolation and runtime security (e.g., Kata Containers, sandboxed runtimes).
- Experience building and operating platforms for regulated or federal markets (FedRAMP, air-gapped or self-hosted distribution, OCI bundling).
- Experience with observability at scale - metrics, tracing, and SLO-driven operations across a large fleet.
- A platform-as-product mindset: treating internal engineering teams as customers and the platform as a product with a roadmap, contracts, and a delivery pipeline.

For positions in this location, we offer a base pay of $254,500 - $445,400, plus equity (when applicable), variable/incentive compensation and benefits. Sales positions generally offer a competitive On Target Earnings (OTE) incentive compensation structure. Please note that the base pay shown is a guideline, and individual total compensation will vary based on factors such as qualifications, skill level, competencies, and work location. We also offer health plans, including flexible spending accounts, a 401(k) Plan with company match, ESPP, matching donations, a flexible time away plan and family leave programs. Compensation is based on the geographic location in which the role is located and is subject to change based on work location.

Additional Information

Work Personas

We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.

Equal Opportunity Employer

ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, creed, religion, sex, sexual orientation, national origin or nationality, ancestry, age, disability, gender identity or expression, marital status, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.

Accommodations

We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact globaltalentss@servicenow.com for assistance.

Export Control Regulations

For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.

From Fortune. ©2026 Fortune Media IP Limited. All rights reserved. Used under license.

More Cloud Engineer Jobs


Cloud Platform Engineer

ClubRunner

Membership Success Platform

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Manage and evolve Azure cloud infrastructure and platform services across production and non-production environments
• Improve and support CI/CD pipelines, automation, monitoring, observability, and operational processes
• Troubleshoot and resolve infrastructure, networking, deployment, security, and production issues
• Lead initiatives related to Infrastructure-as-Code, resiliency, disaster recovery, scalability, and operational maturity
• Partner with development teams to improve deployment reliability, platform standards, and cloud best practices
• Manage cloud networking, secure connectivity, certificates, firewalls/WAF, and related Azure services
• Optimize cloud performance, reliability, and Azure operational costs
• Participate in architecture discussions, technical planning, documentation, and on-call support for critical production incidents

Canada
$95K - $125K / year

Lead Cloud & Platform Service Manager

McKesson

Sarah Cannon Research Institute (SCRI) is one of the world’s leading oncology research organizations conducting community-based clinical trials. Focused on advancing therapies for patients over the last three decades, SCRI is a leader in drug development. In 2022, SCRI formed a joint venture with former US Oncology Research to expand clinical trial access across the country. It has conducted more than 850 first-in-human clinical trials since its inception and contributed to pivotal research that has led to the majority of new cancer therapies approved by the FDA in the past decade. SCRI’s research network brings together more than 1,300 physicians who are enrolling patients into clinical trials at more than 200 locations in 20+ states across the U.S.

Full Time | Remote | Team 10,001+ | Since 1833 | H1B Sponsor

Role Description

Oversee the health and strategic direction of cloud and platform computing systems, essential for supporting US Oncology practices. Responsible for crafting and executing the US Oncology cloud and hosted computing service roadmap, ensuring alignment with McKesson standards and stakeholder expectations.

- Collaboration with customer success teams to ensure project management and technology service delivery meet objectives and adhere to committed timelines.
- Strict adherence to technology standards to guarantee reliable service delivery.
- Support for both traditional and cloud-hosted platform services across 450+ sites in the United States.
- Ensure platform services align with the unified objectives of McKesson and the US Oncology Network strategy.
- Conduct regular service and roadmap reviews to align and prioritize initiatives with key stakeholders.
- Maintain a culture of performance, transparency, cost efficiency, and quality improvement while building trust and credibility throughout the organization.
- Leadership and oversight of major incident management (MIM) for USON cloud and platform services.
- Leadership in cloud migration initiatives, driving the adoption of cloud services across US Oncology to improve patient experience.
- Maintain documentation that supports the USON environment, including non-standard configurations.
- Deliver metrics on KPIs required to support service health and initiatives.
- Collaborate with internal and external technology partners to deliver strategic initiatives.
- Establish priorities, provide guidance, and secure engagement and commitment from teams.
- Manage requests, provide timely approvals, and enforce governance.
- Partner with MT managed services to support get-to-green programs.
- 100% telecommuting allowed from any location in the U.S. 10% domestic travel required.

Qualifications

- Bachelor’s Degree in Computer Science, Electronics Engineering or a related field.
- Ten (10) years of experience in the job offered or a related field.
- Five (5) years demonstrated experience in:
  - Conducting Cloud Assessment and discovery to evaluate existing infrastructure and identify opportunities for cloud adoption.
  - Managing multiple customer deliveries globally, demonstrating ability to handle complex projects across different regions.
  - Utilizing public cloud platforms such as AWS and Azure to optimize performance and scalability.
- Three (3) years demonstrated experience in:
  - Leading Cloud transformation and migration projects, ensuring seamless transition of applications and data to cloud environments.
  - Driving application transformation to Cloud.
  - Implementing Cloud Governance strategies focusing on cost management, compliance, security, availability, and reliability.
  - Providing hands-on expertise in delivery, effectively leading teams and driving projects to completion.
  - Overseeing MIM (Incident Management) processes to ensure timely resolution of issues and minimizing downtime.
  - Providing service management.
  - Working with platform technologies and cloud hosting support/engineering in a 24x7 Enterprise environment.
  - Performing business continuity planning, disaster recovery, high availability technologies, and implementations, including troubleshooting and configuration management.
  - Providing leadership in a matrixed organization, delivering beyond reporting structures.
  - Working with Microsoft Azure and rolling out and managing server and platform services at scale.
  - Debugging/resolving system failures in production environments.

Requirements

- Applicant must have five (5) years demonstrated experience in each of the following:
  - Conducting Cloud Assessment and discovery to evaluate existing infrastructure and identify opportunities for cloud adoption.
  - Managing multiple customer deliveries globally, demonstrating ability to handle complex projects across different regions.
  - Utilizing public cloud platforms such as AWS and Azure to optimize performance and scalability.
- Applicant must have three (3) years demonstrated experience in each of the following:
  - Leading Cloud transformation and migration projects, ensuring seamless transition of applications and data to cloud environments.
  - Driving application transformation to Cloud.
  - Implementing Cloud Governance strategies focusing on cost management, compliance, security, availability, and reliability.
  - Providing hands-on expertise in delivery, effectively leading teams and driving projects to completion.
  - Overseeing MIM (Incident Management) processes to ensure timely resolution of issues and minimizing downtime.
  - Providing service management.
  - Working with platform technologies and cloud hosting support/engineering in a 24x7 Enterprise environment.
  - Performing business continuity planning, disaster recovery, high availability technologies, and implementations, including troubleshooting and configuration management.
  - Providing leadership in a matrixed organization, delivering beyond reporting structures.
  - Working with Microsoft Azure and rolling out and managing server and platform services at scale.
  - Debugging/resolving system failures in production environments.

Benefits

- Offered wage: $209,394 – $226,800/year.
- Competitive compensation package as part of Total Rewards.
- Compensation determined by factors including performance, experience, skills, equity, job market evaluations, and geographical markets.
- Other compensation may include annual bonus or long-term incentive opportunities.

Contact

To apply, please send resumes to JobPostings@McKesson.com. Reference #: 002207.

United States
$209.4K - $226.8K / year

Role Description

Linux Vulnerability Remediation Engineer (Server Infrastructure – RHEL 7/8/9/10)

Remote, Full-time

Key Responsibilities

- Vulnerability Remediation & Patch Management:
  - Own and execute end-to-end remediation for vulnerabilities identified on Linux servers (RHEL 7/8/9), including OS/package patching and configuration hardening.
  - Fast-track and manage all Meridian-related remediation requirements as they are received, ensuring adherence to defined SLAs and audit expectations.
  - Triage vulnerability findings (primarily from Qualys) and translate them into actionable remediation plans, considering exploitability, criticality, asset tiering, and operational risk.
  - Coordinate remediation activities for:
    - Kernel and package updates (YUM/DNF), security errata, and required reboots where applicable.
    - CIS/STIG-aligned configuration changes (as applicable in the environment).
    - Mitigations/compensating controls when immediate patching is not feasible (documented and approved per process).
- Automation, Configuration Management & Engineering:
  - Develop, enhance, and maintain remediation automation using:
    - Chef (cookbooks/recipes, attributes, templates, policy files as applicable)
    - Ansible (playbooks, roles, inventories, modules)
    - Shell scripting (Bash) and Ruby for server-side automation and custom remediation logic
  - Convert recurring manual remediation steps into repeatable automated solutions and standardized runbooks.
  - Ensure code follows internal engineering standards: version control, peer review, testing, documentation, and change management.
- Validation, Closure & Reporting:
  - Validate remediation effectiveness by re-scanning and verifying closure in Qualys (and/or approved internal validation methods).
  - Confirm fixes did not introduce regressions; coordinate with application and platform teams for post-change verification.
  - Maintain accurate documentation of remediation actions, approvals, exceptions, and closure evidence to support audit and compliance needs.
  - Provide progress updates, metrics, and risk status to stakeholders (e.g., open critical/high items, aging items, SLA adherence).
- Cross-Team Coordination & Operational Execution:
  - Schedule and lead remediation calls with infrastructure support teams, application owners, and other stakeholders to drive timely execution.
  - Work within change management processes: create/execute change plans, develop rollback steps, and coordinate maintenance windows.
  - Partner with platform engineering to improve standard server baselines and prevent vulnerability recurrence.
- Vendor & Release Coordination (as needed):
  - Follow up with vendors (e.g., Red Hat or software providers) for patch availability, release schedules, and remediation guidance when vulnerabilities require vendor action.
  - Track advisories (RHSA/RHBA) and coordinate planned rollout timelines where applicable.

Qualifications

- 6-10 years of strong hands-on experience with RHEL 7/8/9/10 in enterprise environments.
- Proven experience driving vulnerability remediation and patch management for Linux servers.
- Expertise with Qualys (or equivalent vulnerability scanners) including interpreting findings, false-positive validation, and closure verification.
- Automation experience with Chef and/or Ansible in production.
- Strong scripting skills: Bash, plus working proficiency in Ruby (or ability to maintain/extend existing Ruby codebases).
- Understanding of Linux security fundamentals (permissions, services, SSH hardening, package management, kernel considerations).
- Experience working with change management, incident/problem management, and coordinating across multiple support teams.

Preferred Qualifications

- Familiarity with compliance/security frameworks (e.g., CIS benchmarks, STIG concepts) as applied to Linux servers.
- Experience with CI/CD or automated testing for infrastructure code (linting, unit/integration testing where applicable).
- Experience operating in large-scale environments (hundreds/thousands of servers) with tiered production controls.
- Working knowledge of container host hardening and server-side runtime dependencies (if applicable to the server fleet).

Key Skills & Competencies

- Remediation prioritization and risk-based decision making.
- Strong troubleshooting and root-cause analysis (package conflicts, dependency issues, service impacts).
- Clear communication and ability to drive closure across stakeholders.
- Documentation discipline and audit readiness mindset.
- Ability to deliver under tight timelines while maintaining system stability.

Deliverables / Success Measures

- Reduction in open Patch NOW/Critical/High vulnerabilities and improved SLA compliance.
- Consistent, repeatable remediation through Chef/Ansible automation.
- Verified closures in Qualys with clear evidence and minimal re-open rates.
- Improved remediation cycle time for Meridian requirements and other prioritized findings.
- Fewer recurring vulnerability patterns through baseline improvements and preventive controls.

United States