Job Closed

This listing is no longer active.

Waymo is a company in the autonomous driving technology space offering self-driving vehicles with the potential to increase mobility and decrease lives lost in

ML Engineer, Foundation Model Infrastructure

Machine Learning Engineer Other Remote Mid Level

Location

United States

Posted

136 days ago

Salary

$204K - $259K / year

Seniority

Mid Level

Python C++ PyTorch JAX TensorFlow Apache Spark Kubeflow

Job Description

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

The mission of the Waymo AI Foundations team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. This role follows a hybrid work schedule and you will report to a Senior Research Scientist.

- Build and operate the petabyte-scale data systems and ML pipelines at the heart of Waymo's foundation model development
- Shepherd cutting-edge foundation models from research prototypes to robust components within the Waymo Driver
- Create the automated infrastructure for rigorously benchmarking, continuously monitoring, and safely releasing models
- Wield large-scale compute and frameworks like Flume and JAX to process massive datasets and train/deploy complex models
- Drive significant leaps in the speed, reliability, and efficiency of the end-to-end ML development lifecycle
- Partner with AI Foundations, ML, and Platform experts to transform model innovations into tangible on-road improvements

Qualifications

- Master's degree in Computer Science, Machine Learning, Robotics, similar technical field of study, or equivalent practical experience
- Proficiency in Python
- Proficiency in C++
- Familiarity with one of the modern deep learning frameworks (e.g., PyTorch, JAX, TensorFlow)
- Experience building or maintaining large-scale data pipelines or ML infrastructure (e.g., Flume, Spark, Borg, Kubeflow)

Requirements

- Strong hands-on SWE skills, able to drive development of large, complex shared codebases
- Experience in AV planning and related research
- Experience designing and building distributed systems or MLOps platforms (e.g., model versioning, experiment tracking, CI/CD for ML)
- Prior work in an industrial or research setting developing methodologies for the evaluation of ML models

Benefits

- Eligible to participate in Waymo’s discretionary annual bonus program
- Equity incentive plan
- Generous Company benefits program, subject to eligibility requirements

Salary Range

$204,000 - $259,000 USD

Job Requirements

Master's degree in Computer Science, Machine Learning, Robotics, similar technical field of study, or equivalent practical experience
Proficiency in Python
Proficiency in C++
Familiarity with one of the modern deep learning frameworks (e.g., PyTorch, JAX, TensorFlow)
Experience building or maintaining large-scale data pipelines or ML infrastructure (e.g., Flume, Spark, Borg, Kubeflow)
Strong hands-on SWE skills, able to drive development of large, complex shared codebases
Experience in AV planning and related research
Experience designing and building distributed systems or MLOps platforms (e.g., model versioning, experiment tracking, CI/CD for ML)
Prior work in an industrial or research setting developing methodologies for the evaluation of ML models

Benefits

Eligible to participate in Waymo’s discretionary annual bonus program
Equity incentive plan
Generous Company benefits program, subject to eligibility requirements
Salary Range
$204,000 — $259,000 USD

More Machine Learning Engineer Jobs

Member of Technical Staff, Inference

Runway

Business financials got stuck in the 15th century so we're showing them today’s computers 🖥

Machine Learning Engineer 136 days ago

Other Remote Team 11-50 Since 2018 H1B Sponsor

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

We're looking for an ML infrastructure engineer to bridge the gap between research and production at Runway. You'll work directly with our research teams to productionize cutting-edge generative models, taking checkpoints from training to staging to production, ensuring reliability at scale, and building the infrastructure that enables fast iteration. You'll be embedded within research teams, providing platform support throughout the entire model development lifecycle. Your work will directly impact how quickly we can ship new models and features to millions of users.

A peek at our technical stack

- API endpoints for real-time collaboration and media asset management written in TypeScript, running in ECS containers on AWS Fargate.
- Leverage multiple AWS-native components, such as S3, CloudFront, Lambda, Kinesis, and SQS.
- Inference backend written in Python (PyTorch, TorchScript), deployed across multiple clusters/cloud providers.
- Use Kubernetes for container orchestration, with k8s-native components such as Flyte, Kueue, and Kyverno for efficient job orchestration.
- Invest in Prometheus and Grafana for monitoring, and Terraform to manage infrastructure.

Qualifications

- 4+ years of experience running ML model inference at scale in production environments.
- Strong experience with PyTorch and multi-GPU inference for large models.
- Experience with Kubernetes for ML workloads: deploying, scaling, and debugging GPU-based services.
- Comfortable working across multiple cloud providers and managing GPU driver compatibility.
- Experience with monitoring and observability for ML systems (errors, throughput, GPU utilization).
- Self-starter who can work embedded with research teams and move fast.
- Strong systems thinking and pragmatic approach to production reliability.
- Humility and open-mindedness; at Runway we love to learn from one another.

Requirements

- Experience building custom inference frameworks or serving systems (Nice to Have).
- Deep understanding of distributed training and inference patterns (FSDP, data parallelism, tensor parallelism) (Nice to Have).
- Ability to debug low-level issues: NCCL networking problems, CUDA errors, memory leaks, performance bottlenecks (Nice to Have).
- Experience with diffusion models or video generation systems (Nice to Have).
- Knowledge of real-time or latency-sensitive ML applications (Nice to Have).

Benefits

- Salary range: $240,000 - $290,000.
- Commitment to creating a space where employees can bring their full selves to work and have equal opportunity to succeed.

Company Description

Runway strives to recruit and retain exceptional talent from diverse backgrounds while ensuring pay equity for our team. Our salary ranges are based on competitive market rates for our size, stage, and industry, and salary is just one part of the overall compensation package we provide. There are many factors that go into salary determinations, including relevant experience, skill level and qualifications assessed during the interview process, and maintaining internal equity with peers on the team. The range shared below is a general expectation for the function as posted, but we are also open to considering candidates who may be more or less experienced than outlined in the job description. In this case, we will communicate any updates in the expected salary range. Lastly, the provided range is the expected salary for candidates in the U.S. Outside of those regions, there may be a change in the range, which again, will be communicated to candidates. We're excited to be recognized as a best place to work by Crain's, InHerSight, BuiltIn NYC, and INC.

United States + 171 more

$240K - $290K / year

Apply

Job Closed

ML Engineer, Foundation Model Evaluation

Waymo

Waymo is a company in the autonomous driving technology space offering self-driving vehicles with the potential to increase mobility and decrease lives lost in

Machine Learning Engineer 136 days ago

Other Remote

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

The mission of the Waymo AI Foundations team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. This role follows a hybrid work schedule and you will report to a Senior Research Scientist.

- Develop and extend cutting-edge research in robotics and machine learning to advance state-of-the-art methodologies for evaluating the quality, safety, and realism of embodied AI agents
- Partner within and across organizations to land disruptive and innovative tech in production
- Work with a variety of state-of-the-art Foundation Models
- Drive model development through defining evaluation and benchmarks
- Implement and extend large scale data and evaluation pipelines

Qualifications

- Master's degree in Computer Science, Machine Learning, Robotics, similar technical field of study, or equivalent practical experience
- Proficiency in Python
- Familiarity with one of the modern deep learning frameworks (e.g., PyTorch, JAX, TensorFlow)
- Prior work in an industrial or research setting developing methodologies for the evaluation of ML models

Requirements

- Strong hands-on SWE skills, able to design, implement, and extend large distributed pipelines
- Track record of publications in top-tier conferences or leading open source projects in the related fields
- Proficiency in C++
- Experience in AV planning and related research
- Experience in labeling and curating data for ML eval and training

Benefits

- Eligibility to participate in Waymo’s discretionary annual bonus program
- Equity incentive plan
- Generous Company benefits program, subject to eligibility requirements

Salary Range

The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level.

Salary Range: $170,000 - $216,000 USD

Python PyTorch JAX TensorFlow C++ Distributed Systems

United States

$170K - $216K / year

Apply

Job Closed

Principal Machine Learning Engineer

Grace Hill

Helping owners and operators of real estate increase property performance, reduce operating risk and grow top talent.

Machine Learning Engineer 136 days ago

Other Remote Team 51-200 H1B Sponsor

• Design and implement the statistical models and ML algorithms that drive our market analysis
• Architect how models are trained, versioned, and served in a production environment
• Partner with the product team to design the data foundations for every new feature
• Act as a force multiplier for our full-stack engineers
• Define the HelloData standard for data integrity, pipeline observability, and algorithmic transparency

BigQuery GCP Node.js Pandas PostgreSQL Python PyTorch scikit-learn TypeScript

United States

$175K - $250K / year

Apply

Job Closed

Senior ML Engineer – Neural Rendering

Torc Robotics

Leading autonomous vehicle technology since 2007, Torc develops automated Level 4, Class 8 trucks with Daimler.

Machine Learning Engineer 136 days ago

Full Time Remote Team 501-1,000 Since 2007 H1B Sponsor

• Implement the latest research advances in neural rendering and generative models
• Translate cutting-edge solutions in the domain of autonomous driving for high-quality camera, LiDAR, and radar sensor simulations
• Support implementing a neural rendering framework that allows scaling of perception simulation and AV 3.0 training
• Integrate the framework in a cloud environment and automate the pipeline to allow scaling for the target verification and validation of our autonomous trucks
• Own development projects in the team, from research and design to implementation, testing, and deployment
• Design, implement, test, and deploy shippable production-quality software starting from early prototypes using disciplined software development processes
• Work in the cloud machine learning ecosystem alongside other machine learning services existing in the company
• Proactively assess current capabilities to identify areas for improvement, proposing solutions that align with core strategy and operation
• Demonstrate project management skills, serving as project lead guiding less experienced team members in multiple facets of project execution, coaching and mentoring as needed

Cloud Python PyTorch

Michigan

$177.3K - $234K / year

Apply
