Founded in 2013, Shippo is a logistics and supply company that provides shipping services to retailers, ecommerce platforms, marketplaces, and more. Operating f

Principal Machine Learning Engineer, ML Platform

Machine Learning Engineer, Other, Remote, Lead

Location

Hawaii + 6 more

Posted

104 days ago

Salary

$212K - $287K / year

Seniority

Lead

15 yrs exp, English, Distributed Systems, Kubernetes

Job Description

• Set technical strategy and drive a multi-quarter roadmap for ML platform capabilities aligned to Shippo’s business priorities.
• Own cross-team architecture decisions, RFCs, and design reviews for ML lifecycle and inference.
• Raise the engineering bar through mentorship, production readiness standards, and reusable platform primitives.
• Be accountable for platform adoption, reliability, and cost-performance outcomes.
• Build and operate core ML platform components:
  • ML lifecycle foundation (experiment tracking, reproducibility, artifact management, model registry, versioning, and controlled promotion workflows using MLflow or equivalent).
  • Training and experimentation enablement (standardized environments, reusable pipelines/templates, evaluation harnesses, and repeatable workflows that let data scientists move from exploration to production with confidence).
  • Kubernetes-native model serving for real-time inference (safe rollout and rollback, autoscaling, reliability practices, and cost controls).
  • Batch inference and scoring pipelines (repeatable backfills, retraining triggers, consistent packaging between training and inference).
  • Observability for ML systems (service health metrics, alerting, and model-quality signals such as drift and data quality).
  • Developer experience (templates, reference implementations, documentation, and self-service workflows).
• Evaluate and recommend inference frameworks and deployment patterns, and document tradeoffs for Shippo’s workloads.
• Identify and resolve performance bottlenecks across the inference stack (model runtime, compute utilization, networking, serialization, and autoscaling behavior).
• Establish ML engineering standards across training, evaluation, testing, model packaging, CI/CD, production readiness, and incident response.
• Partner with Data Science teams to bridge research and production environments by creating repeatable frameworks, shared standards for code quality and reproducibility, and self-serve paths to deploy models safely.
• Collaborate with Data and Engineering teams to ensure the platform supports real workflows, drives adoption, and meets reliability expectations.
• Mentor engineers through design reviews, architecture guidance, and shared best practices across platform and ML development.

Job Requirements

15+ years of software engineering experience, including ownership of production systems (platform, infrastructure, or distributed systems).
4+ years owning ML systems end-to-end in production, including on-call and incident response, and making architecture decisions based on operational constraints (latency, throughput, availability, and cost).
Strong experience building and running services on Kubernetes, including deployments, autoscaling, and observability.
Hands-on experience with ML lifecycle tooling such as MLflow or equivalent (tracking, registry, packaging, and promotion workflows).
Demonstrated ability to evaluate inference tradeoffs across batch and real-time serving, CPU versus GPU, latency and throughput, cost, and operational complexity.
Demonstrated Principal-level technical leadership, including setting technical direction, driving cross-team alignment via RFCs/design reviews, and delivering multi-quarter roadmaps.
Proven ownership of reliability and operational outcomes for production systems (SLOs, incident response, and measurable improvements in stability and performance).
Demonstrated ability to ship incrementally, prioritize production reliability over perfect solutions, and drive adoption through pragmatic platform design.
Experience working with or evaluating managed ML platforms (Databricks, SageMaker, Vertex AI, or similar), with clear judgement on strengths, limitations, and build-vs-buy decisions.
Bonus: Databricks experience (useful, not required), including Databricks workflows and ML tooling integration.
Experience with inference and serving frameworks.
Experience with feature store patterns, online and offline consistency, and model evaluation at scale.
Experience supporting optimization systems and decision engines in production.
LLM or agent workflow experience, especially evaluation harnesses, deployment patterns, guardrails, and monitoring.

Benefits

Healthcare coverage for medical, dental, and vision (90% covered by the company, incl. dependents).
Pet coverage is also available!
Take-as-much-as-you-need vacation policy & flexible working hours
One week-long, company-wide winter slowdown
3 Volunteer Days Off (VTOs)
WFH stipend to set up your home office
Charity donation match up to $100
Dedicated programs, coaching, tools, and resources for your professional and career growth as well as an individual learning stipend for your personal and focused growth
Fun in-person team time through our Shippos Everywhere program, which includes regular team and company off-sites throughout the year as well as local Shippos gatherings.

More Machine Learning Engineer Jobs

Senior Machine Learning Engineer

Brahma

The only account you'll ever need to secure, transact, and explore onchain like never before.

Machine Learning Engineer, 104 days ago

Full Time, Remote, Team 11-50, Since 2022, H1B No Sponsor

• Research and build deep learning systems that can generate expressive, natural-sounding speech from text or audio prompts.
• Collaborate with cross-functional teams to integrate your work into production-ready pipelines.
• Help build state-of-the-art generative systems for video and audio synthesis, performance transfer, and visual translation.
• Shape core infrastructure, set best practices, communicate with stakeholders, and mentor junior engineers while delivering high-quality ML pipelines in production environments.

AWS, Google Cloud Platform, Python, PyTorch, TensorFlow

United Kingdom

Head of Machine Learning

Brahma

The only account you'll ever need to secure, transact, and explore onchain like never before.

Machine Learning Engineer, 104 days ago

Full Time, Remote, Team 11-50, Since 2022, H1B No Sponsor

• Define and execute the strategic roadmap for generative ML research aligned with company objectives
• Lead research initiatives in multimodal generative models (video, audio, language), temporal consistency techniques, and multimodal generation, with a main focus on video generation
• Build, lead, and mentor a diverse, high-performing ML team
• Partner with product, engineering, and creative teams to integrate ML innovations into production systems
• Contribute to the company's overall technical and product strategy.

Python, PyTorch, TensorFlow

India

Machine Learning Engineer

MaxanaPay

Empowering businesses one transaction at a time

Machine Learning Engineer, 104 days ago

Other, Remote, Team 11-50, Since 2019

• Design, build, and deploy production-ready machine learning models that power critical features across our platform
• Collaborate with your team to transform research prototypes into robust, scalable, and maintainable ML systems
• Work closely with product and engineering teams to understand business problems and define technical ML solutions
• Improve recommendation algorithms that personalize user experiences and drive engagement
• Develop and maintain ML pipelines that can reliably process large volumes of data (LLM)
• Monitor model performance in production and implement strategies to detect and address model drift
• Participate in code reviews and mentor junior engineers in ML best practices
• Stay current with the latest advancements and evaluate new techniques for potential application
• Work on cutting-edge projects that are pivotal to our Fortune 500 client

Python, PyTorch, scikit-learn, Apache Spark, SQL, TensorFlow

United States

$150K - $160K / year

Job Closed

Senior Machine Learning Engineer

MaxanaPay

Empowering businesses one transaction at a time

Machine Learning Engineer, 104 days ago

Other, Remote, Team 11-50, Since 2019

• Work closely with Machine Learning Engineers to understand, refine, and prioritize requirements
• Design and build our model serving service with simple, powerful APIs to capture our users' and clients' needs
• Build our core ML model lifecycle management system to provide an ML-aware release and deployment experience
• Improve machine learning data quality by using and building tools to automatically detect issues
• Create intelligent ML-aware real-time monitoring and observability systems
• Work closely with partner teams to integrate with other ML tools to create a seamless end-to-end experience
• Leverage open-source technologies like Kubeflow, Kubernetes, Spark, Docker, Airflow, TensorFlow, and PyTorch

Airflow, Docker, Kubernetes, Python, PyTorch, Scala, Apache Spark, TensorFlow

United States

$120K - $160K / year

Job Closed
