Staff Machine Learning Engineer, AI Serving
Location
United States
Posted
22 days ago
Salary
$253.3K - $354.6K / year
Seniority
Lead
Job Description
Reddit, Inc.
- Lead the end-to-end design, implementation, and maintenance of a highly available, low-latency GPU-based model serving system for search, ranking, and LLMs supporting millions of QPS.
- Design and develop ML and Generative AI systems in cloud-based production environments on Kubernetes at scale.
- Rapidly develop prototypes and build a high-performance feature hydration and processing system as part of the inference stack, including routing, caching, and batching.
- Lead a unified GPU model export framework to support converting trained models into optimized GPU inference models.
- Strong understanding of real-time ML observability to track feature/model performance.
- Experience working with LLM serving online at scale.
- Built an E2E inference performance benchmarking framework.
- Deep understanding of multi-cluster compute environments and network topology specific to ML inference use cases.
Job Requirements
- 7+ years of experience in ML Engineering, AI Platform Engineering, or Cloud AI Deployment roles.
- Experience operating orchestration systems such as Kubernetes at scale.
- Deep experience with cloud-based technologies for supporting an ML platform, including tools like AWS, Google Cloud Storage, infrastructure-as-code (Terraform), and more.
- Proficiency with the common programming languages and frameworks of ML, such as Go, Python, etc.
- Excellent communication skills, with the ability to articulate technical AI concepts to non-technical stakeholders.
- Strong focus on scalability, reliability, performance, and ease of use. You are an undying advocate for platform users and have a deep intuition for the genAI product development lifecycle.
- Strong knowledge of model serving, inference pipelines, monitoring, and observability for AI systems is a plus.
- Strong proficiency in Python and deep experience with modern AI/ML frameworks (Triton, Dynamo, vLLM, PyTorch).
Benefits
- Comprehensive Healthcare Benefits and Income Replacement Programs
- 401k with Employer Match
- Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
- Family Planning Support
- Gender-Affirming Care
- Mental Health & Coaching Benefits
- Flexible Vacation & Paid Volunteer Time Off
- Generous Paid Parental Leave
More Machine Learning Engineer Jobs
At Bose Corporation, we believe sound is the most powerful force on earth, and for over 60 years, we have been a company built on innovation, excellence, and independence. Privately owned, fiercely customer-focused, and driven by our values, we continue to lead industries and transform lives through sound. Today, Bose Corporation is entering an exciting new era. Across multiple global Business Units and Global Functions, we are shaping the future of audio technology, automotive, luxury, and premium experiences. We invite you to join us in this transformation.
Job Description
Timeframe: June 1 – August 21, 2026
THE ROLE
The goal of the Audio Machine Learning Research team is to develop novel AI-powered audio processing algorithms. The twist is that our algorithms must run in real-time, on physical devices, for applications such as voice pickup, hearing augmentation and ones we haven't even thought of yet. As part of the team, you will work with experts in machine learning (ML), digital signal processing (DSP), software engineering and psychoacoustics to prototype and implement new algorithms. Bose has a strong history of combining creative thinking with cutting-edge technology in the audio domain. We are looking for candidates passionate about machine learning and audio to help us shape the next chapter in the future of Bose!
Responsibilities:
- Most of your time will be devoted to prototyping, implementing and evaluating ML algorithms, curating and developing internal resources, and presenting your findings.
- You will integrate your novel solutions into existing systems and platforms to showcase new (proof of concept) solutions.
- You will be able to contribute to projects that will be shipped to Bose customers, apply for patents, and/or submit papers to top-tier AI and signal processing conferences (e.g., NeurIPS, ICASSP, Interspeech).
Education:
- Pursuing or recently finished a graduate-level degree in ML, Computer Science, Music Technology or a related field.
Skills:
- Practical knowledge of applied audio ML (TensorFlow/PyTorch, TFLite/ONNX is a plus) and audio DSP (Python, Matlab and/or C/C++).
- Hands-on experience in at least one of the following research topics: audio source separation, speech enhancement, microphone array signal processing, Tiny ML, generative audio modelling.
- Familiarity with methods for spatial sound synthesis and/or room acoustics simulation/analysis is a plus.
- Strong communication skills. You will be presenting your work to a large interdisciplinary community.
At Bose, you're inspired to be and do your best and are rewarded for your unique talents! Our compensation is thoughtfully tailored to your skills, experience, education, and location, and goes beyond base salary. The hiring range for this position in the primary work location of Framingham, Massachusetts is: $40.00-$51.25 per hour. The hiring range for other Bose work locations may vary. In addition to competitive base pay we offer rewards including bonus programs, comprehensive health and welfare benefits, a 401(k) plan, plus exclusive perks designed to support your wellbeing, and a generous employee discount where you can immerse yourself in our products and experiences.
We are a proudly independent company, driven by purpose, guided by our values, and united by a belief in the power of sound. As the world leader in audio experiences, we’re creating what’s next, pushing boundaries and delivering transformative sound experiences for people everywhere. Join us and make your next career move a mic-drop. Let’s Make Waves.
Bose is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, genetic information, national origin, age, disability, veteran status, or any other legally protected characteristics.
The EEOC’s “Know Your Rights: Workplace discrimination is illegal” Poster is available here: https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf. Bose is committed to providing reasonable accommodations to individuals with disabilities. If you require reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please direct your inquiries to applicant_disability_accommodationrequest@bose.com. Please include "Application Accommodation Request" in the subject of the email. Our goal is to create an atmosphere where every candidate feels supported and empowered in the interviewing process. Diversity and inclusion are integral to our success, and we believe that providing reasonable accommodation is not only a legal obligation but also a fundamental aspect of our commitment to being an employer of choice. We recognize that individuals may have different needs and requirements based on their abilities, and we provide reasonable accommodations to ensure ideal conditions are met during the application process.
Role Description
Want to own the data infrastructure behind some of the most naturalistic voice models in production? You'll be joining a well-funded speech AI startup (just closed their Series A) with strong enterprise traction and revenue that more than doubled last quarter. They're building ultra-realistic voice technology that handles natural laughter, breathing, seamless language switching, and accurate pronunciation across languages and accents. Their models are powering hundreds of millions of conversations monthly. Before training a single model, they built their own corpus: full-duplex, studio-quality conversational speech annotated by PhD linguists. As their MLE, you'll own the pipelines that turn that raw material into clean, training-ready data.
- Own end-to-end data pipelines from raw audio ingestion through to versioned, training-ready datasets
- Build quality systems that catch annotation errors and alignment issues before they reach a training run
- Maintain the training infrastructure that keeps GPUs fed: dataloaders, streaming datasets, multi-modal batching
- Build and iterate on tooling across speech representations including neural codecs, semantic tokens and mel features
- Handle full- and half-duplex pipeline work including two-channel alignment and overlap handling
Qualifications
- Strong engineering fundamentals with experience building ML data pipelines at scale
- Hands-on experience with speech or audio data
- Solid understanding of speech representations and the tradeoffs between them
- Experience with multi-channel audio data including diarisation and alignment
Requirements
- Experience with multilingual data pipelines (Nice to have)
- Large-scale training infrastructure experience, such as FSDP, DeepSpeed, Ray (Nice to have)
- Annotation tooling and human-in-the-loop systems (Nice to have)
Benefits
- Remote-friendly
- Competitive base plus stock
Role Description
We’re looking for a Senior Machine Learning Engineer who can bring their depth and breadth of experience in applied data science and optimization at industry scale to help guide our Index-wide technology strategy for machine learning and optimization, and to drive pragmatic execution and iterative improvement of the same.
- Design and implement enterprise-scale MLOps systems and platforms (including data ingestion, feature pipelines, model training, validation, deployment and monitoring), setting standards for high-performance ML products.
- Productionize and support scalable and efficient machine learning models and solutions.
- Define and enforce standards for model lifecycle management, including versioning, monitoring, alerting, traceability and highly available low-latency inference systems.
- Refine and contribute to advanced data management strategies, optimizing for performance given extensive data loads.
- Mentor teams and provide guidance for machine learning engineering deployment practices, promoting a culture of excellence.
- Navigate complex problem-solving situations, contributing to decisions that affect the broader business's strategic direction.
- Utilize CI/CD best practices to ensure the enterprise keeps pace with innovative practices and remains efficient, through engagement with ML and MLOps evolution outside of the organization.
- Enable safe deployment strategies such as shadow deployments, canaries, and gradual rollouts.
Qualifications
- Advanced degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Expertise in high-performance backend technologies, preferably including Golang.
- Extensive experience with cloud and/or on-premises distributed systems, data-intensive applications, and large-scale backend system design.
- Deep knowledge of machine learning operationalization, including current trends, tools, and frameworks.
- Proven problem-solving prowess, capable of pioneering novel solutions to sophisticated technical challenges.
- Experienced in software development, deployment, and continuous improvement of complex CI/CD pipelines.
- A love of technology, or thirst for knowledge, or curiosity about the world, or a desire to solve the hardest problems.
Benefits
- Comprehensive health, dental, and vision plans for you and your dependents.
- Paid time off, health days, and personal obligation days plus flexible work schedules.
- Competitive retirement matching plans.
- Equity packages.
- Generous parental leave available to birthing, non-birthing, and adoptive parents.
- Annual well-being allowance plus fitness discounts and group wellness activities.
- Commuter benefits and discounts, where available.
- Employee assistance program.
- Mental health first aid program that provides an in-the-moment point of contact and reassurance.
- One day of volunteer time off per year and a donation-matching program.
- Bi-weekly town halls and regular community-led team events.
- Multiple resources and programming to support continuous learning.
- A workplace that supports a diverse, equitable, and inclusive environment.
Role Description
As a Senior Machine Learning Engineer in Computer Vision, you will design and deliver advanced vision systems that power mission-critical applications for global and Fortune 500 companies. You’ll work across deep learning, large-scale data pipelines, and high-performance infrastructure, owning models end-to-end from experimentation to production deployment. This role is designed for engineers who think systems-level, understand the real-world constraints of ML at scale, and can turn ambiguous visual problems into high-impact, production-ready solutions. You’ll shape architectures, guide model strategy, and bring modern vision capabilities into enterprise environments where reliability, speed, and accuracy matter.
Functional Responsibilities:
- Develop and fine-tune models for tasks like image classification, object detection, segmentation, and generative modeling using TensorFlow, PyTorch, or Keras.
- Implement techniques such as resizing, normalization, data augmentation, and feature extraction to improve model performance.
- Optimize and deploy computer vision models on cloud platforms (AWS, GCP, Azure), edge devices, and specialized hardware (GPUs, TPUs).
- Use CI/CD, model versioning, and monitoring tools to ensure reliable and scalable deployment of vision models.
- Improve model speed and performance using quantization, pruning, and hardware acceleration techniques.
Qualifications
- 5+ years of hands-on experience developing and deploying machine learning models in production environments.
- Proven experience writing production-level code, with strong proficiency in Python.
- Strong Python programming skills with proficiency in deep learning frameworks (TensorFlow, PyTorch, or Keras).
- Expertise in designing, training, and fine-tuning models for:
  - Image classification (ResNet, EfficientNet)
  - Object detection (Faster R-CNN, YOLO, SSD)
  - Image segmentation (U-Net, Mask R-CNN)
- Strong understanding of image preprocessing techniques (resizing, normalization, data augmentation).
- Experience with computer vision libraries such as OpenCV and torchvision.
- Experience with transfer learning and adapting pre-trained models.
- Ability to deploy models on cloud platforms (AWS, GCP, Azure) and specialized hardware (GPUs, TPUs).
- Familiarity with MLOps tools for automating ML pipelines.
Benefits
- Ownership through equity participation.
- Annual company retreat.
- Education bonus for continuous learning.
- Company-wide winter break.
- Paid time off.
- Optional in-person events and meetups.
- Tailored career roadmaps.
- High-performance culture.


