Job Closed

This listing is no longer active.

Red River

Technology Decisions Aren't Black and White. Think Red.

Senior Machine Learning Engineer, vLLM

Machine Learning Engineer | Full Time | Remote | Senior | Team 501-1,000 | Since 2003 | H1B Sponsor

Location

All locations: United States | Canada

Posted

57 days ago

Salary

0

Seniority

Senior


Job Description

Job Summary

At Red Hat we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise. The Red Hat AI Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As leading developers and maintainers of the vLLM project, and inventors of state-of-the-art techniques for model quantization and sparsification, our team provides a stable platform for enterprises to build, optimize, and scale LLM deployments.

As a Senior Machine Learning Engineer focused on model optimization algorithms, you will work closely with our product and research teams to develop SOTA deep learning software. You will collaborate with our technical and research teams to develop LLM training and deployment pipelines, implement model compression algorithms, and productize deep learning research. If you want to contribute to solving challenging technical problems at the forefront of deep learning in the open source way, this is the role for you. Join us in shaping the future of AI!

What you will do

- Contribute to the design, development, and testing of various inference optimization algorithms in vLLM and related projects, such as llm-d
- Create and manage inference serving deployment pipelines
- Benchmark, profile, and evaluate different parallelization, quantization, and sparsification approaches to determine the best performance for specific hardware and models
- Stay up to date with the latest advancements in open source LLM architectures, LLM inference parallelization and optimization techniques, and quantization research
- Stay up to date with the latest CPU and GPU hardware architectures and features to boost AI inference performance
- Give thoughtful and prompt code reviews
- Collaborate continuously with internal and external open source committers and contributors while contributing to vLLM and related projects

What you will bring

- Strong understanding of machine learning and deep learning fundamentals, with experience in one or more of LLM inference optimization, computer vision, NLP, and reinforcement learning
- Experience with tensor math libraries such as PyTorch and NumPy
- Strong programming skills with proven experience implementing Python-based machine learning solutions
- Ability to develop and implement research ideas and algorithms
- Experience with mathematical software, especially linear algebra
- Understanding of linear algebra, gradients, probability, and graph theory
- BS, MS, or PhD in computer science, computer engineering, or a related field

About Red Hat

Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.

Inclusion at Red Hat

Red Hat’s culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from different backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions that compose our global village.

Equal Opportunity Policy (EEO)

Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.

Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.

Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application-assistance@redhat.com. General inquiries, such as those regarding the status of a job application, will not receive a reply.

Related Job Pages

More Machine Learning Engineer Jobs


Principal Machine Learning Engineer

Sezzle

Financially empowering the next generation of consumers.

Full Time | Remote | Team 201-500 | Since 2016 | H1B Sponsor

• Oversee the design, development, and deployment of machine learning models.
• Drive the creation of scalable machine learning solutions for personalized recommendations, fraud detection, and credit risk assessment.
• Collaborate with engineers and data scientists to build large-scale solutions.

Turkey
$50K - $95K / year

Machine Learning Intern, PhD

Dropbox

Dropbox is the one place to keep life organized and keep work moving.

Internship | Remote | Team 1,001-5,000 | Since 2007 | H1B Sponsor

• Research and prototype innovative machine learning approaches in areas such as Search, Large Language Models (LLMs), Multimodal Content Understanding, and Recommender Systems
• Design and implement end-to-end ML pipelines, from data exploration to model training, evaluation, and deployment
• Analyze large-scale datasets to identify opportunities for personalization and improved user experiences
• Partner with Product, Design, and Engineering teams to integrate models into Dropbox products
• Contribute to the team’s technical discussions, offering research-based perspectives to guide experimentation and long-term strategy

United States
$11.5K - $12.5K / year
Job Closed

Senior AI/ML Engineer

Cloudwerx

SUMMIT Salesforce Consulting Partner - Advisory, Implementation, Integration, Automation, Data, AI & Managed Services

Full Time | Remote | Team 201-500 | Since 2018 | H1B No Sponsor

• Architect and implement sophisticated multi-agent systems and autonomous workflows leveraging the Google AI SDK, LangGraph, and LangChain to solve complex, non-linear business processes.
• Lead the design and construction of cloud-native solutions, using Terraform, Kubernetes, and Docker to ensure that AI models are deployed on scalable, reliable infrastructure.
• Apply rigorous statistical evaluation frameworks to model performance, moving beyond standard metrics to include uncertainty estimation, calibration, and robust hypothesis testing during model optimization (LoRA, QLoRA).
• Lead the development of custom predictive models and deep learning solutions, utilizing frameworks like PyTorch and Scikit-Learn to select suitable architectures (decision trees, neural nets, or ensembles) based on performance and client criteria.
• Design and implement state-of-the-art generative models for NLP and multimodal tasks, leveraging tools like OpenCV for image preprocessing and Stable Diffusion concepts where applicable.
• Champion MLOps best practices within the team, building validated data pipelines and CI/CD/CT workflows using Kubeflow and Vertex AI Pipelines to ensure model quality and integrity.
• Collaborate directly with clients to understand their unique needs, translating business challenges into technical solutions and providing expert guidance on dataset management best practices.
• Personally tackle the most difficult engineering challenges, identifying technical risks such as overfitting or latency issues, and optimizing hyperparameters to ensure precision and interpretability.

Canada

Senior Computer Vision, Machine Learning Engineer

Buzz Solutions

Artificial intelligence, actionable insights, and predictive analytics for infrastructure inspections.

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Architect and lead end-to-end computer vision projects focused on equipment defect detection, thermal anomaly identification, vegetation encroachment monitoring, and surveillance of closed areas for human and animal intrusions
• Drive innovation by incorporating the latest advances in deep learning and generative AI to enhance model training, accuracy, and reliability
• Develop production-grade Python libraries for the complete ML lifecycle
• Mentor team members and establish best practices for model development, evaluation, deployment, and monitoring
• Advocate for and uphold software quality standards within the ML team

United States