Senior ML Engineer, Token Factory
Location
Netherlands
Posted
111 days ago
Seniority
Senior
Job Description
Senior ML Engineer, Token Factory
Nebius Group
- Token Factory is part of Nebius Cloud, one of the world's largest GPU clouds, running tens of thousands of GPUs.
- We are building a high-performance inference and fine-tuning platform designed to push foundation models to their hardware limits.
- Our mission is to maximise throughput, minimise latency, and optimise cost-per-token across tens of thousands of GPUs.
- Inference optimisation: identify LLM inference bottlenecks to drive production speedups.
- Inference engine support: implement novel speculative decoding architectures, optimise components of various LLM designs (dense/MoE, autoregressive/parallel), and contribute to open-source inference engines.
- Low-precision training and inference: design and productionise low-precision (FP8, NVFP4/MXFP4) training and inference pipelines with measurable gains in throughput and cost-efficiency.
Job Requirements
- A profound understanding of the theoretical foundations of machine learning and the transformer architecture.
- Experience profiling GPU workloads using Nsight, PyTorch profiler, or similar tools
- Understanding of GPU memory hierarchy and compute/memory tradeoffs
- Familiarity with important ideas in the LLM space, such as MHA, RoPE, KV-cache, Flash Attention, and quantisation
- Understanding of the performance aspects of large neural network training (sharding strategies, custom kernels, hardware features, etc.)
- Strong software engineering skills (we mostly use Python)
- Deep experience with modern deep learning frameworks
- Proficiency in contemporary software engineering approaches, including CI/CD, version control and unit testing
- Strong communication and leadership abilities.
Benefits
- Competitive salary and comprehensive benefits package.
- Opportunities for professional growth within Nebius.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.




