Senior ML Engineer – Token Factory
Location
Germany
Posted
123 days ago
Seniority
Senior
Job Description
Nebius Group
• Enhance fine-tuning methodologies (both LoRA-based and full-parameter) for cutting-edge LLMs (e.g., GPT-OSS, Kimi K2, DeepSeek V3/R1, GLM-4.5), focusing on both model quality and training efficiency.
• Research and implement advanced inference optimization techniques, such as speculative decoding, quantization, and large-scale draft model training.
• Re-implement state-of-the-art open-source LLM architectures in JAX.
Job Requirements
- A profound understanding of the theoretical foundations of machine learning and reinforcement learning.
- Deep expertise in modern deep learning for language processing and generation.
- Substantial experience with training large models on multiple computational nodes.
- Reasonable understanding of the performance aspects of large neural network training (sharding strategies, custom kernels, hardware features, etc.).
- Strong software engineering skills (we mostly use Python).
- Deep experience with modern deep learning frameworks (we use JAX).
- Proficiency in contemporary software engineering approaches, including CI/CD, version control and unit testing.
- Strong communication and leadership abilities.
- Previous experience working with language models or other similar NLP technologies is nice to have.
- Familiarity with important ideas in the LLM space, such as MHA, RoPE, ZeRO/FSDP, Flash Attention, and quantization, is a plus.
- A track record of building and delivering products (not necessarily ML-related) in a dynamic startup-like environment is desired.
- Strong engineering skills, including experience developing large distributed systems or high-load web services, are beneficial.
- Open-source projects that showcase your engineering prowess would be an advantage.
- Excellent command of the English language, alongside superior writing, articulation, and communication skills, is required.
Benefits
- Competitive salary and comprehensive benefits package.
- Opportunities for professional growth within Nebius.
- Flexible working arrangements.
- A dynamic and collaborative work environment that values initiative and innovation.
More Machine Learning Engineer Jobs
Machine Learning Engineer (LLMs / AI)
ghSMART
We help CEOs, boards and investors develop winning executive teams and make high-stakes leadership decisions.
Who We Are
ghSMART is a premier leadership advisory firm trusted by CEOs, boards, and investors to solve their most critical leadership and talent decisions. For more than 30 years, we’ve partnered with many of the world’s most influential leaders and organizations to build winning leadership teams and amplify positive impact. Recognized for excellence, ghSMART consistently earns top rankings in industry surveys (e.g., Vault Consulting awards) and is featured in Forbes’ list of America’s Best Management Consulting Firms. Our culture is entrepreneurial and collaborative, with a strong focus on innovation and client success. Our team is made up of nearly 200 extraordinary individuals across the U.S., Europe, and APAC, who become trusted advisors to these leaders, helping amplify their positive impact on the world. We advise on the art and science of building winning leadership teams, doing meaningful work every day.
What You’ll Do
As a Machine Learning & Data Engineer, you will leverage ghSMART’s extensive, structured leadership dataset to build advanced AI solutions and agents and to create new RAG-based solutions. You’ll contribute to the evolution of our ghSMART Leadership Intelligence Platform, help drive research initiatives, and collaborate with academic partners. You will also support both data engineering and AI engineering efforts, ensuring systems, pipelines, and models work cohesively to power leadership insights at scale.
Responsibilities
- Design, build, and extend ML models (LLMs and traditional ML) that deliver high-accuracy insights from ghSMART’s structured leadership dataset; own end-to-end experimentation, evaluation, and deployment.
- Develop RAG-based agents and algorithms to unlock novel leadership insights from our research database.
- Integrate advanced solutions and AI agents into the Leadership Intelligence Platform and partner cross-functionally to align features with strategic objectives and user needs.
- Optimize data pipelines and workflows to ensure robust, efficient data ingestion, transformation, and model serving across engineering teams.
- Collaborate on research with academic partners and contribute to publications and thought leadership by validating findings with rigorous methods.
You Bring
- 5+ years of ML engineering experience building and shipping large-scale models and systems (training, tuning, inference, MLOps, monitoring).
- Hands-on expertise with RAG frameworks and LLMs, including designing retrieval strategies, prompt orchestration, evaluation, and deployment at scale. Experience building AI agents with the LangChain or LangGraph frameworks is a plus.
- Strong data engineering fundamentals across pipelines, data quality, and feature engineering to support reliable ML workflows. Experience with Databricks and Azure is a plus.
- Security and privacy mindset, with experience applying best practices to protect sensitive data in ML systems.
- Collaborative, remote-first working style with clear communication and ownership; familiarity with Salesforce (SFDC), Jira, Confluence, and Git.
Why join ghSMART?
Meaningful Impact Every Day: We believe leadership is the greatest force for good. At ghSMART, whether you're guiding the world’s top leaders or helping power the firm from within, you play a vital role in solving our clients’ greatest challenge: building and developing talented, diverse teams that fuel lasting success. Together, we help leaders amplify their positive impact on their organizations, their people, and the world.
Exceptional team, grounded in generosity: We have a team of extraordinary people united by excellence, humility, and a shared purpose. You’ll collaborate with brilliant colleagues who challenge and support you. Here, exceptional talent meets deep respect, where people show up with heart and everyone has a place.
Freedom to Shape a Career with Purpose: You have the power to shape a career that aligns with your purpose, doing meaningful work that drives impact for the world’s top leaders. You’ll help solve challenges that matter while being supported by brilliant colleagues and trusted with the flexibility you need to recharge, perform at your best, and grow for the long term.
Have your voice and talents recognized. We are a flat organization that values proactivity and ability over bureaucracy and tenure. All our decisions and actions are guided by our Values and Credo: to help leaders amplify their positive impact on the world. Learn why our consultants love working here. We are ranked #1 or #2 in 10 Consulting categories by Vault. See what others think about working at ghSMART on Glassdoor.
Compensation
Certain US jurisdictions require ghSMART to include a reasonable estimate of the salary range for this role. We are built on a culture of freedom and flexibility: we operate fully remotely, and our team members balance deeply energizing, high-intensity work with flexible schedules to support life outside of work. Our compensation model reflects these values. Compensation for this role in the United States includes base salary, annual discretionary performance bonus, 401(k) plan with an annual employer contribution, and a comprehensive benefits package. You should reasonably expect a base salary of $160,000 - $175,000. In addition, we offer an annual discretionary performance bonus.
Please be advised that all emails will originate from the @ghsmart.com domain; emails from any other domain are fraudulent and should be ignored and deleted.
ghSMART is an Equal Opportunity Employer committed to fostering an inclusive and diverse workplace across all of our global locations. We welcome applicants of all backgrounds and ensure equal employment opportunities without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status. Our global policies and practices are designed to support an environment of respect and equity for all.
Senior Machine Learning/Computer Vision Engineer
Parallel Systems
Parallel Systems is a startup company developing the future of intermodal transportation. Our mission is to decarbonize freight while improving supply chain logistics and safety. We are developing vehicles and software to create new autonomous and electric transportation systems for existing rail infrastructure, allowing railroads to convert part of the $700 billion U.S. trucking industry to rail.
Parallel Systems is pioneering autonomous battery-electric rail vehicles designed to transform freight transportation by shifting portions of the $900 billion U.S. trucking industry onto rail. Our innovative technology offers cleaner, safer, and more efficient logistics solutions. Join our dynamic team and help shape a smarter, greener future for global freight.
Senior Machine Learning/Computer Vision Engineer
Parallel Systems is seeking an experienced Machine Learning Engineer to help build the next generation of perception systems powering our fully autonomous, battery-electric rail vehicles. In this role, you’ll take ownership of designing and deploying cutting-edge deep learning models that enable our vehicles to perceive and reason about complex, real-world environments. From handling adverse weather and ambiguous signals to navigating multi-agent interactions on active railways, your work will directly shape the safety and reliability of our autonomous platform. You’ll collaborate closely with top-tier engineers across autonomy, robotics, and systems, tackling some of the most challenging problems in real-time machine learning and computer vision. If you're excited by the opportunity to push the boundaries of AI in safety-critical, real-world applications, we’d love to work with you. This can be a remote role for a senior engineer with experience in 0-to-1 builds of perception systems.
Responsibilities:
- Design, develop, and deploy advanced machine learning models for large-scale perception problems.
- Own the full ML lifecycle, from data mining and annotation to training, evaluation, and deployment of production-grade models.
- Build and optimize deep learning architectures for object detection, segmentation, tracking, pose estimation, and scene understanding.
- Develop scalable and efficient training pipelines that ensure robust, real-time inference performance.
- Work extensively with large image, video, lidar, and radar datasets to power next-generation computer vision systems.
- Conduct research and empirical studies to evaluate new architectures, techniques, and algorithmic improvements, incorporating or adapting state-of-the-art methods as appropriate.
- Build and contribute to infrastructure and tools that support the ML pipeline by automating data labeling, training workflows, evaluation processes, and model versioning.
- Collaborate cross-functionally with other engineering, research, and product teams to ensure seamless integration of ML systems into real-world applications.
What Success Looks Like:
After 30 Days: You have developed a deep understanding of the current perception architecture, sensor setup, and system requirements. You've identified key challenges in the ML pipelines and proposed initial areas for improvement across data workflows, model performance, and deployment constraints.
Requirements:
- Bachelor’s or higher degree in Computer Science, Machine Learning, or a related technical discipline.
- 4+ years of hands-on experience developing and deploying ML systems at scale.
- Strong background in computer vision and/or deep learning with practical experience in designing and training neural networks for real-world applications.
- Proficiency in Python and familiarity with standard ML libraries and tools (e.g., NumPy, SciPy, Pandas).
- Expertise in at least one deep learning framework such as PyTorch or TensorFlow.
- Strong mathematical foundation in linear algebra, geometry, probability, and optimization.
- Proven track record of working autonomously and driving complex technical projects in fast-paced environments.
- Excellent communication and collaboration skills, with experience working on interdisciplinary teams.
Preferred Qualifications:
- Experience with multi-modal perception (e.g., sensor fusion from cameras, lidar, radar).
- Experience optimizing models for deployment on edge devices with real-time constraints.
- Background in autonomous systems, robotics, or other safety-critical domains.
- Publications in top-tier ML or CV conferences (e.g., CVPR, ICCV, NeurIPS, ICML, ECCV).
- Experience with GPU/TPU programming and optimization tools (e.g., CUDA, TensorRT).
- Knowledge of low-level programming languages like C++ or Rust.
- Experience working directly with sensing hardware and understanding its constraints.
We are committed to providing fair and transparent compensation in accordance with applicable laws. Salary ranges are listed below and reflect the expected range for new hires in this role, based on factors such as skills, experience, qualifications, and location. Final compensation may vary and will be determined during the interview process. The target hiring range for this position is listed below.
Target Salary Range: $150,000 - $240,000 USD
Parallel Systems is an equal opportunity employer committed to diversity in the workplace. All qualified applicants will receive consideration for employment without regard to any discriminatory factor protected by applicable federal, state or local laws. We work to build an inclusive environment in which all people can come to do their best work.
Parallel Systems is committed to the full inclusion of all qualified individuals. As part of this commitment, Parallel Systems will ensure that persons with disabilities are provided reasonable accommodations. If reasonable accommodation is needed to participate in the job application or interview process, to perform essential job functions, and/or to receive other benefits and privileges of employment, please contact your recruiter.
• Build and operationalize the infrastructure that allows machine learning to run reliably in production
• Architect and implement Built’s foundational ML Ops platform from scratch
• Define and deploy reusable patterns for model training, deployment, monitoring, and retraining
• Build CI/CD pipelines for ML lifecycle automation, including versioning and experimentation tracking
• Stand up a feature store integrated with Snowflake and AWS to support structured and unstructured data
• Implement model registry and governance standards to ensure reproducibility, auditability, and rollback capability
• Integrate ML workloads into our event-driven architecture (Kafka, Kinesis)
• Develop observability frameworks to monitor drift, performance, latency, and model quality in production
• Automate ML infrastructure using Terraform and AWS-native tooling (SageMaker, Lambda, ECS, Batch, Step Functions)
• Establish security and compliance standards across ML assets, including data lineage and access control
• Mentor engineers on ML Ops patterns and deployment best practices
Product Owner – Machine Learning
Mitek Systems
The global leader in mobile capture and digital identity verification.
• Own and manage the backlog for ML-driven biometric and document verification capabilities.
• Translate fraud, identity, and customer requirements into clear and actionable ML work items.
• Partner closely with ML engineers and data scientists to refine problem statements into feasible deliverables.
• Define acceptance criteria that reflect real-world performance, not just offline model metrics.
• Participate actively in model design discussions, prioritization, and tradeoff analysis.
• Support model lifecycle activities including training, evaluation, deployment, and retraining.
• Ensure monitoring, drift detection, and feedback loops are incorporated into delivery plans.
• Partner with agent operations and data teams on labeling strategy and data quality.
• Incorporate fraud patterns and adversarial thinking into backlog prioritization.
• Work closely with engineering, fraud, compliance, legal, and customer teams.




