Train AI on distributed data
Founding ML Engineer, Flower Frontier Model Team
Location
Germany
Posted
177 days ago
Salary
Not specified
Seniority
Senior
Job Description
Flower Labs
- Join as a founding member of the Flower Frontier Model Team
- Build category-defining models that blend existing practices with decentralized learning methods
- Help build a reliable, maintainable and scalable software stack
- Produce world-leading open-source models integrated into new Flower Labs products
- Design, implement and optimize core components across data curation, evals, pre-training and post-training
- Diagnose and resolve GPU/kernel issues, memory/storage bottlenecks, and multi-node failures at scale
- Collaborate on debugging training instabilities and related issues
- Devise surrounding infrastructure, tooling, monitoring, and observability
Job Requirements
- Exceptional software engineering skills (Python, deep learning frameworks, testing, profiling, refactoring, reproducibility)
- Expertise with modern ML training stacks: PyTorch, JAX or equivalent
- Experience implementing model architectures from scratch and working within libraries like DeepSpeed, Megatron or equivalent
- Ability to tune, debug, and profile large-scale training runs
- Hands-on experience working with large GPU clusters, including job orchestration, scheduling, multi-node runs, NCCL/RDMA issues, and GPU performance optimization
- Ability to collaborate effectively with both research-oriented and engineering-oriented colleagues
- Good engineering hygiene: modular design, code reviews, documentation, reproducibility, versioning of data/models/configurations
- Familiarity with common tools (Linux command line, git, Docker)
- Openness to adopting new tooling
- Solid understanding of distributed systems and networking
- Strong written English
- Open, honest and transparent communication skills
Benefits
- Opportunity to work on frontier AI models
- Potential for technical leadership
- Collaborative start-up environment