Train AI on distributed data

Founding ML Engineer, Flower Frontier Model Team

Machine Learning Engineer | Full Time | Remote | Senior | Team 11-50 | Since 2023 | H1B No Sponsor

Location

Germany

Posted

177 days ago

Seniority

Senior

Postgraduate Degree | Experience accepted | English

Distributed Systems, Docker, Linux, Node.js, Python, PyTorch

Job Description

• Join as a founding member of the Flower Frontier Model Team
• Build category-defining models that blend existing practices with decentralized learning methods
• Help build a reliable, maintainable and scalable software stack
• Produce world-leading open-sourced models integrated into new Flower Labs products
• Design, implement and optimize core components across data curation, evals, pre-training, and post-training
• Diagnose and resolve GPU/kernel issues, memory/storage bottlenecks, and multi-node failures at scale
• Collaborate on the debugging of training instabilities and related issues
• Devise surrounding infrastructure, tooling, monitoring, and observability

Job Requirements

Exceptional software engineering skills (Python, deep learning frameworks, testing, profiling, refactoring, reproducibility)
Expertise with modern ML training stacks: PyTorch, JAX or equivalent
Experience implementing model architectures from scratch and working within libraries like DeepSpeed, Megatron or equivalent
Ability to tune, debug, and profile large-scale training runs
Hands-on experience working with large GPU clusters, including job orchestration, scheduling, multi-node runs, NCCL/RDMA issues, and GPU performance optimization
Ability to collaborate effectively with both research-oriented and engineering-oriented colleagues
Good engineering hygiene: modular design, code reviews, documentation, reproducibility, versioning of data/models/configurations
Familiarity with common tools (Linux command line, git, Docker)
Openness to adopting new tooling
Solid understanding of distributed systems and networking
Strong written English
Open, honest and transparent communication skills

Benefits

Opportunity to work on frontier AI models
Potential for technical leadership
Collaborative start-up environment

More Machine Learning Engineer Jobs

Founding ML Engineer – Flower Frontier Model Team

Flower Labs

Train AI on distributed data

Machine Learning Engineer | 177 days ago

Full Time | Remote | Team 11-50 | Since 2023 | H1B No Sponsor

• Join as one of the founding members of the Flower Frontier Model Team, a new group at Flower Labs charged with building category-defining models.
• Build SOTA LLMs and foundation models within a small, high-impact team.
• Design, implement and optimize core components across the full spectrum of stages relevant to frontier model building: data curation, evals, pre-training, post-training.
• Collaborate on the debugging of training instabilities and related issues.
• Devise surrounding infrastructure, tooling, monitoring, and observability for large-scale LLM development.

Distributed Systems, Docker, Linux, Node.js, Python, PyTorch

United Kingdom

Senior Machine Learning Engineer

OneStudyTeam

Better. Sooner. Together.

Machine Learning Engineer | 177 days ago

Other | Remote | Team 201-500 | H1B No Sponsor

• Build and deploy AI-driven products that accelerate clinical trials and improve patient outcomes.
• Develop advanced ML models and LLM-powered agents for critical use cases like patient recruitment.
• Leverage modern cloud tools and MLOps best practices to build robust data pipelines.
• Collaborate across teams of data scientists, product managers, and engineers to integrate AI capabilities.
• Stay up-to-date with the latest developments in ML/AI and proactively bring new ideas to the team.

AWS, Docker, Pandas, Python, PyTorch, scikit-learn, TensorFlow

United States

Machine Learning Engineer, Personalization

Spotify

Passionate music fans. Innovative tech pros. Perfect harmony. Join our band.

Machine Learning Engineer | 178 days ago

Other | Remote | Team 5,001-10,000 | Since 2008 | H1B Sponsor

• Utilize in-house and 3rd party LLMs to solve language understanding problems
• Employ techniques such as fine-tuning and RAG to improve models
• Contribute to designing, building, evaluating, shipping, and refining Spotify’s product by hands-on ML development
• Help drive optimization, testing, and tooling to improve quality of our content enrichment assets
• Collaborate with cross-functional teams of MLEs, data and backend engineers, and other stakeholders including tech research, data science, and product to develop new features and technologies
• Be a participant in our AI Foundation’s ML community and work collaboratively and efficiently within our existing platforms and systems
• Perform data analysis to establish baselines and inform product decisions
• Stay up-to-date on the latest machine learning algorithms and techniques

Apache HTTP Server, AWS, GCP, Java, Python, PyTorch, Ray, Scala, Apache Spark, SQL, TensorFlow

New York

$138.3K - $197.5K / year

Senior Machine Learning Engineer

Qodea

Qodea (formerly Appsbroker CTS) is Europe's largest Google Premier only transformation partner.

Machine Learning Engineer | 178 days ago

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Lead the algorithm selection, design, and prototyping of machine learning models to solve complex business problems, including recommendation, personalization, and predictive analytics.
• Apply your expertise in statistical modeling and machine learning to perform deep data analysis, guide crucial feature selection, and identify opportunities for product improvement.
• Own the full ML lifecycle, from breaking down discrete steps of a pipeline (e.g., with a DAG) to analyzing model implementations and improving their robustness in the wild.
• Implement and manage robust model observability, tuning, and optimization processes to ensure sustained performance and accuracy post-deployment.
• Develop and maintain data pipelines to process and prepare data for model training and evaluation.
• Design and conduct A/B tests to evaluate model performance and its impact on key business metrics.
• Collaborate closely with product managers and engineers to define problems and deliver effective AI-driven solutions.
• Mentor other team members, champion best practices in machine learning engineering, and stay current with the latest advancements in the field.

GCP, NoSQL, Pandas, Python, PyTorch, scikit-learn, Apache Spark, SQL, TensorFlow

Portugal

Job Closed
