Toptal

The World's Top Talent, On Demand®

Lead Data Scientist

Data Scientist | Full Time | Remote | Lead | Team 1,001-5,000 | Since 2010 | H1B No Sponsor

Location

Worldwide

Posted

3 days ago

Salary

0

Seniority

Lead

Job Description

Role Description

We are looking for a Senior Data Scientist to join us as the first Data Scientist on a new product we are building. This is a founding role: you will shape the data science function from the ground up, set technical direction, and own the end-to-end delivery of intelligent systems that define how our product creates value.

You will tackle open-ended problems involving:

- Task Mining
- Process Mining
- Behavioral workflow analysis
- Pattern discovery
- Predictive modeling
- Applied GenAI/ML systems

The goal is not just to build models, but to turn raw interaction data into measurable product and business impact: discovered workflows, bottlenecks, optimization opportunities, and scalable foundations for future DS/ML work.

This is a remote position. We do not offer visa sponsorship or assistance. Resumes and communication must be submitted in English.

Qualifications

- 5+ years of professional experience in Data Science, Machine Learning, or Applied ML roles.
- Demonstrated experience operating as the sole or lead Data Scientist on a product or team, owning problems end-to-end without senior DS supervision.
- Strong experience with supervised and unsupervised ML, modern ML/data tooling, and the judgment to select the right approach for the problem.
- Practical familiarity with representation learning, sequence modeling, Transformers, LLMs, or GenAI systems where relevant to product use cases.
- Experience handling large-scale structured, unstructured, event, or interaction datasets.
- Advanced proficiency in Python and SQL, with hands-on experience using tools such as PyTorch, scikit-learn, pandas/Polars, experiment tracking, and production ML workflows.
- Experience deploying ML models, data pipelines, or intelligent systems into production.
- Familiarity with Task Mining, Process Mining, event-log analysis, behavioral analytics, workflow automation, or adjacent domains.
- Advanced degree in Computer Science, Data Science, AI, Statistics, Mathematics, or a related field is a plus; equivalent practical experience is strongly valued.

Requirements

- A founder’s mindset: full responsibility for outcomes, not just deliverables.
- Comfort operating in high ambiguity: able to turn unclear product goals, noisy data, and incomplete requirements into an executable roadmap.
- Strong business sense: connects technical work to commercial impact and measurable product value.
- Pragmatic technical judgment: knows when to use advanced ML, when to simplify, and when better data, labeling, or evaluation is the real bottleneck.
- Ability to build foundations for rapid scaling: reusable datasets, pipelines, metrics, evaluation frameworks, and modeling patterns future DS/ML hires can build on.
- Highly proactive problem solver who acts without waiting for detailed instructions.
- Excellent communication skills, with the confidence to push back constructively and propose direction.

Nice to Have

- Previous experience as a first or early Data Scientist at a startup or new product line.
- Direct experience with Task Mining, Process Mining, workflow intelligence, RPA, or productivity analytics.
- Experience with LLMs and Generative AI applications, especially evaluation, structured outputs, semantic labeling, summarization, or human-in-the-loop workflows.
- Experience working with privacy-sensitive behavioral, productivity, or user-interaction data.
- Experience with product experimentation, causal inference, or measuring the impact of workflow/process interventions.
- Knowledge of MLOps and distributed processing frameworks, such as Spark.
- Experience with cloud environments, especially GCP.

More Data Scientist Jobs

Data Scientist – LLM, Trading

Binance

The World’s Leading Blockchain Ecosystem and Digital Asset Exchange

Data Scientist | 3 days ago
Internship | Remote | Team 1,001-5,000 | Since 2017 | H1B No Sponsor

• Leveraging knowledge of financial derivatives, participate in the design and development of a Web3 AI-powered financial token risk control system integrated into a real-world system.
• Assist in building and maintaining benchmark datasets to evaluate the performance of the AI system, and assist in prototyping and iterating the workflow of the financial AI agent system.
• Work closely with senior engineers to complete system design, integration, and deployment.

Hong Kong

Data Scientist - Aviation Intelligence

Jetnet

At JETNET, you’ll be part of an innovative company that stands at the forefront of aviation data solutions with a sterling reputation in the industry. Ready to take flight with us? Apply today and become a part of the JETNET Team!

Data Scientist | 3 days ago

Role Description

JETNET is seeking an Aviation Intelligence Data Scientist to help shape the future of aviation data intelligence. This is not a traditional analytics role. It is a strategic, hands-on opportunity for a data scientist who can architect intelligence frameworks, build AI-enabled workflows, and transform vast historical and live aviation data into scalable, production-ready capabilities.

In this role, you will sit at the center of JETNET’s evolution into a modern Data Operations & Aviation Intelligence function. You will partner closely with leadership, engineering, and frontline research teams to develop intelligence models and systems that improve data quality, accelerate researcher workflows, strengthen product innovation, and expand JETNET’s competitive advantage in the aviation intelligence space.

- Lead the design of JETNET’s post-ingestion intelligence layer, converting raw aviation data into structured, workflow-ready insights
- Build and refine intelligence frameworks such as confidence scoring, inference models, entity resolution, ownership continuity, and relationship strength modeling
- Develop AI-assisted workflows and LLM-enabled tools that surface ranked suggestions, next-best actions, and confidence-weighted recommendations for researchers
- Identify and prioritize high-value data gaps and enrichment opportunities across aircraft, companies, contacts, and related aviation entities
- Design graph-based models and linking strategies to strengthen relationship discovery and reduce duplication across datasets
- Apply process mining and operational analytics to improve the ingest, verification, and publishing lifecycle, reducing bottlenecks and cycle time
- Partner with aviation researchers to build practical, scalable solutions that improve consistency, speed, and data confidence in daily workflows
- Lead end-to-end data science initiatives from problem framing through experimentation, validation, deployment, and performance monitoring
- Collaborate closely with the CTO and Engineering Team to productionize models and intelligence capabilities within modern data environments
- Translate complex findings into clear, executive-ready reporting, dashboards, and recommendations that support strategic decision-making

Qualifications

- Proven success independently leading end-to-end data science initiatives from concept to production deployment
- Strong expertise in AI/ML, including building, evaluating, and optimizing production-grade models and LLM-assisted systems
- Experience designing intelligent workflows that blend automation, analyst judgment, and measurable performance improvement
- Hands-on experience with graph-based linking, probabilistic modeling, entity resolution, and relationship discovery techniques
- Strong operational mindset, with experience using analytics to identify process optimization opportunities and improve workflow outcomes
- Experience working with both structured and unstructured data, including OCR and extraction workflows
- Advanced proficiency in Python and SQL, with experience in modern cloud data environments and model deployment practices
- Strong business judgment and the ability to prioritize work that delivers meaningful product, operational, and customer impact
- Excellent communication skills, including the ability to explain complex technical concepts to non-technical audiences and executives
- Curiosity, ownership, and a builder mentality, with enthusiasm for solving difficult data problems in a specialized industry

Requirements

- Location: Open to applicants based in the USA with current legal authorization to work.
- Compensation Range: $100,000 - $135,000/year

Benefits

- Remote Work Flexibility: Enjoy a balanced work-life arrangement with remote flexibility, empowering you to deliver your best work from home.
- Comprehensive Paid Time Off: We understand the value of rest and recharge, so we offer competitive PTO to support a healthy work-life balance.
- Comprehensive Benefits Coverage: With health, dental, and vision benefits, we prioritize your well-being so you can focus on making an impact.

United States
$100K - $135K / year

• Design, develop, train, and deploy machine learning and AI models that process and analyze field equipment sensor data (time-series IoT, embedded device telemetry) alongside structured and unstructured datasets.
• Build and refine predictive, prescriptive, and anomaly detection models using techniques such as regression, time-series forecasting, classification, clustering, and deep learning to support real-time or near-real-time decision-making.
• Perform exploratory data analysis (EDA), data preprocessing, feature engineering/signal processing, and feature extraction on high-volume, noisy sensor data and multimodal datasets to surface patterns, correlations, and actionable insights.
• Contribute to end-to-end AI workflows, including automated data ingestion, model training pipelines, inference at the edge or in the cloud, and continuous monitoring for model drift and performance degradation.
• Apply statistical modeling, hypothesis testing, and experimentation methods (A/B testing, causal inference where applicable) to validate model performance and ensure robustness in dynamic operational environments.
• Support the development and maintenance of reproducible, scalable ML pipelines using MLOps best practices, including model versioning, retraining, deployment (including edge/embedded constraints), and lifecycle management.
• Collaborate with engineering, product, and domain experts to translate business problems (e.g., predictive maintenance, fault detection, process optimization) into well-defined data science solutions.
• Perform data cleansing, validation, and collation activities to ensure models are accurate, reliable, and aligned with real-world operating conditions.
• Solve complex technical challenges related to analytical toolsets that support engineering and operational decision-making.
• Communicate technical findings, model performance metrics, and business value to internal stakeholders through clear visualizations, written reports, and presentations.
• Explore and evaluate emerging techniques (e.g., generative AI for synthetic sensor data, edge AI optimization, multimodal data fusion) and recommend incorporation into production workflows where appropriate.
• Assist in formulating and managing data-driven project requirements aligned with business needs and strategic company goals.
• Provide subject matter input on analytical tools and methods to cross-functional product development teams.
• Work with software and business development teams to support revenue opportunities tied to data science initiatives and product/service enhancements.
• Support internal resources involved in research, product development, and ongoing production of data analytics deliverables.

All locations: California | Illinois | New York
$98.8K - $154.5K / year

Principal Data Scientist

Health Care Service Corporation

Empowering Whole Person Health With Compassion and Innovation

Data Scientist | 4 days ago
Full Time | Remote | Team 10,001+ | H1B Sponsor

• Provide leadership and strategic direction in solving business problems using advanced mathematical and statistical methods.
• Build advanced time-series models to forecast trends in medical spend using hierarchical Bayesian models and temporal fusion transformers.
• Ensure scientific rigor and reproducibility of model evaluation.
• Optimize models for accuracy and interpretability.
• Communicate findings to stakeholders from technical ICs to top leadership.

Texas
$121.2K - $225.2K / year