We are a Y-Combinator-backed startup building your AI-powered Recruiter Agent
Data Science Experts
Location: United States
Posted: 14 hours ago
Salary: $70 - $100 / hour
Seniority: Mid Level
Job Description
Weekday (YC W21)
Role Description

Join a leading AI lab at the forefront of generative AI innovation and help shape the next generation of Large Language Models. We are seeking experienced Data Science professionals with strong expertise in statistical analysis, machine learning, predictive modeling, and quantitative reasoning to contribute to the development of advanced AI systems.

In this role, you will apply your analytical expertise to create, evaluate, and refine high-quality training data that improves how AI models reason about data, statistics, and real-world analytical challenges. You will collaborate with researchers and engineers to ensure AI systems demonstrate rigorous quantitative thinking and sound data science practices.

This is a full-time engagement requiring 40 hours per week during standard weekdays.

Key Responsibilities

- Provide Data Science Expertise
  - Advise research and engineering teams on statistical methodologies, experimental design, predictive modeling, and analytical best practices.
  - Help improve AI model performance across data science, analytics, and machine learning domains.
- Design Analytical Challenges
  - Create complex, real-world data science tasks that test quantitative reasoning, statistical thinking, and machine learning knowledge.
  - Develop accurate, well-structured solutions that reflect industry best practices.
- Evaluate AI-Generated Outputs
  - Review and assess analytical solutions generated by AI systems and subject matter experts.
  - Identify errors, inconsistencies, flawed assumptions, and opportunities for improvement.
  - Provide clear, actionable feedback to enhance model quality and reasoning capabilities.
- Develop Evaluation Frameworks
  - Build scoring rubrics and evaluation methodologies for:
    - Statistical reasoning
    - Predictive modeling
    - Machine learning workflows
    - Data interpretation
    - Experimental design
    - Business analytics and decision-making
  - Ensure consistent quality standards across datasets and evaluation processes.
- Collaborate Across Teams
  - Work closely with AI researchers, engineers, and domain experts to maintain accuracy, consistency, and rigor in training data development.

Qualifications

- 3+ years of professional or research experience in:
  - Data Science
  - Statistical Analysis
  - Machine Learning
  - Predictive Analytics
  - Applied Mathematics
  - Quantitative Research
  - Business Analytics
- Strong understanding of:
  - Statistical inference
  - Hypothesis testing
  - Regression analysis
  - Machine learning algorithms
  - Data visualization
  - Experimental design
- Experience working with structured and unstructured datasets.
- Ability to commit to 40 hours per week during standard business days.
- Excellent written communication skills with the ability to clearly explain analytical decisions and modeling approaches.

Preferred Qualifications

- Experience with AI systems, Large Language Models (LLMs), or agent-based workflows.
- Familiarity with model evaluation, reinforcement learning from human feedback (RLHF), data annotation, or human-in-the-loop systems.
- Proficiency in Python, SQL, R, or other analytical programming languages.
- Experience building and deploying machine learning solutions in production environments.
- Advanced degree in Data Science, Statistics, Computer Science, Mathematics, Economics, Operations Research, or a related quantitative field.

Benefits

- Contribute directly to the development of state-of-the-art AI systems.
- Work alongside leading researchers and engineers in artificial intelligence.
- Help define how future AI models reason about data, statistics, and analytical decision-making.
- Apply your expertise to challenging, high-impact projects at the cutting edge of AI innovation.

Engagement Details

- Full-time commitment (40 hours per week).
- Weekday availability required.
- Opportunity to work on long-term AI research and evaluation initiatives.
- Project scope and responsibilities may evolve based on research priorities and business needs.
More Data Engineer Jobs
• Own 1–4 concurrent migration projects end-to-end: scoping, planning, execution, and customer handoff
• Be the primary customer contact: run weekly check-ins, manage stakeholder expectations, and escalate risks early before they compound
• Configure Datafold's Migration Agent and oversee the migration execution
• Partner with Datafold's engineering team to execute migrations
• Help refine and scale our product and delivery playbook as the team grows
• Serve as the strategic architect and technical anchor for Masdar’s global digital ecosystem.
• Spearhead the Enterprise Digital function, designing a flexible, best-of-breed composable architecture that seamlessly bridges modern corporate systems with agile, field-level renewable asset operations.
• Own and steer the global Data Management & Data Governance strategy, defining the universal data taxonomies that transform cross-border operational insight into a competitive advantage.
• Build up high-performing domain architects to safeguard and scale our Digital Architecture and infrastructure, as well as a robust team of competent data management specialists.
• Empower Masdar to continue scaling and expanding securely, rapidly, and intelligently, powering the world's clean energy future.
Senior Data Engineer - Marketing Technologies
H&R Block
With expert guidance, upfront pricing, and more ways to file, it’s #BetterWithBlock.
• Design, develop and test enterprise MarTech platforms for data engineering using SQL, Azure Data engineering skills including Azure Data Factory, Databricks/Fabric technologies
• Proficiency in Azure-based cloud technologies to support data needs, along with working in marketing projects (Adobe Experience Platform and/or Salesforce Marketing Cloud platform)
• Leverage cutting-edge data technologies, programming languages, and industry-standard coding practices to innovate new features and optimize existing product/marketing functionalities
• Design, develop, and maintain high-quality software components
• Create and execute unit tests, troubleshoot issues, and resolve defects efficiently
• Collaborate with Product, architects and cross-functional teams to align on requirements and implementation strategies
• Translate business and functional requirements into clear technical specifications and product deliverables
• Participate in technical design discussions and conduct code reviews to ensure quality and consistency
• Document system architecture, design approaches, and development processes for future reference
• Develop and maintain unit test plans and alpha test plans to support product validation
• Stay current with emerging technologies, tools, and methodologies to continuously improve design, development, and deployment practices
Role Description

We are seeking an AI Data Engineer to build and operate the large-scale data systems that power modern AI training and evaluation pipelines. The role combines deep data engineering expertise with a strong understanding of AI workloads, focusing on ingestion, transformation, quality assurance, lineage, and high-throughput delivery of data to training jobs across diverse modalities. The ideal candidate has experience operating petabyte-scale data systems, strong software engineering fundamentals, and a clear understanding of how data infrastructure choices propagate into model quality and training efficiency.

Key Responsibilities

- Design and operate large-scale data pipelines supporting AI training, evaluation, and continual improvement workflows.
- Build ingestion systems for diverse modalities including text, image, audio, video, and structured signals.
- Implement data cleaning, deduplication, filtering, and quality assurance at petabyte scale.
- Develop dataset versioning, lineage, and provenance tracking systems suitable for reproducible training.
- Build high-throughput data loading systems that maximize GPU utilization during training.
- Implement labeling workflows, active learning pipelines, and human-in-the-loop data improvement systems.
- Design storage architectures balancing cost, throughput, and latency across data tiers.
- Build evaluation dataset construction pipelines with strict integrity and contamination controls.
- Implement data privacy, redaction, and consent enforcement throughout the pipeline.
- Collaborate with ML researchers and engineers to align data systems with model development needs.
- Drive observability of data quality, drift, and pipeline health across the AI data estate.
- Optimize cost and performance through compression, format selection, and caching strategies.
- Document data systems, schemas, and operational procedures for broad internal use.
- Stay current with AI data infrastructure research and emerging open-source tools.

Qualifications

- Bachelor’s or Master’s degree in Computer Science or a related field.
- Six or more years of data engineering experience, with significant work supporting ML or AI workloads.
- Strong proficiency in Python and at least one JVM or systems language.
- Deep experience with modern data processing frameworks such as Spark, Ray, or Beam.
- Hands-on experience operating petabyte-scale storage and pipeline systems.
- Strong understanding of distributed systems, data modeling, and storage formats.
- Experience with dataset versioning, lineage, and reproducibility for ML workflows.
- Familiarity with high-throughput data loading for accelerator-based training.
- Strong software engineering practices including testing, CI/CD, and code review.
- Excellent communication and cross-functional collaboration skills.

Preferred Qualifications

- Experience with multimodal datasets at large scale.
- Familiarity with data quality tooling and dataset evaluation methodology.
- Exposure to privacy-preserving data systems and regulated data handling.
- Open-source contributions to data infrastructure projects.
- Experience supporting frontier model training pipelines.

How to Apply

Would you like to know more about this opportunity? For immediate consideration, please send your resume to [email protected]. Learn more about Bright Vision Technologies at www.bvteck.com.



