Human Experts Implementing Artificial Intelligence #AI #ArtificialIntelligence #HumanIntelligence
Senior Data Scientist
Location
Germany
Posted
3 days ago
Salary
0
Seniority
Senior
Job Description
Senior Data Scientist
Xenon Seven
- Architect Document Intelligence Solutions: Design and implement advanced Machine Learning and Deep Learning models to parse, extract, and interpret text and complex chemical structures from unstructured, scanned PDF documents.
- Develop LLM & Retrieval Systems: Build and optimize Large Language Model (LLM) applications, leveraging vector databases to enable semantic search, advanced data interpretation, and retrieval-augmented generation (RAG).
- End-to-End ML Pipelines: Own the entire machine learning lifecycle, including data preprocessing (specifically for chemical data and OCR outputs), model training, evaluation, deployment, and post-deployment monitoring.
- Bridge Chemistry & AI: Apply your chemistry domain knowledge to translate molecular structures, diagrams, and chemical data into machine-readable formats, embeddings, and actionable insights.
- Cloud Architecture & Deployment: Deploy scalable, secure, and production-ready AI/ML pipelines within the AWS ecosystem, ensuring high availability and performance.
- Cross-Functional Collaboration: Partner closely with software engineers, data engineers, and domain experts to integrate ML models into the core product architecture and align with business goals.
Job Requirements
- Professional Experience: 5+ years of proven experience working as a Data Scientist, with a track record of delivering production-grade machine learning models.
- Domain Expertise: A strong background in Chemistry, Cheminformatics, or a highly related scientific field, with a demonstrated ability to interpret and manipulate complex chemical structures and data types.
- Core Technical Stack: Advanced proficiency in Python and deep hands-on experience with the AWS cloud stack (e.g., SageMaker, Lambda, S3, EC2).
- Generative AI & Search: Practical, hands-on experience working with LLMs (fine-tuning, prompt engineering, or API integration) and vector databases (e.g., Pinecone, Milvus, Weaviate, or Qdrant).
- ML/DL Mastery: Robust experience in model development, validation, deployment, and evaluation framework tools (e.g., PyTorch, TensorFlow, Scikit-Learn).
- Document Processing (Plus): Prior experience with Computer Vision, Optical Character Recognition (OCR), or Document AI systems is highly desirable given the scanned PDF focus.
- Soft Skills: Strong analytical problem-solving skills, excellent communication, and the ability to thrive in a highly collaborative, cross-disciplinary project environment.
Benefits
- Ecosystem of Opportunity: You'll be part of a growing network where client engagements, thought leadership, research collaborations, and mentorship paths are interconnected. Whether you're building solutions or nurturing the next generation of talent, this is a place to scale your influence.
- Collaborative Environment: Our culture thrives on openness, continuous learning, and engineering excellence. You'll work alongside seasoned practitioners who value smart execution and shared growth.
- Flexible & Impact-Driven Work: Whether you're contributing from a client project, innovation sprint, or open-source initiative, we focus on outcomes—not hours. Autonomy, ownership, and curiosity are encouraged here.
- Talent-Led Innovation: We believe communities are strongest when built around real practitioners. Our Innovation Community isn’t just a knowledge-sharing forum—it’s a launchpad for members to lead new projects, co-develop tools, and shape the direction of AI itself.
More Data Scientist Jobs
- Bridge the gap between unstructured, real-world data and frontier AI models
- Structure clinical datasets and write reproducible code
- Establish, automate, and enforce data quality control (QC) and validation frameworks
- Participate in technical conversations with external partners
- Design, build, and maintain data dictionaries, schemas, and metadata models
Role Description
We are seeking a visionary and highly skilled Senior Data Scientist to lead the development of a cutting-edge document intelligence and discovery platform. This unique role sits at the intersection of advanced Generative AI, Machine Learning, and Cheminformatics. Your primary mission will be to solve a highly complex unstructured data challenge: transforming scanned PDF documents containing intricate chemical structures into highly searchable, interpretable, and actionable knowledge bases.
Data Scientist
New Flyer
New Flyer is North America's largest transit bus manufacturer and EV leader, and a subsidiary of NFI Group.
- Work with various ERP and IoT data sources to build interactive and dynamic Power BI dashboards for data storytelling and stakeholder decision support
- Design and develop machine learning models and predictive analytics solutions using data sourced from Snowflake
- Perform data exploration, statistical analysis, and feature engineering to support AI initiatives
- Collaborate with data engineering teams to ensure robust data pipelines
- Contribute to the development of AI-driven tools and frameworks within the organization
- Evaluate model performance and retrain models as needed to adapt to evolving data
- Participate in cross-functional projects involving manufacturing, quality, supply chain, and engineering
- Update training and documentation
- Deliver user training to operating groups as required
- Other duties as assigned
- Own marketing analytics across DTC and Amazon, including paid acquisition, retention, LTV, and channel-level performance.
- Build and maintain attribution models for DTC acquisition that account for multi-touch and organic effects, not just last-click.
- Analyze Amazon performance data including traffic, conversion, ad spend efficiency, and organic/paid contribution.
- Identify performance trends and surface actionable insights to growth leads on a regular cadence.
- Support the Growth Marketing and Ecommerce teams on the design and analysis of A/B and multivariate tests across paid channels, email, website, and Amazon storefronts.
- Set standards for test design, sample sizing, and statistical significance across the marketing team.
- Support the ecommerce team in updating Marketing on website and digital storefront performance.
- Track and report on conversion funnel metrics, site behavior, and the impact of UX or merchandising changes.
- Support the CMO and VP of Growth in the monthly revenue forecasting cycle: pulling and organizing upstream inputs, updating the growth model, and flagging anomalies or trends that warrant a closer look.
- Take ownership of running the incrementality testing tool, interpreting its results, and using them to inform channel investment decisions.



