Data Scientist, Cancer Informatics and AI/ML
Location
United States
Posted
3 days ago
Salary
$75.0K - $126.3K / year
Seniority
Mid Level
Job Description
Northwell
Role Description
This position will support computational oncology and cancer informatics research initiatives focused on transforming complex clinical data into structured, actionable datasets for research, quality improvement, clinical trial identification, and care delivery optimization. The role will emphasize applied machine learning, natural language processing, and large language model-driven workflows using real-world clinical data, including electronic health record data, pathology reports, radiology reports, clinical notes, genomics, treatment data, and other institutional data sources. The Data Scientist will work semi-independently in close collaboration with clinical investigators, informatics teams, biostatisticians, and other data science stakeholders to design, build, evaluate, and refine computational pipelines. The ideal candidate will have practical prior experience developing data science workflows in Python and using modern machine learning or LLM-based tools in real projects.
Job Responsibility
- Develop, test, and maintain Python-based data pipelines for clinical research, quality improvement, and computational oncology projects.
- Support cancer informatics projects involving natural language processing, machine learning, large language models, and structured extraction from unstructured clinical data.
- Build workflows for processing clinical notes, pathology reports, radiology reports, treatment records, genomics reports, and other real-world healthcare data sources.
- Implement and evaluate LLM-assisted workflows, including prompt engineering, structured output generation, model benchmarking, validation pipelines, and error analysis.
- Assist with the development of retrieval-augmented generation workflows, vector search, embedding-based retrieval, and related approaches where appropriate.
- Work with clinical subject matter experts to translate oncology-focused research questions into executable data science tasks.
- Perform data cleaning, data wrangling, exploratory analysis, feature engineering, model development, and model performance evaluation.
- Generate reproducible analyses, reports, dashboards, tables, and visualizations to communicate findings to clinical and operational stakeholders.
- Maintain clear documentation of code, analytic decisions, model assumptions, validation methods, and project outputs.
- Participate in model validation efforts, including comparison of computational outputs against clinician-reviewed reference standards.
- Contribute to manuscript, abstract, grant, and presentation development through data analysis, figure generation, and methods documentation.
- Work independently on assigned analytic tasks while communicating progress, limitations, and blockers clearly to project leadership.
Qualifications
- Bachelor’s Degree in Computer Science, Informatics, Statistics, Engineering, Data Science, or related field, required. Master’s Degree, preferred.
- Minimum of two (2) years of post-graduate training or experience involving quantitative data analysis, required and working with clinical data, data science, and machine learning, preferred.
- Working familiarity with basic medical and health information technology concepts, including standardized terminologies and ontologies and electronic health records, as well as Data Warehousing and Business Intelligence tools, required.
- Expertise in working with SQL relational databases and statistical or general programming languages (e.g., Python, R), required.
- Deep understanding of statistical and predictive modeling concepts, machine-learning approaches, clustering and classification techniques, and recommendation and optimization algorithms.
Requirements
- Demonstrated prior experience building or implementing applied data science, machine learning, NLP, or LLM-based workflows. Completion of a short AI certificate, bootcamp, or introductory course alone is not sufficient for this role.
- Strong practical experience with Python for data science, including pandas, NumPy, scikit-learn, Jupyter notebooks, and reproducible analytic workflows.
- Prior experience applying machine learning, natural language processing, or large language models to real-world data problems.
- Experience using off-the-shelf LLMs through APIs or enterprise platforms, including structured prompting, output parsing, evaluation, and workflow integration.
- Experience with retrieval-augmented generation, vector databases, embeddings, semantic search, or document retrieval pipelines.
- Experience working with clinical, biomedical, or electronic health record data.
- Familiarity with oncology data, cancer registries, pathology reports, radiology reports, genomics reports, or clinical trial data.
- Experience working in secure data environments, enterprise data warehouses, Databricks, Spark, SQL databases, or cloud-based analytic platforms.
- Ability to write clean, maintainable, well-documented code and use version control such as Git.
- Demonstrated ability to work semi-independently, manage multiple analytic tasks, and communicate technical concepts to non-technical clinical collaborators.
- Prior experience contributing to academic research, abstracts, manuscripts, grant-funded projects, or healthcare quality improvement initiatives.
- Understanding of model evaluation concepts including accuracy, precision, recall, F1 score, calibration, error analysis, and external validation.
- Experience with prompt engineering alone is not sufficient; candidates should have substantive prior experience in data science, machine learning, computational research, or applied analytics.
Benefits
- The salary range and/or hourly rate listed is a good faith determination of potential base compensation that may be offered to a successful applicant for this position at the time of this job advertisement and may be modified in the future.
- When determining a team member's base salary and/or rate, several factors may be considered as applicable (e.g., location, specialty, service line, years of relevant experience, education, credentials, negotiated contracts, budget and internal equity).
More Data Scientist Jobs
• Time Series & ML Engineering: Support the team in building, improving, and retraining machine learning forecasting models
• Production Operations: Actively assist in monitoring, updating, and troubleshooting forecasting models and pipelines operating in production environments
• Data Pipelines & Cleansing: Help build and maintain robust data transformation pipelines using SQLMesh and BigQuery to pre-process large streams of data
• Simulation & Validation: Use our internal simulation framework to backtest forecast models and analyze how forecast errors directly impact our high-level EMS optimization yield
• Agentic AI & Workflow Automation: Assist in writing, structuring, and testing behaviors for autonomous AI agents to help automate workflows
• Documentation & Team Sync: Help maintain clean, clear technical documentation in Notion and collaborate with Optimization and Data Engineers during sprint cycles
Role Description
As a Data Scientist at Cint, you will play a pivotal role in developing next-generation AI solutions that power our product portfolio. Collaborating closely with Product and Engineering teams, you will bridge the gap between traditional research data and synthetic intelligence. You will focus on the research, validation, and delivery of models, including Large Language Models (LLMs), that augment high-quality human signals across the Cint Exchange. This role involves advanced data mining, robust data validation, and the development of sophisticated statistical and machine learning methodologies. The ideal candidate can independently research, develop, and maintain high-impact solutions that align Cint’s AI capabilities with market research trends, contributing to the technical roadmap for Cint’s proprietary synthetic data platform.
Responsibilities
- Contribute to the research, discovery, and development of machine learning models, specifically focused on synthetic row generation, open-ended text generation, and data augmentation.
- Execute statistical tests and experiments to validate LLM performance and synthetic modeling hypotheses.
- Develop logic for on-demand and dynamic boosting capabilities, collaborating with Engineering to integrate these models into Cint Exchange fielding workflows.
- Design and refine sophisticated profiling taxonomies, leveraging large-scale datasets to create syndicated audiences.
- Manage technical workflows and development cycles with guidance.
- Collaborate with Product and Engineering teams to support integration.
- Create clear, effective prototypes and deliverables that explain and defend complex Generative AI concepts to both technical and non-technical audiences.
Qualifications
- Minimum 2-4 years of experience in a Data Science capacity, with experience delivering end-to-end data science solutions.
- A Master's degree (or equivalent) in Statistics, Data Science, or a related quantitative field.
- Deep understanding of Generative AI and LLMs, particularly for applications in text generation and data synthesis.
- Advanced knowledge of statistical techniques: hypothesis testing, sampling theory, experimental design, and causal inference.
- Strong knowledge of a variety of ML techniques (e.g., clustering, regression, neural networks, etc.) and their real-world trade-offs.
- Expert proficiency in Python (DS/ML stack) and experience with frameworks used for LLM development and fine-tuning.
- Advanced SQL skills and experience working with large-scale databases.
- Ability to research and adopt new methods.
Essential Qualities
- Highly accountable self-starter and quick learner, consistently motivated to deliver high-quality, impactful results.
- Strong data-driven mindset with the ability to translate abstract business requests into actionable AI initiatives and solutions.
- Excellent written and verbal communication skills, with the ability to communicate technical findings clearly.
Nice to Have
- Direct experience with Synthetic Data Generation techniques and the evaluation of synthetic data quality/utility.
- Experience with Prompt Engineering, RAG (Retrieval-Augmented Generation), or fine-tuning open-source LLMs for open-end generation.
- Experience with probabilistic modeling, or advanced profiling techniques.
- Familiarity with online market research or survey exchange platforms.
- Experience using Databricks, Spark, or PySpark for large-scale workflows.
Additional Information
- #LI-Remote
Our Values
- Collaboration is our superpower.
- Innovation is in our blood.
- Our curiosity is insatiable.
- We do what we say.
- Excellence comes as standard.
- We are caring.
More About Cint
We’re proud to be recognised in Newsweek’s 2025 Global Top 100 Most Loved Workplaces®, reflecting our commitment to a culture of trust, respect, and employee growth.
In June 2021, Cint acquired Berlin-based GapFish, the world’s largest ISO certified online panel community in the DACH region, and in January 2022 completed the acquisition of US-based Lucid, a programmatic research technology platform that provides access to first-party survey data in over 110 countries. This growth has made Cint Group AB (publ), listed on Nasdaq Stockholm, a strong global platform with teams across its many global offices, including Stockholm, London, New York, New Orleans, Singapore, Tokyo and Sydney.
Senior Director, Data Science
UnitedHealth Group
UnitedHealth Group is a healthcare and well-being company that’s dedicated to improving the health outcomes of millions around the world. We are comprised of
Title: Senior Director, Data Science - NextGen Forecasting - Remote
Location: Minnetonka, MN, United States
Job Description:
Requisition number: 2363792
Job category: Business & Data Analytics
Primary location: Minnetonka, MN
Overtime status: Exempt
Travel: No
Optum Tech is a global leader in health care innovation. Our teams develop cutting-edge solutions that help people live healthier lives and help make the health system work better for everyone. From advanced data analytics and AI to cybersecurity, we use innovative approaches to solve some of health care's most complex challenges. Your contributions here have the potential to change lives. Ready to build the next breakthrough? Join us to start Caring. Connecting. Growing together.
The health solutions marketplace is hungry for new ideas, innovative products and software that drives elevated performance for the business and the customer. The UnitedHealth Group family of businesses is feeding incredible solutions to that marketplace every day by bringing out the best in our software engineering teams. We serve customers across the health system. Not only do we have more of them every day, we also have more technology, greater data resources and far broader expertise than any competitor anywhere. We're out to change the way our businesses and consumers engage with technology. If you're in, you'll be challenged like never before. It's time to join this history making.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
- Provide enterprise level leadership for advanced forecasting, predictive, and prescriptive analytics supporting strategic, operational, and financial decision making
- Lead the design, execution, and scaling of next generation forecasting capabilities leveraging complex, unstructured, and high volume datasets
- Apply advanced statistical modeling, machine learning, simulation, optimization, and mathematical techniques to deliver materially earlier insight into emerging trends and risks
- Translate complex and ambiguous business questions into analytically rigorous, scalable forecasting solutions delivering measurable enterprise value
- Provide strategic direction and accountability for forecasting architecture, modeling strategy, prioritization, validation, and deployment across the Next Gen Forecasting program
- Direct multiple layers of management and senior level data science professionals, ensuring strong technical rigor, delivery discipline, and talent development
- Establish forecasting standards, governance, and validation frameworks ensuring accuracy, interpretability, scalability, and sustained stakeholder trust
- Partner closely with Finance, Actuarial, Healthcare Economics, and Technology leaders to align forecasting roadmaps, manage cross functional dependencies, and embed insights into enterprise decision workflows
- Design and operationalize advanced time series forecasting solutions using classical statistical methods (ARIMA/SARIMA, ETS, state space models) as well as modern machine learning and deep learning approaches
- Lead development of forecasting frameworks that explicitly account for trend, seasonality, stationarity, autocorrelation, and temporal dependencies across large scale enterprise datasets
- Establish enterprise forecasting standards including backtesting methodologies, rolling/expanding window validation, probabilistic forecasting, prediction interval generation, and analytical risk management practices
- Drive advanced feature engineering strategies for time series forecasting, including lag features, rolling statistics, calendar effects, Fourier terms, and incorporation of exogenous variables such as events, holidays, and external business drivers
- Lead implementation of multi-step forecasting strategies including recursive, direct, and hybrid approaches, leveraging sequence modeling architectures such as LSTM and Transformer-based forecasting models where appropriate
- Ensure forecasting solutions appropriately address time-based validation, temporal data leakage prevention, model interpretability, scalability, and sustained stakeholder trust
You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
- Bachelor's degree in Data Science, Statistics, Mathematics, Computer Science, Engineering, or related field
- 15+ years of overall experience in data science, forecasting, predictive, and prescriptive analytics at enterprise scale
- Proven experience leading multi layer data science organizations or highly complex analytics programs
- Deep expertise in time series forecasting, statistics, machine learning, simulation, optimization, and advanced mathematical techniques
- Hands-on experience with forecasting methodologies including ARIMA/SARIMA, exponential smoothing (ETS), state space models, and machine learning based forecasting approaches
- Demonstrated expertise in handling non-stationary time series data, seasonality decomposition, trend modeling, temporal feature engineering, and forecasting validation methodologies
- Experience applying machine learning techniques (e.g., XGBoost, LightGBM) to forecasting problems including lag-based feature engineering, rolling window statistics, and proper handling of temporal dependencies
- Solid understanding of forecasting evaluation metrics and validation approaches including MAE, RMSE, MAPE, rolling window validation, and backtesting frameworks
- Demonstrated ability to drive analytically rigorous solutions that influence strategic, operational, and financial decision making
- Exposure to Gen AI skills (large language models, RAG)
Preferred Qualifications:
- Master's or PhD in Data Science, Statistics, Applied Mathematics, Operations Research, or related discipline
- Experience in healthcare, actuarial, or healthcare economics analytics environments
- Experience leading high visibility, enterprise scale AI or advanced forecasting programs
- Experience using advanced analytics platforms and tools including SQL, Python, R, Hadoop, and large scale data technologies
- Experience with advanced forecasting frameworks and libraries such as Prophet, statsmodels, darts, Nixtla, scikit-learn, TensorFlow, or PyTorch
- Solid background in forecasting governance, validation, analytical risk management, and probabilistic forecasting methodologies
- Familiarity with sequence-to-sequence forecasting architectures, Transformer models, and modern deep learning approaches for time series forecasting
*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy.
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $159,300 to $273,200 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
Senior Data Scientist Advisor
Edison International
Edison International has been a leader in electricity services since it was established in southern California in 1886. Today, through its subsidiaries, the com
Job ID: 6836
Job Family: System Planning & Engineering
Location: Pomona, CA, US
Pay: $182,800 – $274,100
Job Description:
Join the Clean Energy Revolution
Become a Senior Data Scientist Advisor at Southern California Edison (SCE) and build a better tomorrow. In this job, you'll be responsible for advancing AI-powered vegetation management by driving the development and refinement of AI models that interpret complex remote sensing data. In addition to remote sensing-focused initiatives, the role will also contribute specialized data science expertise to broader vegetation management projects. By solving highly technical challenges, this role directly enables a more efficient and proactive approach to managing vegetation risk and protecting critical infrastructure. As a Senior Data Scientist Advisor, your work will help power our planet, reduce carbon emissions and create cleaner air for everyone. Are you ready to take on the challenge to help us build the future?
Responsibilities
- Partners with stakeholders to understand business problems and objectives and translate them into data science solutions, providing thought leadership and creativity in solving unconventional problems.
- Leads and provides direction to other analytic teams across the organization, spearheads development of highly complex predictive and prescriptive models to drive data insights.
- Defines new data collection sources and methods and makes strategic recommendations for data practices (collection, preparation, quality).
- Establishes standards and identifies techniques to solve data quality and integrity problems.
- Leads development and maintenance of models, tools and methods for highly complex analytics problems.
- Engages business OUs to solve critical issues and identify optimization opportunities, performs sophisticated analytics and translates into actionable insights.
- Applies machine learning concepts and authors production-level code to provide innovative ideas and research to improve decision making.
- Identifies opportunities to apply data science and predictive analytics to help solve the most critical business challenges.
- Establishes scalable, automated processes for large-scale data analysis, modeling, validation and implementation.
- Leverages relationships with data science individuals at other utilities and vendors and participates and presents in relevant industry forums and groups to promote SCE's approach.
- A material job duty of all positions within the Company is ensuring the protection of all its physical, financial and cybersecurity assets, and properly accessing and managing private customer data, proprietary information, confidential medical records, and other types of highly sensitive information and data with the highest standards of conduct and integrity.
Minimum Qualifications
- Bachelor's Degree (or higher) in a quantitative discipline such as Computer Science, Electrical Engineering, Statistics, Mathematics, or Business Analytics.
- Seven or more years of experience in data analytics related work.
*PhD in Computer Science, Electrical Engineering, Statistics, Mathematics, or Business Analytics = 2 years of experience credit.
Preferred Qualifications
- Master's or Ph.D. degree in computer science, statistics, mathematics, engineering, or a closely related science and engineering field.
- Five or more years hands-on experience working in data science roles, including experience with machine learning algorithms, statistical modeling, and big data platforms.
- Hands-on experience with languages like Python, R, and SQL, as well as knowledge of cloud computing environments (e.g., AWS, Azure, Google Cloud), and familiarity with data visualization tools like Tableau or Power BI.
- Experience evaluating and deploying artificial intelligence solutions into business processes.
- Experience with LiDAR data, processing and modeling.
- Strong executive communication skills, including building concise narratives and presenting insights to senior leadership; experience communicating to both technical and non-technical audiences.
Additional Information
- This position's work mode is hybrid. The employee will report to an SCE facility for a set number of days with the option to work remotely on the remaining days. Unless otherwise noted, employees are required to work and reside in the state of California. Further details of this work mode will be discussed at the interview stage. The work mode can be changed based on business needs.
- Visit our Candidate Resource page to get meaningful information related to benefits, perks, resources, testing information, hiring process, and more!
- Qualified applications with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.
- The primary work location for this position is Pomona, CA. However, the successful candidate may also be asked to work for an extended amount of time at (alternate work location).
- Position will require up to 10% local traveling and being out in the field throughout the SCE service territory.
- This position has been identified as a NERC/CIP impacted position.
- Prior to being hired, the successful candidate must pass a Personnel Risk Assessment (PRA) or Background Investigation. Once hired, the candidate must complete specified training prior to gaining unescorted access to assigned work location and performing necessary job duties.
- Relocation may apply to this position.


