Socure

The leading provider of digital identity verification and fraud solutions. Salesinfo@socure.com

Data Scientist II – Big Data R&D, Identity Graph, KYC

Data Scientist, Full Time, Remote, Mid Level, Team 501-1,000, Since 2012, H1B Sponsor

Location

California

Posted

100 days ago

Salary

$140K - $170K / year

Seniority

Mid Level

Postgraduate Degree, 2 yrs exp, English

Airflow AWS DynamoDB Elasticsearch ETL Neo4j PySpark Python PyTorch Scala Spark SQL TensorFlow Unix

Job Description

• Contribute to the design and implementation of machine learning, data mining, statistical, and graph-based algorithms to analyze very large datasets for identity verification and anomaly detection.
• Analyze large datasets to help develop and refine entity-resolution and identity-matching algorithms that drive Socure’s KYC and compliance solutions.
• Build and maintain components of data-processing pipelines (ETL, feature generation, normalization) using tools such as Spark/PySpark and AWS (e.g., EMR, S3).
• Support senior data scientists with feature engineering, data exploration, error analysis, and A/B test setup for new models and signals.
• Help evaluate new third‑party and internal data sources: profile data quality, design offline experiments, and summarize impact on coverage and model performance.
• Implement and maintain SQL and Python/R code for data extraction, transformation, and validation; contribute to code reviews and basic testing.
• Provide analytical support to compliance and regulatory product teams, including ad hoc investigations, simple dashboards, and data deep dives.
• Communicate findings in a clear, structured way to peers and cross‑functional partners (Product, Engineering, Client Analysis), focusing on key insights and trade‑offs.
• Work effectively in a fast‑paced, cross‑functional environment; demonstrate ownership of well-scoped tasks and follow through to completion.

Job Requirements

Master’s degree with 2+ years of experience, or Ph.D. with 1+ years of experience in a data science or analytics role, or equivalent practical experience.
Proficiency in at least one general-purpose programming language used in data science (Python or Scala).
Solid experience writing and optimizing SQL for large datasets; comfort working in data lake / warehouse environments.
Hands‑on experience with Spark or PySpark and common ML libraries (e.g., scikit‑learn, XGBoost, TensorFlow/PyTorch a plus).
Familiarity with UNIX environments and the AWS ecosystem (e.g., EMR, S3); Databricks experience is a plus.
Working knowledge of supervised/unsupervised ML and basic statistics (similarity measures, clustering, evaluation metrics).
Exposure to graph techniques or graph databases (Neo4j, AWS Neptune, GraphFrames) is a strong plus.
Bonus: experience with Elasticsearch or DynamoDB; workflow tools such as Airflow for automating data pipelines.
Ability to break down loosely defined problems, ask good clarifying questions, and iterate quickly with feedback.

Benefits

Offers Equity
Offers Bonus

Related Categories

Data Scientist

Related Job Pages

Data Scientist Jobs in California
Remote Full-time Jobs (US)
Remote Python Jobs (US)
More Remote Jobs

More Data Scientist Jobs

Data Science / Decision Science Intern

Leidos

Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.

Data Scientist, 100 days ago

Full Time, Remote, Team 10,001+, Since 1969, H1B Sponsor

General Program Information / Position Overview:

The Health Sector at Leidos is seeking a Data Science / Decision Science Intern (Health Data Intelligence) to support advanced analytics initiatives in a remote, U.S.-based environment. This role targets high-performing graduate-level candidates with demonstrated experience applying data science to real-world, decision-driven problems. You will contribute to the development of scalable data products, reporting ecosystems, and decision-support frameworks built on standardized data sources. The role emphasizes applying advanced analytics, visualization, and emerging AI/ML techniques to generate a decision advantage across clinical, operational, and strategic domains. This internship is structured as a full-time summer experience (12–16 weeks) with the potential to extend into a part-time role during the academic year.

Primary Responsibilities

- Perform data extraction, transformation, and preparation using Python, SQL, and standardized data sources
- Conduct exploratory data analysis (EDA) to identify trends, patterns, and data quality issues
- Apply statistical analysis and machine learning techniques (e.g., regression, classification, clustering) to generate insights and support decision-making
- Develop and operationalize scalable reporting frameworks and reusable data products aligned to standardized data models
- Design and deliver advanced dashboards and visualization solutions using Tableau, Power BI, Looker, or similar platforms
- Translate analytical outputs into decision-ready insights, including structured recommendations and trade-off analysis
- Collaborate with stakeholders to define requirements and deliver production-oriented analytical solutions
- Identify opportunities to improve data quality, standardization, and analytical efficiency
- Communicate findings through executive-ready visualizations, storytelling, and concise written deliverables

Basic Qualifications

- Bachelor’s degree in a quantitative discipline (e.g., Data Science, Statistics, Mathematics, Computer Science, Engineering, or related field)
- Currently pursuing a graduate degree (Master’s or Ph.D. candidate) in a relevant quantitative or analytical field (required)
- Demonstrated, hands-on experience with Python and SQL for data analysis and manipulation
- Experience working with healthcare data (clinical, claims, public health, or operational datasets)
- Experience working in Jupyter notebooks or equivalent analytical environments
- Proven experience developing data visualizations and dashboards using Tableau, Power BI, Looker, or similar tools
- Demonstrated end-to-end project experience, including data ingestion, analysis, modeling, and presentation of results
- Strong foundation in statistics, data analysis, and structured problem-solving
- Experience applying machine learning techniques (e.g., regression, classification, clustering, or similar methods) in academic, research, or project settings
- Familiarity with AI-assisted analytical workflows (e.g., use of generative AI or automation to enhance coding, analysis, or insight generation)
- Ability to work independently on complex, ambiguous analytical problems and deliver high-quality outputs
- Strong written and verbal communication skills, including the ability to present findings to technical and non-technical audiences
- U.S. Citizenship preferred

Preferred Qualifications

- Prior internship, research, or project experience developing production-grade data products, analytical pipelines, or reporting systems
- Experience applying analytics in a decision science, business intelligence, or strategy-oriented context
- Experience working with cloud-based data platforms (e.g., AWS, Azure, GCP, Snowflake)
- Familiarity with data engineering concepts, including ETL pipelines, data modeling, and data standardization
- Familiarity with data governance, metadata management, or data quality frameworks
- Demonstrated ability to create executive-level dashboards and presentations that drive decision-making
- Portfolio, GitHub repository, or equivalent body of work demonstrating applied data science, analytics, or visualization projects
- Evidence of intellectual curiosity, initiative, and creativity in applying data to solve complex, real-world problems

If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo, because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30, and moving faster than anyone else dares.

Original Posting: April 22, 2026

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range: $48,100.00 - $86,950.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.

AI/ML Python SQL Tableau Power BI Looker Jupyter AWS Azure GCP Snowflake Data Engineering ETL GitHub

View details: Data Science / Decision Science Intern

United States

$48.1K - $87.0K / year

Senior Data Scientist

Peerspace

Welcome to where extraordinary begins.

Data Scientist, 100 days ago

Contract, Remote, Team 51-200, Since 2013, H1B Sponsor

• Work closely with PED to set goals, understand how users are engaging with our product, and learn from experiments in our two-sided marketplace
• Develop measurement frameworks and institute new metrics or reporting systems to help us understand our marketplace and users
• Align cross-functional stakeholders on the priorities and analytical roadmap for critical areas of our business
• Design and analyze experiments
• Dive deep on user behavior and the dynamics of our marketplace
• Unpack business trends and diagnose drivers of business performance
• Collaborate with the rest of the Analytics and Data Engineering teams and the broader data community to grow and nurture data culture at Peerspace

Python SQL Tableau

View details: Senior Data Scientist

United States

Job Closed

Data Science Manager – Finance and Strategy

Stripe

Help increase the GDP of the internet.

Data Scientist, 100 days ago

Full Time, Remote, Team 1,001-5,000, Since 2010, H1B Sponsor

• Drive the roadmap and priorities for your team, and work with many Stripe leaders across the company to enhance our ability to be data-driven.
• Collaborate with stakeholders across the organization such as engineering, analytics, operations, finance, and marketing.
• Lead and manage processes to help the team do its best work and engage effectively with the rest of Stripe.
• Manage a high-performing team of data scientists, supporting them to achieve a high level of technical excellence and advance in their careers.
• Recruit and onboard great data scientists, in collaboration with Stripe's recruiting team.
• Contribute to broad data science initiatives as a member of Stripe's data science management team.

View details: Data Science Manager – Finance and Strategy

New York + 1 more

Freelance Data Scientist

Eureka Labs

Excelling Product Factory Partner for fast-growing marketplaces & SaaS companies. #ThinkBuildEnjoy #ChallengeYourself

Data Scientist, 100 days ago

Part Time, Remote, Team 51-200, Since 2017, No H1B Sponsor

• Design, train, evaluate, and deploy machine learning models for real-time and near-real-time applications.
• Partner with the Senior Cloud Engineer to deploy models within cloud pipelines.
• Collaborate on the design and optimization of data pipelines used for model training, inference, and monitoring.
• Establish metrics for model performance, data drift, and system health.
• Analyze the impact of new features and model changes in production environments.
• Work closely with cloud, perception, and UI engineering teams to integrate ML outputs into customer-facing products.

AWS Cloud PostgreSQL Python PyTorch Scikit-Learn

View details: Freelance Data Scientist

United States
