Socure

The leading provider of digital identity verification and fraud solutions. Salesinfo@socure.com

Data Scientist II – Big Data R&D, Identity Graph, KYC

Data Scientist | Full Time | Remote | Mid Level | Team 501-1,000 | Since 2012 | H1B Sponsor

Location

California

Posted

44 days ago

Salary

$140K - $170K / year

Seniority

Mid Level

Job Description

  • Contribute to the design and implementation of machine learning, data mining, statistical, and graph-based algorithms to analyze very large datasets for identity verification and anomaly detection.
  • Analyze large datasets to help develop and refine entity-resolution and identity-matching algorithms that drive Socure’s KYC and compliance solutions.
  • Build and maintain components of data-processing pipelines (ETL, feature generation, normalization) using tools such as Spark/PySpark and AWS (e.g., EMR, S3).
  • Support senior data scientists with feature engineering, data exploration, error analysis, and A/B test setup for new models and signals.
  • Help evaluate new third‑party and internal data sources: profile data quality, design offline experiments, and summarize impact on coverage and model performance.
  • Implement and maintain SQL and Python/R code for data extraction, transformation, and validation; contribute to code reviews and basic testing.
  • Provide analytical support to compliance and regulatory product teams, including ad hoc investigations, simple dashboards, and data deep dives.
  • Communicate findings in a clear, structured way to peers and cross‑functional partners (Product, Engineering, Client Analysis), focusing on key insights and trade‑offs.
  • Work effectively in a fast‑paced, cross‑functional environment; demonstrate ownership of well-scoped tasks and follow through to completion.

Job Requirements

  • Master’s degree with 2+ years of experience, or Ph.D. with 1+ years of experience in a data science or analytics role, or equivalent practical experience.
  • Proficiency in at least one general-purpose programming language used in data science (Python or Scala).
  • Solid experience writing and optimizing SQL for large datasets; comfort working in data lake / warehouse environments.
  • Hands‑on experience with Spark or PySpark and common ML libraries (e.g., scikit‑learn, XGBoost); TensorFlow/PyTorch experience is a plus.
  • Familiarity with UNIX environments and the AWS ecosystem (e.g., EMR, S3); Databricks experience is a plus.
  • Working knowledge of supervised/unsupervised ML and basic statistics (similarity measures, clustering, evaluation metrics).
  • Exposure to graph techniques or graph databases (Neo4j, AWS Neptune, GraphFrames) is a strong plus.
  • Bonus: experience with Elasticsearch or DynamoDB; workflow tools such as Airflow for automating data pipelines.
  • Ability to break down loosely defined problems, ask good clarifying questions, and iterate quickly with feedback.

Benefits

  • Offers Equity
  • Offers Bonus

More Data Scientist Jobs

Data Science / Decision Science Intern

Leidos

Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.

Data Scientist | 44 days ago
Full Time | Remote | Team 10,001+ | Since 1969 | H1B Sponsor

General Program Information / Position Overview

The Health Sector at Leidos is seeking a Data Science / Decision Science Intern (Health Data Intelligence) to support advanced analytics initiatives in a remote, U.S.-based environment. This role targets high-performing graduate-level candidates with demonstrated experience applying data science to real-world, decision-driven problems. You will contribute to the development of scalable data products, reporting ecosystems, and decision-support frameworks built on standardized data sources. The role emphasizes applying advanced analytics, visualization, and emerging AI/ML techniques to generate a decision advantage across clinical, operational, and strategic domains. This internship is structured as a full-time summer experience (12–16 weeks) with the potential to extend into a part-time role during the academic year.

Primary Responsibilities

  • Perform data extraction, transformation, and preparation using Python, SQL, and standardized data sources
  • Conduct exploratory data analysis (EDA) to identify trends, patterns, and data quality issues
  • Apply statistical analysis and machine learning techniques (e.g., regression, classification, clustering) to generate insights and support decision-making
  • Develop and operationalize scalable reporting frameworks and reusable data products aligned to standardized data models
  • Design and deliver advanced dashboards and visualization solutions using Tableau, Power BI, Looker, or similar platforms
  • Translate analytical outputs into decision-ready insights, including structured recommendations and trade-off analysis
  • Collaborate with stakeholders to define requirements and deliver production-oriented analytical solutions
  • Identify opportunities to improve data quality, standardization, and analytical efficiency
  • Communicate findings through executive-ready visualizations, storytelling, and concise written deliverables

Basic Qualifications

  • Bachelor’s degree in a quantitative discipline (e.g., Data Science, Statistics, Mathematics, Computer Science, Engineering, or related field)
  • Currently pursuing a graduate degree (Master’s or Ph.D. candidate) in a relevant quantitative or analytical field (required)
  • Demonstrated, hands-on experience with Python and SQL for data analysis and manipulation
  • Experience working with healthcare data (clinical, claims, public health, or operational datasets)
  • Experience working in Jupyter notebooks or equivalent analytical environments
  • Proven experience developing data visualizations and dashboards using Tableau, Power BI, Looker, or similar tools
  • Demonstrated end-to-end project experience, including data ingestion, analysis, modeling, and presentation of results
  • Strong foundation in statistics, data analysis, and structured problem-solving
  • Experience applying machine learning techniques (e.g., regression, classification, clustering, or similar methods) in academic, research, or project settings
  • Familiarity with AI-assisted analytical workflows (e.g., use of generative AI or automation to enhance coding, analysis, or insight generation)
  • Ability to work independently on complex, ambiguous analytical problems and deliver high-quality outputs
  • Strong written and verbal communication skills, including the ability to present findings to technical and non-technical audiences
  • U.S. Citizenship preferred

Preferred Qualifications

  • Prior internship, research, or project experience developing production-grade data products, analytical pipelines, or reporting systems
  • Experience applying analytics in a decision science, business intelligence, or strategy-oriented context
  • Experience working with cloud-based data platforms (e.g., AWS, Azure, GCP, Snowflake)
  • Familiarity with data engineering concepts, including ETL pipelines, data modeling, and data standardization
  • Familiarity with data governance, metadata management, or data quality frameworks
  • Demonstrated ability to create executive-level dashboards and presentations that drive decision-making
  • Portfolio, GitHub repository, or equivalent body of work demonstrating applied data science, analytics, or visualization projects
  • Evidence of intellectual curiosity, initiative, and creativity in applying data to solve complex, real-world problems

If you're looking for comfort, keep scrolling. At Leidos, we outthink, outbuild, and outpace the status quo, because the mission demands it. We're not hiring followers. We're recruiting the ones who disrupt, provoke, and refuse to fail. Step 10 is ancient history. We're already at step 30, and moving faster than anyone else dares.

Original Posting: April 22, 2026

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range: $48,100.00 - $86,950.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.

United States
$48.1K - $87.0K / year

Senior Data Scientist

Peerspace

Welcome to where extraordinary begins.

Data Scientist | 44 days ago
Contract | Remote | Team 51-200 | Since 2013 | H1B Sponsor

  • Work closely with PED to set goals, understand how users are engaging with our product, and learn from experiments in our two-sided marketplace
  • Develop measurement frameworks and institute new metrics or reporting systems to help us understand our marketplace and users
  • Align cross-functional stakeholders on the priorities and analytical roadmap for critical areas of our business
  • Design and analyze experiments
  • Dive deep on user behavior and the dynamics of our marketplace
  • Unpack business trends and diagnose drivers of business performance
  • Collaborate with the rest of the Analytics and Data Engineering teams and the broader data community to grow and nurture data culture at Peerspace

United States

Data Science Manager – Finance and Strategy

Stripe

Help increase the GDP of the internet.

Data Scientist | 44 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2010 | H1B Sponsor

  • Drive the roadmap and priorities for your team, and work with many Stripe leaders across the company to enhance our ability to be data-driven.
  • Collaborate with stakeholders across the organization such as engineering, analytics, operations, finance, and marketing.
  • Lead and manage processes to help the team do its best work and engage effectively with the rest of Stripe.
  • Manage a high-performing team of data scientists, supporting them to achieve a high level of technical excellence and advance in their careers.
  • Recruit and onboard great data scientists, in collaboration with Stripe's recruiting team.
  • Contribute to broad data science initiatives as a member of Stripe's data science management team.

All locations: New York | Washington

Freelance Data Scientist

Eureka Labs

Excelling Product Factory Partner for fast-growing marketplaces & SaaS companies. #ThinkBuildEnjoy #ChallengeYourself

Data Scientist | 44 days ago
Part Time | Remote | Team 51-200 | Since 2017 | No H1B Sponsor

  • Design, train, evaluate, and deploy machine learning models for real-time and near-real-time applications.
  • Partner with the Senior Cloud Engineer to deploy models within cloud pipelines.
  • Collaborate on the design and optimization of data pipelines used for model training, inference, and monitoring.
  • Establish metrics for model performance, data drift, and system health.
  • Analyze the impact of new features and model changes in production environments.
  • Work closely with cloud, perception, and UI engineering teams to integrate ML outputs into customer-facing products.

United States