Data Science Specialist – Feature Store & ML Platform

Data Scientist | Full Time | Remote | Senior | Team 10,001+ | H1B Sponsor

Location

Brazil

Posted

11 days ago

Salary

0

Seniority

Senior

Job Description

Compass

  • Lead the development and evolution of Feature Store capabilities: data lineage, feature views, feature recommendation, and new query engines
  • Design and implement Apache Iceberg tables with a focus on read performance, versioning, and schema evolution
  • Architect and optimize the serving layer with Redis for real-time features with strict latency SLOs
  • Integrate and optimize Amazon EMR as a query and large-scale processing engine
  • Define and implement feature selection and transformation pipelines with end-to-end traceability
  • Establish standards for feature quality, versioning, and governance across the platform
  • Act as the technical reference for data and data science teams that consume the Feature Store

Job Requirements

  • Proven expertise in feature engineering on enterprise ML platforms (Feast, Tecton, Hopsworks, or equivalents)
  • Advanced proficiency in Apache Spark / PySpark for distributed processing at scale
  • Deep knowledge of Apache Iceberg and lakehouse architectures (comparative experience with Delta Lake and Hudi)
  • Expertise in Redis for low-latency feature serving, including cache invalidation strategies and efficient serialization
  • Solid production experience with AWS data services (S3, Glue, EMR, Redshift, Athena)
Preferred:
  • Production experience with data lineage and metadata catalogs (DataHub, OpenMetadata, Marquez)
  • Experience with Amazon EMR: cluster configuration, cluster optimization, and Spark job tuning
  • Expertise in MLOps practices focused on versioning and traceability of data artifacts
  • Prior experience in a financial context with high-cardinality, high-frequency data and regulatory requirements
  • Familiarity with data quality tools at scale (Great Expectations, Soda, dbt tests).

More Data Scientist Jobs

Object-Oriented Programming Instructor

Wilfrid Laurier University

Wilfrid Laurier University endeavors to fill positions with qualified candidates who have a combination of education, experience, skills, and abilities to successfully perform the duties of the position while demonstrating Laurier's Employee Success Factors. Diversity and creating a culture of inclusion is a key pillar of Wilfrid Laurier University's Strategic Academic Plan and is one of Laurier's core values. Laurier is committed to increasing the diversity of faculty and staff and welcomes applications from candidates from equity deserving groups. Indigenous candidates who would like to learn more about equity and inclusive programming at Laurier are welcomed to contact the Office of Indigenous Initiatives at indigenous@wlu.ca. Candidates from other equity deserving groups who would like to learn more about equity and inclusive programming at Laurier are welcomed to contact Equity and Accessibility at equity@wlu.ca.

Data Scientist | 12 days ago
Contract | Remote | Team 1,001-5,000

Role Description

Fundamentals of object-oriented programming, classes, subclasses, inheritance, references, overloading, event-driven and concurrent programming, using a modern application programming interface. The language Java will be used.

Qualifications

  • Master’s degree
  • Computer Science or related field
  • PhD would be an asset
  • Preference will be given to applicants with demonstrated expertise in the subject field
  • Recent scholarly activity related to the course content

Requirements

  • CV (required)
  • Candidate Application Form (CAF)
  • Names and Contact Information for Referees
  • Evidence of Good Teaching
  • Verification of highest degree
  • Optional: Cover Letter, Teaching Dossier, Sample Course Outline

Benefits

  • Salary: $10,212.40

Application Process

Please click the gold “Apply Now” button located on the top right-hand side of the page. You will be asked to sign in if you have already created an account. If you are not a registered user, you may create an account to apply to career opportunities. Once an account is created, you will be able to sign in to apply for the position.

Additional Information

  • This appointment is in accordance with the Contract Teaching Faculty and Part-time Librarians Collective Agreement.
  • All applicants are assessed using both the “Appendix H: Assessment of CTF Candidates under 13.6.1” in the collective agreement and the program-specific rubric.
  • Assessment of your application will be based primarily on the Candidate Application Form (CAF).
  • Applications must be received by 23:59 local time of the date on the posting.
  • All course offerings will be contingent on adequate student registration and subject to budgetary funding.

Canada
C$10.2K / year

Sr Data Scientist

Memorial Hermann Health System

More than a century of patient-centered care. At Memorial Hermann, we are all about advancing health.

Data Scientist | 12 days ago
Full Time | Remote | Team 10,001+ | H1B Sponsor

Role Description

Leads complex projects to mine and analyze complex and unstructured data sets using advanced statistical methods for use in data-driven decision making. Responsible for leading complex cross-functional teams and providing in-depth data insights for complex business problems that can be approached with advanced analytic techniques to collect, explore, and extract insights from structured and unstructured data. Leads engagement with the customer for complex data science efforts and projects. Typically reports to the Manager of Data Science.

Qualifications

  • Bachelor’s Degree in science, engineering, computer science, mathematics, statistics or related STEM field required.
  • Master’s Degree in Data Science preferred.
  • Four (4) years of experience in data science is required.
  • Demonstrated experience performing multiple projects (academic or professional) using a programming language like Python and libraries (NumPy, SciPy, etc.) is required.
  • Professional experience in a hospital setting, with medical informatics, healthcare information technology/finance/revenue cycle data management, or Electronic Health Record (EHR) data management is preferred.
  • Business analytical skills (process flows, procedures, spreadsheets, modeling, etc.), technical expertise, mathematical skills and good understanding of design and architecture principles are required.
  • Demonstrated knowledge of the Extract, Transform, and Load (ETL) process, SQL databases, and handling large data sets.
  • Demonstrates knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
  • Knowledgeable of advanced statistical techniques and concepts (regression, properties of distributions, statistical tests and proper usage, etc.) and experience with applications.
  • Ability to communicate, gather requirements and execute storytelling with data.
  • Possesses working knowledge of the data science project life cycle.
  • Advanced level programming skills in addition to a working knowledge and experience of statistical analysis tools.
  • Demonstrates advanced problem solving, analytical reasoning and decision-making skills.
  • Demonstrates ability to identify and seek needed information to perform problem/situation analysis.
  • Strong understanding and experience in researching and resolving data issues with a logical, instinctive, and problem-solving mentality working with large, complex and incomplete sources.
  • Exhibits strong project management skills, with an ability to work independently on multiple projects with competing priorities and a strong commitment to meeting goals and deadlines.
  • Understanding of database management tools.
  • Possesses analytical skills and ability to understand and interpret results based on advanced statistical techniques.
  • Strong written and verbal communication skills in IT and business environments; ability to communicate to technical and non-technical audiences.
  • Ability to work under minimal supervision in a fast-paced multidisciplinary environment.
  • Exhibits customer service skills in the form of first-rate work products and project management.
  • Demonstrates interest in learning new tools and processes.
  • Ability to manage challenging client situations.
  • Ability to troubleshoot and recommend solutions.
  • Ability to translate complex information for a wide range of stakeholders.

Requirements

  • Develops custom data models and algorithms to apply to data sets.
  • Develops and applies algorithms or models to key business metrics with the goal of improving operations or answering business questions. Provides findings and analysis for use in decision making.
  • Handles moderately complex issues and problems, and refers more complex issues to higher-level staff.
  • May provide some leadership, coaching, and/or mentoring to subordinate group.
  • Maintains existing models and evaluates their goodness of fit.
  • Provides in-depth data insights from structured and unstructured data for complex business problems through use of advanced analytics techniques, predictive modeling, data mining/visualization and pattern analysis tools.
  • Performs research, analysis, and modeling on organizational data.
  • Develops and tests hypotheses and communicates findings in a clear, precise and actionable manner to project and leadership teams.
  • Works closely with teams to identify, understand, and resolve data issues and improve efficiency, productivity and scalability of data processes.
  • Assists with the evaluation of data science vendors and tools.
  • Ensures safe care to patients, staff and visitors; adheres to all Memorial Hermann policies, procedures, and standards within budgetary specifications including time management, supply management, productivity and quality of service.
  • Promotes individual professional growth and development by meeting requirements for mandatory/continuing education and skills competency; supports department-based goals which contribute to the success of the organization; serves as preceptor, mentor and resource to less experienced staff.
  • Demonstrates commitment to caring for every member of our community by creating compassionate and personalized experiences. Models Memorial Hermann’s service standards by providing safe, caring, personalized and efficient experiences to patients and colleagues.
  • Other duties as assigned.

United States
Job Closed

Senior Manager, AI and Data Science

CareSource

This job description is not all inclusive. CareSource reserves the right to amend this job description at any time. CareSource is an Equal Opportunity Employer. We are dedicated to fostering an environment of belonging that welcomes and supports individuals of all backgrounds.

Data Scientist | 12 days ago
Full Time | Remote | Team 1,001-5,000 | Since 30+ years | H1B Sponsor

  • Provide leadership for a multidisciplinary data science team using AI and ML technologies
  • Develop predictive models, ML algorithms, and statistical techniques to enhance healthcare operations
  • Collaborate with architecture and data solutions teams to move solutions from prototype to production
  • Ensure compliance with HIPAA/PHI handling and internal governance standards

United States
$113K - $197.7K / year
Job Closed

Senior Data Scientist

General Dynamics

General Dynamics is a global aerospace and defense company offering products designed to provide safety and security to people around the world. In the past, General Dynamics has p

Data Scientist | 12 days ago

Role Description

Own your opportunity to turn data into measurable outcomes for our customers’ most complex challenges. As a Data Scientist Senior at GDIT, you’ll power innovation to drive mission impact and grow your expertise to power your career forward. The work you’ll do at GDIT will be impactful to the mission of the Centers for Medicare & Medicaid Services. You will play a crucial role in developing data-driven solutions to complex business challenges, using advanced tools and computational skills to interpret, connect, predict, and make discoveries in data.

  • Contribute to completion of specific programs and projects.
  • Utilize advanced tools (e.g., Python, PySpark, Databricks) and computational skills to interpret, connect, predict, and make discoveries in data.
  • Collaborate with internal and external team members to achieve the mission.
  • Evaluate the effectiveness and accuracy of new data sources and gathering techniques.
  • Use predictive modeling to increase and optimize customer experience, efficiencies, process improvements, and other business outcomes.
  • Optimize job performance and resource usage, identifying and addressing bottlenecks and inefficiencies in backend systems.
  • Write clean, well-structured, and maintainable code, adhering to established coding standards and best practices.
  • Perform thorough code reviews, providing constructive feedback to peers and identifying potential risks or areas for improvement.
  • Debug and resolve defects, proactively identifying and addressing potential issues before they impact users.
  • Create and maintain comprehensive technical documentation.
  • Actively participate in Agile ceremonies, such as stand-ups, sprint planning, and retrospectives, ensuring effective communication and collaboration across the team.
  • Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
  • May develop processes and machine learning based tools to monitor and analyze model performance and data accuracy.
  • May coach and review the work of less experienced professionals.
  • May serve as a team or task lead.

Qualifications

  • Education: Bachelor of Science
  • Experience: 5+ years of related experience
  • Python / Apache Spark, Databricks.
  • Strong understanding of software design patterns, data structures, and algorithms.
  • Experience with Agile development methodologies.
  • Related experience in analytic programming, data extraction, querying databases/data warehouses and data analysis.
  • SAS, R, AWS experience preferred.

Requirements

  • 5+ years of related experience
  • US Citizenship Required: No

Benefits

  • Comprehensive benefits and wellness packages
  • 401K with company match
  • Competitive pay and paid time off
  • Full flex work weeks where possible
  • Variety of paid time off plans, including vacation, sick and personal time, holidays, paid parental, military, bereavement and jury duty leave
  • 15 days of paid leave per calendar year to be used for vacations, personal business, and illness
  • 10 paid holidays per year
  • GDIT Paid Family Leave program provides a total of up to 160 hours of paid leave in a rolling 12 month period for eligible employees
  • Short and long-term disability benefits, life, accidental death and dismemberment, personal accident, critical illness and business travel and accident insurance

United States
$114.8K - $155.3K / year