Job Closed
This listing is no longer active.
84.51° is a retail data science, insights and media company. We help The Kroger Co., consumer packaged goods companies, agencies, publishers, and affiliates create more personalized and valuable experiences for shoppers across the path to purchase. Powered by cutting-edge science, we utilize first-party retail data from more than 62 million U.S. households sourced through the Kroger Plus loyalty card program to fuel a more customer-centric journey using 84.51° Insights, 84.51° Loyalty Marketing, and our retail media advertising solution, Kroger Precision Marketing. 84.51° follows a 5-day in-office work schedule to support collaboration, alignment, and team connection.
Senior ML Data Engineer
Location
United States
Posted
129 days ago
Salary
$97K - $166.8K / year
Seniority
Senior
Job Description
Senior ML Data Engineer
8451
Role Description
The Relevancy Sciences Team is responsible for creating relevant and personalized customer experiences for Kroger's E-commerce platform, which ranks among the top 10 ecommerce companies in the US. We generate trillions of recommendations at scale and deliver them to millions of Kroger customers daily. Our team maintains a comprehensive portfolio of machine learning solutions for search & product recommendations.
We are seeking a talented and experienced Senior ML Data Engineer to join our data science team, with specialized expertise in building search and recommender systems. You will architect, build, and operate the critical data infrastructure that powers our machine learning models, spanning from feature engineering to training data generation. This role serves as the bridge between ML requirements and production data systems, with ownership of feature stores, training/evaluation pipelines, and ML-specific data operations. You will enable data scientists to iterate rapidly while ensuring production-grade reliability and scalability.
What You'll Do
Feature Store Operations & Governance (40%)
- Own the feature request lifecycle from intake through deployment, driving reusability and maintaining a searchable feature catalog.
- Design and build scalable feature pipelines that compute features from diverse sources (BigQuery, Azure Data Lake) and write to Feature Store infrastructure (Vertex AI Feature Store + BigQuery).
- Build streaming feature engineering pipelines using Apache Beam/Dataflow for real-time feature computation and low-latency model serving with sub-second data freshness.
- Ensure point-in-time correctness and online/offline feature consistency to prevent data leakage.
- Implement drift detection, data quality monitoring, and alerting mechanisms.
- Develop self-service tools and templates that enable teams to independently create features.
Training & Evaluation Data Pipelines (30%)
- Build automated pipelines that generate ML-ready training datasets by combining features with labeled target variables.
- Implement point-in-time correctness logic and sophisticated sampling strategies to ensure balanced, representative datasets.
- Maintain comprehensive dataset versioning for full traceability across model versions.
- Generate detailed evaluation reports with performance metrics segmented by business dimensions.
- Support operations across both Azure and Vertex AI environments during platform migration.
ML Data Operations & Reliability (20%)
- Serve as Tier 2/3 on-call responder for feature data quality incidents, diagnosing and resolving pipeline failures and performance issues.
- Maintain comprehensive lineage tracking and metadata management for full data traceability.
- Support regulatory compliance through proper data governance and documentation.
Standards, Education & Collaboration (10%)
- Establish and enforce feature naming conventions, data quality thresholds, and point-in-time correctness patterns.
- Conduct workshops on feature engineering best practices and provide expert guidance on feature design.
- Partner with Data Scientists, ML Engineers, Data Engineering, and MLOps teams to optimize infrastructure and align with technical strategy.
Qualifications
- 3+ years of hands-on experience building and maintaining ML data pipelines in production environments with demonstrated expertise in scaling and reliability.
- Expert-level SQL skills and advanced Python programming capabilities with experience in data processing frameworks and ML libraries.
- Proven experience with cloud data platforms, with strong preference for GCP ecosystem including BigQuery, Dataflow, Vertex AI Feature Store, and associated ML services.
- Deep understanding of end-to-end ML workflows including training data preparation, model evaluation methodologies, and serving infrastructure requirements.
- Production operations mindset with experience in monitoring, alerting, on-call responsibilities, and meeting SLA commitments.
Requirements
- Hands-on experience with Feature Store platforms such as Vertex AI Feature Store, Feast, Tecton, or similar enterprise solutions.
- Deep knowledge of point-in-time correctness principles, temporal joins, and time-series data modeling best practices.
- Multi-cloud experience with both Azure and GCP platforms, including data migration and hybrid cloud architectures.
- Strong familiarity with core ML concepts including feature engineering, label creation, train/test/validation splits, and data leakage prevention.
- Background spanning both analytics engineering and ML-specific data engineering with understanding of the unique requirements of each domain.
Success Indicators
- Improved Data Science Productivity: Data Scientists spend significantly less time on data preparation and infrastructure concerns, enabling more focus on model development and experimentation.
- Increased Feature Reuse: Measurable increase in feature reuse across multiple models and teams, reducing redundant development effort and improving consistency.
- Reliable Automation: Training and evaluation data generation processes operate reliably with minimal manual intervention and high uptime.
- Efficient Incident Response: Data quality incidents are triaged quickly with clear escalation paths and rapid resolution times.
- Accelerated ML Iteration: Overall ML model development and iteration velocity improves measurably across all teams using the platform.
Benefits
- Health: Medical with competitive plan designs and support for self-care, wellness, and mental health. Dental and vision benefits available.
- Wealth: 401(k) with Roth option and matching contribution. Health Savings Account with matching contribution (requires participation in qualifying medical plan). AD&D and supplemental insurance options.
- Happiness: Paid time off with flexibility to meet your life needs, including 5 weeks of vacation time, 7 health and wellness days, 3 floating holidays, and 6 company-paid holidays per year. Paid leave for maternity, paternity, and family care instances.
Pay Transparency and Benefits
The stated salary range represents the entire span applicable across all geographic markets from lowest to highest. Actual salary offers will be determined by multiple factors including but not limited to geographic location, relevant experience, knowledge, skills, other job-related qualifications, and alignment with market data and cost of labor. In addition to salary, this position is also eligible for variable compensation.
Pay Range: $97,000 - $166,750 USD
More Data Engineer Jobs
Senior Data Engineer – Data Governance Lead
Xsolla
Xsolla's video game business engine helps game developers and publishers operate more efficiently and sell more games.
• This role will assign the data engineering efforts for the **User Platform (CDP)** and **Recommendation Engine**, ensuring data accuracy, performance, data alignment and security across pipelines connecting **Snowflake, Postgres, Kafka, and API Gateway** services.
• You’ll collaborate with ML engineers, backend teams, and business stakeholders to build reliable, high-performance data systems that support insights, automation, and machine learning use cases.
• This role will lead data governance best practices and communication across the organization.
• Designing, developing, and implementing efficient and scalable data pipelines
• Building and maintaining modern data infrastructure
• Optimizing the performance of data platforms
• Implementing security and compliance best practices
• Working closely with data science and analytics teams
• Researching and evaluating new data engineering tools and technologies
• Creating and maintaining technical documentation
Data Architect – Azure Synapse Analytics, Microsoft Fabric
Multiplica Talent
We connect extraordinary talent with forward-thinking companies.
• Design modern analytics architectures
• Optimize data performance and quality
• Implement solutions using Azure Synapse Analytics and Microsoft Fabric
Senior Principal Data Engineer
Autodesk
Autodesk is an award-winning Fortune 1000 company based in San Rafael, California. Over the years, the company has made significant contributions toward revolutionizing the movemen
• Work with multiple scrum teams (each has 7-9 engineers), and act as a force multiplier by coaching, mentoring, and developing high-performing data engineering teams and individuals.
• Establish and uphold high standards for code quality, readability, and maintainability across multiple engineering teams.
• Quickly and confidently navigate large, unfamiliar codebases, making sound technical decisions in ambiguous or evolving environments.
• Own and drive the data engineering approach to data quality, including framework design, enforcement, and ongoing improvement.
• Lead engineering teams through complex production incidents and outages, driving effective triage, root cause analysis, and durable remediation.
• Guide teams toward mature, high-performing DataOps practices that improve reliability, observability, and delivery velocity.
• Apply deep expertise in SQL best practices, with an emphasis on performance optimization, readability, and long-term maintainability.
• Demonstrate strong understanding of conceptual, logical, and physical data modeling, and apply these principles effectively at enterprise scale.
• Solve complex, enterprise-scale data engineering challenges across GTM systems, balancing technical rigor with business impact.
• Define, standardize, and enforce testing frameworks and quality gates for data engineering workloads.
• Serve as a technical decision-maker for best practices, resolving tradeoffs and driving alignment across teams when standards or approaches are unclear.
• Business domain experience in subscription and consumption business models.
• Work closely with different stakeholders (business owners, users, product managers, program managers, architects, engineering managers, developers, etc.) to translate business needs and product requirements into well-documented engineering solutions.
• Communicate updates regularly to stakeholders and other partners at each phase: requirements clarification, solution/planning review, status/progress sharing, etc.



