Job Closed
This listing is no longer active.
Scratch Financial is the world's simplest patient financing solution.
Senior Data Engineer, Data Platform Operations
Location
New York
Posted
149 days ago
Salary
$140K - $180K / year
Seniority
Senior
Job Description
Scratch Financial
• Define partner onboarding and clean room architecture patterns across Snowflake, LiveRamp, and Databricks that are secure, scalable, and repeatable.
• Configure and manage partner-specific clean room environments; deploy and manage Python-based libraries within the platform ecosystem.
• Establish and maintain MLOps practices, including model serving, monitoring, and pipeline orchestration for AI/ML features deployed within the platform ecosystem.
• Own design and enforcement of granular RBAC policies and least-privilege service accounts.
• Serve as the technical lead for onboarding new partners, implementing privacy-preserving controls (e.g., aggregation thresholds and anonymization techniques).
• Design, build, and operate scalable ELT pipelines using Snowpark and/or PySpark and advanced SQL to provision Gold datasets.
• Implement and evolve identity resolution logic mapping internal data to 3P identifiers (including LUIDs, RampIDs, TransUnion IDs), ensuring privacy-safe practices.
• Design and operate scalable data architectures across Snowflake and Databricks supporting batch and near real-time processing patterns.
• Build robust automated checks (e.g., Great Expectations or custom SQL assertions) and define quality standards to detect schema drift, null rate spikes, and volume anomalies.
• Lead performance optimization across platforms (query tuning, caching, incremental processing) and define and implement query tagging and chargeback models for accurate cost attribution.
• Establish monitoring, alerting, runbooks, and standard operating procedures to improve platform reliability and reduce incident time-to-resolution.
• Validate that output data adheres to privacy and business requirements, and define test strategies for partner-facing releases.
• Serve as the escalation point for diagnosing connection failures, data discrepancies, or latency issues with partner technical teams.
• Design and build internal AI agents (using frameworks like LangChain, Snowflake Cortex) and mentor other engineers through code reviews, design discussions, and operational best practices.
Job Requirements
- Bachelor’s degree or higher in Computer Science, Information Systems, Software, Electrical or Electronics Engineering
- 5+ years of Data Engineering experience, with deep proficiency in advanced SQL and Python
- 3+ years of hands-on experience with cloud data platforms, specifically Snowflake or Databricks
- Proven experience building and operating scalable ELT pipelines using orchestration tools (e.g., Airflow, dbt)
- Strong track record designing production-grade systems (observability, reliability, performance tuning, incident response)
- Clean Room Knowledge: Exposure to Data Clean Room concepts and Clean Room platforms like LiveRamp, Snowflake or Databricks
- AI/LLM Experience: Experience building applications with LLMs, RAG, Vector Databases, or frameworks like LangChain/LlamaIndex
- Ability to mentor other engineers through code reviews, design discussions, and operational best practices
- SnowPro Core Certification OR Databricks Certified Data Engineer Associate (preferred)
- SnowPro Advanced: Data Engineer OR Databricks Certified Data Engineer Professional (highly preferred)
Benefits
- medical, dental and vision insurance
- 401(k)
- paid leave
- tuition reimbursement
- a variety of other discounts and perks
More Data Engineer Jobs
Senior Data Engineer
Roo Veterinary
Roo Veterinary is a service platform that gives veterinarians, hospitals, and vet techs complete control over where and how they work. The company aims to solve
• Design, develop, and maintain reliable end-to-end data pipelines (both batch and streaming) that connect internal and external systems in ways that best support marketplace growth, customer experience, and operational efficiency.
• Contribute to the performance, scalability, and reliability of our entire data ecosystem.
• Work with analysts and other data stakeholders to engineer data structures and orchestrate workflows that encode core business logic.
• Implement observability, testing, monitoring, validation, and documentation to ensure accuracy, stability, and consistency throughout the data stack.
• Join cross-functional squads and tiger teams to rapidly translate evolving data needs into scalable and extensible data models, metrics, and analytical frameworks.
• Mentor data stakeholders throughout the organization, share best practices, and meaningfully contribute to architectural and tooling decisions as the data stack evolves.
Senior Manager, Data Engineering
Carrum Health
Carrum Health is a healthcare company that partners with employers to provide employees access to high-quality medical care through a network of top providers. Carrum Health aims t
• Team Development: Lead, mentor, and manage a team of Data Engineers, fostering a culture of ownership, continuous improvement, and technical excellence.
• Cross-Functional Partnering: Serve as the primary liaison between the Data Engineering team and other departments (Client Success, Partnerships, Product, Engineering, Clinical, and Business Intelligence) to translate business needs into technical requirements and roadmaps.
• Project Management: Work with leadership, stakeholders, and product managers to coordinate roadmap commitments, keep projects on track, and communicate and roll with changes as they inevitably occur.
• Process Improvement: Drive the adoption of DevOps and DataOps methodologies, automating workflows, improving deployment processes, and reducing manual operational burden.
• Operational Excellence: Oversee daily data operations of our Data Platform, including monitoring, incident response, and performance tuning of data pipelines and databases to ensure high uptime and meet defined SLAs.
• Data Quality & Governance: Implement proactive data quality checks and monitoring frameworks. Partner with stakeholders to establish and enforce data governance policies.
• HIPAA & Compliance: Ensure all data operations and infrastructure adhere strictly to healthcare regulatory requirements, including HIPAA and other relevant data privacy standards.
• Disaster Recovery: Develop and maintain robust backup, recovery, and business continuity plans for critical data assets.
• Architect and Build: Lead the design and implementation of scalable, reliable, and performant ETL/ELT data pipelines and data warehouse solutions that meet the demands of a growing organization.
• Multi-Cloud Infrastructure: Own and optimize the data infrastructure across AWS and Azure cloud environments, ensuring interoperability, cost efficiency, and robust security.
• Technology Expertise: Drive the selection and adoption of best-in-class data technologies, including modern data warehouses (e.g., Snowflake, Azure Synapse), orchestration tools (e.g., AWS Glue, Azure Data Factory), and real-time streaming solutions.
• Code Quality: Set and enforce standards for code quality, testing, version control (Git), and documentation for all data engineering projects.
Data Engineer
NPS Prism
NPS Benchmarking for a Better Business & Happier Customers - from Bain, the Inventors of NPS
• Design, build, and optimize ETL/ELT workflows using tools like Databricks, SQL, Python/PySpark, and Alteryx (good to have)
• Develop and maintain robust, scalable, and efficient data pipelines for processing large datasets
• Work on cloud platforms (Azure, AWS) to build and manage data lakes, data warehouses, and scalable data architectures
• Utilize cloud services like Azure Data Factory, AWS Glue, or similar for data processing and orchestration
• Use Databricks for big data processing, analytics, and real-time data processing
• Create and manage SQL-based data solutions, ensuring high availability, scalability, and performance
• Collaborate with cross-functional teams to deliver impactful data solutions
• Leverage CI/CD pipelines to streamline development, testing, and deployment of data engineering workflows
• Maintain clear documentation for data workflows, pipelines, and processes
• Design and implement enterprise-scale data pipelines using Databricks on AWS, leveraging both cluster-based and serverless compute paradigms
• Architect and maintain medallion architecture (Bronze/Silver/Gold) data lakes and lakehouses
• Develop and optimize Delta Lake tables for ACID transactions and efficient data management
• Build and maintain real-time and batch data processing workflows
• Create reusable, modular data transformation logic using dbt to ensure data quality and consistency across the organization
• Develop complex Python applications for data ingestion, transformation, and orchestration
• Write optimized SQL queries and implement performance tuning strategies for large-scale datasets
• Implement comprehensive data quality checks, testing frameworks, and monitoring solutions
• Design and implement CI/CD pipelines for automated testing, deployment, and rollback of data artifacts
• Configure and optimize Databricks clusters, job scheduling, and workspace management
• Implement version control best practices using Git and collaborative development workflows
• Partner with data analysts, data scientists, and business stakeholders to understand requirements and deliver solutions
• Mentor junior engineers and promote best practices in data engineering
• Document technical designs, data lineage, and operational procedures
• Participate in code reviews and contribute to team knowledge sharing


