Job Closed
This listing is no longer active.
Scratch Financial is the world's simplest patient financing solution.
Senior Staff Data Engineer
Location
New York
Posted
173 days ago
Salary
$150K - $185K / year
Seniority
Senior
Job Description
Senior Staff Data Engineer
Scratch Financial
• Designs, builds, and oversees the deployment and operation of technology architecture, solutions, and software to capture, manage, store, and utilize structured and unstructured data
• Contributes to the overall Data Product roadmap by working closely with our business partners to understand their challenges and develop analytical tools that help drive business decisions
• Develops technical tools and programming that leverage artificial intelligence, machine learning, and big-data techniques to cleanse, organize, and transform data
• Leverages prototyping methodologies to propose and design creative business solutions that exploit our broad toolset of technologies
• Creates and establishes design standards and assurance processes for software, systems, and applications development
• Reviews internal and external business and product requirements for data operations and activity, and suggests changes and upgrades to systems and storage
• Designs, develops, and maintains CI/CD pipelines using GitHub Actions to automate deployment, testing, and monitoring of applications
• Implements and manages serverless solutions
• Implements infrastructure as code (IaC) practices
• Works with development teams to set up automated testing frameworks
• Understands the basics of relational data modeling
Job Requirements
- Strong Computer Science/Engineering/Information Systems background
- 10+ years of experience in data modeling, data architecture, data quality, metadata, ETL, and data warehouse methodologies and technologies.
- 5+ years of experience with AWS technologies.
- Strong experience using Python and Pandas in an AWS Lambda framework is highly desired.
- Experience using EMR and/or Databricks, or the ability to read EMR code and translate it into Lambdas.
- Proven experience (3+ years) in designing and managing CI/CD pipelines, specifically using GitHub Actions.
- Demonstrated experience with Python, APIs, Spark, and Scala.
- Experience with advanced SQL, Linux, MicroStrategy, Tableau, and Pandas.
- Bachelor’s degree in Engineering, Computer Science, Information Systems or related field with 10+ years of relevant experience.
Benefits
- medical, dental and vision insurance
- 401(k)
- paid leave
- tuition reimbursement
- a variety of other discounts and perks
More Data Engineer Jobs
• Reporting to our Director of BI & Analytics, you will join our data team as a key internal consultant and technical expert. Whether you are a seasoned architect or a rising talent in the analytics space, your primary mission is the same: to move beyond simple reporting and transform complex data into compelling data stories.
• This is a high-impact role where technical rigor meets design strategy. You will serve as the bridge between our data infrastructure and our stakeholders, owning the end-to-end data lifecycle. This includes architecting scalable data models and ELT pipelines, as well as partnering directly with business lines to build polished, intuitive Tableau dashboards.
• If you are a versatile professional who enjoys both writing complex Python/SQL transformation logic and crafting data stories that answer the "so what?" for leadership, we encourage you to apply.
• What you'll do:
• Visualization & Storytelling
• Tableau Design: Design and build highly interactive dashboards that guide users through a clear narrative flow. You don’t just display data; you highlight trends, anomalies, and actionable insights.
• UX/UI Strategy: Apply design best practices (layout, color theory, pre-attentive attributes) to ensure dashboards are intuitive, consistent, and reduce cognitive load.
• Performance Tuning: Proactively monitor and optimize both SQL queries and Tableau workbooks to ensure fast load times and a seamless user experience.
• Data Engineering & Architecture
• Data Modeling: Own the design and optimization of dimensional data models (Star Schemas) in BigQuery to create a clean, accessible, and performant "semantic layer" for analytics.
• Pipeline Development: Design, build, and maintain scalable ELT / ETL pipelines (using SQL, Python, and orchestration tools) to transform raw data into analytics-ready datasets.
• Data Quality & Governance: Establish and advocate for data integrity by implementing automated testing, validation frameworks, and consistent metric definitions.
• Strategy & Consultation
• Internal Consulting: Act as a trusted advisor to stakeholders. Translate vague business questions into strict technical requirements and analytical stories that get to the root of the problem.
• Mentorship: Foster a culture of data literacy by mentoring business users on dashboard interpretation and training junior analysts on SQL best practices.
• Documentation: Maintain clear technical documentation for data lineages, metric definitions, and pipeline logic.
AI Data Engineer
Veeva
Headquartered in Pleasanton, California, Veeva is a leading provider of cloud-based software and services for the life sciences industry. As an employer, Veeva
• Evaluation Strategy & Planning: Define and establish comprehensive evaluation strategies for new AI Agents. Prioritize the integrity and coverage of test data sets to reflect real-world usage and potential failure modes
• LLM Output Integrity Assessment: Programmatically and manually evaluate the quality of LLM-generated content against predefined metrics (e.g., factual accuracy, contextual relevance, coherence, and safety standards)
• Creating High-Fidelity Datasets: Design, curate, and generate diverse, high-quality test data sets, including challenging prompts and scenarios. Evaluate LLM outputs to proactively identify system biases, unsafe content, hallucinations, and critical edge cases
• Automation of Evaluation Pipelines: Develop, implement, and maintain scalable automated evaluations to ensure efficient, continuous validation of agent behavior and prevent regressions with new features and model updates
• Root Cause Analysis: Understand model behaviors and assist in the trace and root-cause analysis of identified defects or performance degradations
• Reporting & Performance Metrics: Clearly document, track, and communicate performance metrics, validation results, and bug status to the broader development and product teams
Mid-Level Data Engineer
NOUS LATAM
Connect with LATAM-based tech talent for your most challenging projects!
• Design and implement efficient data pipelines using Python and AWS services (Lambda, S3, Glue, etc.)
• Ensure data quality, reliability, and scalability across multiple sources and formats
• Collaborate closely with data analysts, software engineers, and product teams to deliver actionable insights
• Contribute to continuous improvement of our data infrastructure and best practices



