Ceresti Health

Everyone else treats the patient. We activate the caregiver—because that’s where dementia care really begins.

Senior Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 11-50 | Since 2013 | H1B No Sponsor

Location

United States

Posted

9 days ago

Seniority

Senior

Bachelor Degree | 8 yrs exp | English | Airflow, AWS, Cloud, PostgreSQL, Python, SQL, Vault

Job Description

• Design and own Ceresti’s end-to-end data architecture: a landing zone with secure cloud object storage for raw partner files and API payloads, validated ingestion pipelines into our transactional Postgres, and a curated analytics layer that decouples reporting and AI workloads from production
• Build ingestion pipelines for the data we receive today, including partner data files (CSV/JSON/XML/HL7/X12 as applicable) and REST/SFTP API integrations, with schema validation, quarantine of bad records, and full lineage from raw bytes to curated row
• Stand up and operate the curated layer (data warehouse / lakehouse-lite) so analytics and ML models can consume data without slowing down the transactional system
• Choose, integrate, and operate the smallest set of tools needed, including object storage, an orchestrator (Dagster, Prefect, Airflow, etc.), dbt or similar for transformations, and a single validation library (Great Expectations / Pandera / Soda)
• Design and enforce data governance for a HIPAA-regulated environment: PHI/PII classification, encryption in transit and at rest, role-based access, audit logging, retention and minimum-necessary policies, and de-identification where appropriate
• Partner with backend, ML, product, and clinical stakeholders to define data contracts with our health plan and ACO partners and hold the line on data quality
• Build and maintain reliable feature data for ML models, including embeddings (e.g., pgvector) and curated feature tables for risk stratification, engagement, and outcomes work
• Instrument the data platform for observability, including pipeline SLAs, data freshness, schema drift, and quality metrics, and act on what the data tells you
• Participate fully in our Agile process: backlog grooming, sprint planning, demos, and retrospectives
• Mentor engineers across the team on SQL, schema design, and the craft of building data systems that are boring in the best possible way

Job Requirements

BS/BA degree or higher in Computer Science, Engineering, or a related technical field
8+ years of professional data engineering experience, with a track record of shipping production data systems end-to-end
Mastery of PostgreSQL: schema design, indexing, query tuning, partitioning, logical replication, JSONB, extensions (pg_partman, pg_cron, pgvector, etc.), and operating Postgres at scale
Strong experience designing and operating data pipelines, including file-based ingestion (SFTP / object storage drops) and API-based ingestion (REST, webhooks)
Hands-on experience with one or more cloud platforms (AWS preferred) and their data primitives: object storage (S3), managed Postgres
Experience designing data warehouses and/or data lakes and the judgment to know which one a given problem actually needs
Strong experience with dbt (or equivalent SQL-based transformation framework) and modern data modeling patterns (Kimball dimensional, Data Vault, One Big Table — and an opinion about when each is right)
Experience with at least one orchestration framework (Dagster, Prefect, or Airflow) and a clear point of view on which to use when
Strong Python skills for ingestion, validation, and tooling
Experience with data validation and data-quality frameworks (Great Expectations, Pandera, Soda, or equivalent)
Experience with change-data-capture from Postgres (logical replication, or equivalent)
Data governance experience in a HIPAA-regulated environment or, at minimum, demonstrated instincts for protecting PHI and PII (encryption, least privilege, audit, de-identification, BAA-aware vendor selection); HITRUST or SOC 2 experience is a strong plus
Comfortable with infrastructure-as-code and CI/CD for data systems
Experience supporting ML workloads: building feature tables, managing training data, serving features at inference time; familiarity with embeddings, vector search (pgvector or equivalent), and LLM integration patterns (RAG, prompt-grounded analytics) is a plus
Excellent written and verbal communication skills: you can explain a tricky schema decision to a business stakeholder and a data contract to a partner with equal clarity
Demonstrated experience working in Agile/Scrum teams

Benefits

Competitive salary and benefits package
Opportunities for professional growth and development
Collaborative and dynamic work environment
Flexible work arrangements and remote work options
Access to cutting-edge technologies and tools

More Data Engineer Jobs

Data Engineering Team Lead

Ocean Technologies Group

Powering teams that deliver for people & planet, with maritime learning, crew and fleet management and GRC solutions

Data Engineer | 9 days ago

Full Time | Remote | Team 201-500 | Since 2020 | H1B No Sponsor

• Lead a team of data engineers, ensuring alignment on goals, quality and delivery timelines.
• Mentor and coach team members to support their technical and professional growth.
• Drive engineering excellence by promoting best practices in coding, architecture, testing and observability.
• Plan and manage team capacity, sprints and milestones to ensure predictable delivery.
• Own the design, evolution and operation of ingestion and transformation pipelines on Apache Airflow and the analytical serving layer on Apache Druid.
• Make architectural calls on concurrency, partitioning, memory sizing and cost, including JVM heap and direct-memory tuning on the Druid cluster.
• Collaborate closely with DevOps on the Kubernetes / EKS platform that hosts our Druid and Airflow workloads.
• Ensure robust data validation, reconciliation and verification so that reporting is trustworthy.
• Collaborate with other Team Leaders, Development Managers, Architects and Product Owners to align engineering execution with business objectives.
• Contribute to the evolution of development processes, CI/CD pipelines and DevOps practices.
• Foster a culture of continuous improvement, innovation and knowledge sharing.

Airflow Apache AWS EC2 Grafana Kafka Kubernetes Oracle PostgreSQL Prometheus PySpark Python Spark SQL Terraform Yarn

Philippines

Data Engineer

Nuvitek

Speed Up True Modernization

Data Engineer | 9 days ago

Full Time | Remote | Team 51-200 | Since 2012 | H1B No Sponsor

• Design, develop, and maintain scalable RAG/CAG pipelines for AI-powered applications
• Build and optimize document ingestion workflows for structured and unstructured data sources
• Manage and maintain vector stores to support semantic search and retrieval capabilities
• Develop OCR processing pipelines for historical and modern document collections spanning 1781–2025
• Optimize retrieval performance, relevance tuning, and ranking strategies for LLM-based systems
• Build reliable data pipelines that support integrations with large language models and AI services
• Collaborate with engineers, UX teams, product owners, and stakeholders to deliver scalable AI solutions
• Ensure data quality, integrity, security, and performance across ingestion and retrieval systems
• Implement monitoring, logging, and troubleshooting for AI and data processing workflows
• Contribute to architecture decisions, technical documentation, and engineering best practices
• Participate in agile pod-based development teams and continuous improvement initiatives

Cloud ETL Python

United States

$115K - $125K / year

Job Closed

Enterprise Data Warehouse ETL/Data Engineer

CSpring

Unlocking the power and potential of data.

Data Engineer | 9 days ago

Full Time | Remote | Team 51-200 | H1B Sponsor

• Design and develop reusable, parameter-driven ingestion and transformation pipelines
• Build and maintain medallion architecture solutions
• Develop performant ELT workflows
• Create and optimize PySpark notebooks and distributed processing jobs
• Design dimensional data models
• Implement data vault patterns
• Optimize distributed SQL workloads
• Implement CI/CD processes
• Build monitoring, logging, and auditing solutions
• Lead or contribute to cloud modernization initiatives

Azure Cloud PySpark Python Spark SQL Vault

Illinois

Data Engineer

UnitedHealth Group

UnitedHealth Group is a healthcare and well-being company that’s dedicated to improving the health outcomes of millions around the world. We are comprised of

Data Engineer | 9 days ago

Full Time | Remote

Role Description

We are seeking a highly skilled Senior Data Engineer to design, build, and optimize scalable data and AI platforms on Azure. This role will focus on enabling enterprise data pipelines, real-time processing, and AI/ML model integration using Databricks and modern cloud technologies. You will enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

- Design and develop scalable data pipelines using Databricks, Apache Spark, and Python on Azure
- Build cloud-native solutions leveraging Azure Data Lake, Azure Data Factory, and Delta Lake
- Collaborate with Data Science and AI teams to operationalize ML models and embed them into production workflows
- Develop and maintain feature stores, model input pipelines, and real-time/streaming frameworks
- Ensure data quality, governance, and security across the full data lifecycle
- Build reusable frameworks, accelerators, and automation scripts to improve engineering efficiency
- Optimize performance, scalability, and reliability of data workflows and batch/streaming pipelines
- Participate in Agile development processes, including sprint planning, code reviews, and CI/CD pipelines
- Provide production support and on-call coverage, ensuring system stability and rapid issue resolution
- Design, develop, and deploy AI-powered solutions to address complex business challenges with emphasis on responsible use of AI

Qualifications

- Bachelor’s degree in Computer Science, Engineering, or IT related field
- 6+ years of experience in Data Engineering with Python/PySpark
- 6+ years of experience in building ETL/ELT pipelines using Databricks
- 6+ years of experience working in Agile environments
- 5+ years of strong experience in SQL / PL-SQL
- 4+ years of experience with Azure Databricks and Delta Lake architecture
- 4+ years of hands-on experience with CI/CD (GitHub Actions, Azure DevOps)
- 3+ years of hands-on experience with Azure cloud services (ADF, ADLS, Databricks)
- 2+ years of experience with Databricks Delta Live Tables (DLT)
- 2+ years of experience with unit testing, validation, and pipeline testing frameworks

Requirements

- Familiarity with medallion architecture and SCD2 implementations
- AI builder: Design, develop, and deploy AI-powered solutions to address complex business challenges with emphasis on responsible use of AI
- Experience building enterprise-scale data platforms
- Strong skills in performance tuning and debugging large-scale pipelines
- Experience with real-time/streaming frameworks (Structured Streaming)
- Ability to work in distributed, cross-functional global teams
- Exposure to GenAI tools (e.g., GitHub Copilot) for engineering productivity
- Strong understanding of secure coding practices and vulnerability remediation
- Proven ability to analyze logs, troubleshoot production issues, and optimize performance
- Demonstrated capability to design and deploy AI-powered solutions responsibly

Benefits

- Comprehensive benefits package
- Incentive and recognition programs
- Equity stock purchase
- 401k contribution (all benefits are subject to eligibility requirements)

AI Azure AI/ML Databricks Apache Spark Python CI/CD Data Engineering PySpark ETL SQL GitHub Actions Azure DevOps

United States

$72.8K - $130K / year

Job Closed
