Guidehouse

Solving big problems, building trust in society, and empowering our clients to shape the future.

Data Engineer – Healthcare Analytics

Data Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 2018 | H1B Sponsor

Location

United States

Posted

1 day ago

Salary

$77K - $129K / year

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | AWS | Azure | Cloud | ETL | Pug | Python | SQL

Job Description

  • Support the design and development of an enterprise Contract Performance Analytics platform for a large healthcare system
  • Focus on data architecture, ELT/ETL pipeline development, and integration of clinical, claims, and operational data into a scalable analytics ecosystem
  • Build a unified data system that enables insights across value-based care contracts (MSSP, Medicare Advantage, Commercial, and Employer Health Plans)
  • Integrate EHR cloud-based data processing and an enterprise data warehouse to support analytics and reporting
  • Design, develop, and maintain robust ETL/ELT pipelines to ingest, transform, and load healthcare data from diverse structured and unstructured sources
  • Develop pipelines to process data from CMS and payer files (CCLF, paid claims, PUG) as well as Epic (Caboodle, Clarity) data models and extracts
  • Build and optimize data models to support analytics, reporting, and operational use cases, including BI and downstream analytics consumption
  • Transform raw data into standardized, analytics-ready canonical data models and curated data marts
  • Build lakehouse/medallion architecture, data ingestion patterns, and orchestration frameworks
  • Implement and maintain CI/CD pipelines for data engineering workflows, including pipelines and scheduled jobs, using version control and automation tools
  • Collaborate with database administrators, analysts, and application teams to integrate data sources, design schemas, and support downstream data consumers
  • Ensure data quality, integrity, and accuracy through validation, monitoring, logging, and alerting
  • Support data migration, integration, and modernization initiatives, including legacy system upgrades, optimization of large-scale ETL pipelines, query performance, and cloud adoption efforts
  • Troubleshoot and resolve issues in development and production environments to maintain stable and reliable data pipelines
  • Document data flows, pipelines, test cases, and technical solutions to support knowledge sharing and compliance requirements
  • Stay current with emerging tools, technologies, and best practices in data engineering and cloud platforms

Job Requirements

  • US Citizenship or a Green Card is required
  • Bachelor’s degree in Computer Science, Data Analytics, Software Engineering, Information Systems, or related fields
  • A minimum of FIVE (5) years of experience in data engineering, ETL/ELT development, or data platform engineering in a healthcare setting
  • Experience working with healthcare data, including claims, clinical, payer, or population health datasets
  • Experience working with healthcare data interoperability standards (e.g., FHIR, HL7)
  • Proficiency in Python and SQL for data engineering and transformation workloads
  • Hands-on experience designing and building ETL/ELT pipelines and data ingestion frameworks
  • Experience working with modern cloud data platforms or ETL/ELT tools (e.g., Databricks, Azure Data Factory, AWS Glue)
  • Experience working with lakehouse or medallion-style architectures for analytics platforms
  • Strong knowledge of relational database design, data warehouses, and/or data lakes (e.g., star/snowflake schemas)
  • Experience working with relational and/or distributed data systems, including data modeling
  • Experience working in a cloud environment (AWS or Azure) supporting data solutions
  • Experience with CI/CD practices and version control tools (e.g., Git)
  • Experience using monitoring and logging tools to support data pipeline reliability
  • Experience working with PHI and healthcare data privacy/security requirements
  • Ability to work effectively in an Agile development environment
  • Strong analytical and troubleshooting skills, and the ability to communicate technical concepts clearly to clients, engineers, and business stakeholders
  • Ability to work independently in a fast-paced, client-facing environment

Benefits

  • Medical, Rx, Dental & Vision Insurance
  • Personal and Family Sick Time & Company Paid Holidays
  • Parental Leave and Adoption Assistance
  • 401(k) Retirement Plan
  • Basic Life & Supplemental Life
  • Health Savings Account, Dental/Vision & Dependent Care Flexible Spending Accounts
  • Short-Term & Long-Term Disability
  • Student Loan PayDown
  • Tuition Reimbursement, Personal Development & Learning Opportunities
  • Skills Development & Certifications
  • Employee Referral Program
  • Corporate Sponsored Events & Community Outreach
  • Emergency Back-Up Childcare Program
  • Mobility Stipend

More Data Engineer Jobs

Lead Data Architect

Henry Schein

Henry Schein started out as a Queens, New York-based pharmacy in 1932 and is now a Fortune 500 company specializing in healthcare products and solutions for hea

Data Engineer | 1 day ago

  • Define and implement a scalable, enterprise-wide data architecture aligned with business and technology goals
  • Develop a data strategy roadmap, ensuring long-term sustainability, scalability, and efficiency
  • Partner with executive leadership, product teams, and engineering to ensure data initiatives drive business value
  • Establish enterprise data governance, security, and compliance frameworks leveraging tools like Collibra or Alation
  • Oversee the design and evolution of data lakes, data warehouses, and cloud-based analytics platforms using Databricks, Snowflake, BigQuery, or Redshift
  • Lead the adoption of modern data architecture patterns, including event-driven architectures, real-time data streaming (Kafka, Pulsar), and AI-driven analytics
  • Provide guidance on database optimization, indexing, partitioning, and storage strategies for tools like PostgreSQL, MySQL, and NoSQL solutions like MongoDB or Cassandra
  • Evaluate emerging technologies, making recommendations for tools and platforms that enhance data capabilities
  • Direct ETL/ELT strategies, ensuring seamless data flow across systems with Python, Apache Airflow, dbt, or Informatica
  • Architect cloud-based solutions (AWS, Azure, or GCP) using services such as AWS Glue, Azure Synapse, and Google Cloud Dataflow to support analytics, AI, and operational use cases
  • Ensure API-first design for data integration using GraphQL, RESTful APIs, or event-driven architectures (Kafka, AWS Kinesis, Pub/Sub)
  • Define and oversee data quality, lineage, and cataloging efforts using Great Expectations, Monte Carlo, or DataHub
  • Develop policies for data privacy, access control, and encryption, ensuring compliance with GDPR, CCPA, HIPAA, or other relevant regulations
  • Implement enterprise-wide metadata management and data lineage tracking using Collibra, Alation, or Data Catalog solutions
  • Drive best practices for data security and compliance audits, leveraging IAM tools and cloud security solutions
  • Lead a team of data architects, engineers, and analysts, mentoring them on best practices
  • Act as a liaison between business and technical teams, translating business needs into scalable data solutions
  • Champion a culture of innovation, ensuring the data team is adopting cutting-edge methodologies
  • Conduct data architecture reviews, ensuring alignment with organizational standards

United States
$181.0K - $259.1K / year

Staff Data Architect

Jellyfish - Orthogonal Networks, Inc.

Jellyfish, also known as Orthogonal Networks, Inc., is self-described as a pioneer in engineering management platforms (EMPs). Founded with the mission to help

Data Engineer | 1 day ago

Staff Data Architect
Location: Remote - US
Full time
Department: Engineering
Compensation - $200K – $260K • Offers Equity

The posted range represents the possible base pay for this role. Actual compensation will depend on your experience, skills, role scope, and alignment with the position. Some postings may include more than one salary band to reflect different levels.

Jellyfish is the backbone for elite engineering organizations, and our data infrastructure needs to be as high-performing and insightful as the teams we serve. We are looking for a Staff/Lead Data Architect to help us design, automate, and scale the next generation of our Jellyfish data platform. You’ll be responsible for maturing our core data models, automating environment boundaries, and driving advanced observability and cost-attribution deeper into our data pipeline architecture. If you view manual data intervention as a technical debt to be solved and want to work in an environment where your architectural decisions directly impact how the world’s best engineering leaders measure their productivity, you’re the perfect fit.

What you’ll actually be doing:

  • Architectural Evolution & Blueprinting – You’ll own the blueprint for the next-generation Jellyfish data platform. You'll tackle our existing data footprint, refactoring pipelines and structures into highly efficient, scalable patterns (like Medallion-style schemas or unified semantic layers).
  • Automated Data Governance – You’ll design and automate strict, code-driven environment isolation boundaries. You'll ensure dev, staging, and production data catalogs (and their underlying cloud storage) never dangerously cohabitate, eliminating the risk of "fat-finger" data drops or PII leakage.
  • Orchestration & Compute Scaling – You’ll lead the modernization of our workflow orchestration and distributed compute engines. You’ll focus on slashing engine runtime overhead, eliminating API bottlenecks, and streamlining heavy parallelized or mapped data tasks.
  • Modern Integration Middleware – You'll partner with application teams to ensure our React frontends and backend services hit highly secure, cached API and Backend-for-Frontend (BFF) layers rather than querying raw data services directly, protecting our warehouses from concurrency spikes.
  • Proactive Data Observability & FinOps – You’ll build and maintain granular data-quality monitors and cost-allocation frameworks. You won't just track overall warehouse spend; you’ll implement systems to map execution cost and token usage directly down to the tenant, team, or user level.

You’re a great fit if:

  • Data Tooling Fluency – You have deep, production-level experience with Python, advanced SQL, and modern data stack essentials. You are deeply familiar with programmatic orchestrators (like Prefect, Dagster, or Airflow) and modern data validation engines (like Pydantic v2).
  • Catalog & Warehouse Practitioner – You have hands-on mastery of enterprise-scale data platforms and governance layers (e.g., Snowflake, Databricks Unity Catalog, BigQuery) and know exactly how to map environments to catalogs and data quality to schemas.
  • Automation Mindset – You look at a manual data backfill or a clicked-together database permission and immediately think about how to automate it via Infrastructure-as-Code (Terraform) or programmatic workflows.
  • Collaborative Systems Thinker – You don’t design in a vacuum. You are excellent at documenting data lineage, mentoring data engineers, and collaborating across DevOps and Product teams to align infrastructure with business goals.
  • Pragmatic Problem Solver – You know the difference between data quality stages and software development lifecycles. You know when a "perfect" distributed cluster is required and when a "good enough" cached view keeps the business moving.

Bonus Points:

  • You’ve survived (and thrived in) a rapidly scaling B2B SaaS startup handling massive multi-tenant data sets.
  • You have strong opinions on the future of Git-like data versioning and zero-copy cloning (e.g., Iceberg, Nessie).
  • You’ve managed complex cloud-billing attributions or scaled heavy LLM/vector-embedding data workloads and lived to tell the tale.

A list of job experiences and qualification requirements is great, but humility, a performance-driven attitude, and a team-player approach are most important to us. We love to have fun and win in the process. We only hire people who have a passion for building great companies in an environment where a sense of humor is a must. Occasional travel may be required. Applicants must be authorized to work for any employer in the US. We are unable to sponsor or take over sponsorship of an employment visa at this time.

Let’s talk about us! This is all about you, but you want to know a little about us. Jellyfish enables leaders to effectively build AI-integrated engineering teams, align engineering decisions with business initiatives and deliver the right software efficiently and on time. AI tools alone won’t transform your org; Jellyfish shows you what’s working, what’s not, and how to build high-performing teams that know how to use AI the right way.

United States
$200K - $260K / year

Senior Data Engineer

Jellyfish - Orthogonal Networks, Inc.

Jellyfish, also known as Orthogonal Networks, Inc., is self-described as a pioneer in engineering management platforms (EMPs). Founded with the mission to help

Data Engineer | 1 day ago

Senior Data Engineer
Location: Remote - US
Full time
Department: Engineering
Compensation - $190K – $240K • Offers Equity

The posted range represents the possible base pay for this role. Actual compensation will depend on your experience, skills, role scope, and alignment with the position. Some postings may include more than one salary band to reflect different levels.

Jellyfish is the backbone for elite engineering organizations, and our data pipelines need to be as high-performing and reliable as the teams we serve. We are looking for a Senior Data Engineer to help us build, automate, and execute the next generation of our Jellyfish data platform. Working closely with our Lead Data Architect, you’ll be responsible for implementing core data models, building production-grade CI/CD for data pipelines, and transforming raw engineering signals into highly optimized analytical layers. If you view broken pipelines and manual data patches as a technical debt to be solved and want to write code that directly impacts how the world’s best engineering leaders measure their output, you’re the perfect fit.

What you’ll actually be doing:

  • Pipeline Execution & Modeling – You’ll maintain our end-to-end data pipelines, writing clean, modular Python and SQL. You will help translate the architectural blueprint into reality, structuring data across our Medallion layers (Bronze > Silver > Gold) for maximum performance and reliability.
  • Orchestration Modernization – You’ll take the lead on migrating, optimizing, and maintaining our workflow orchestration engines. You’ll eliminate pipeline bottlenecks, leverage modern fast-paths (like Pydantic v2 and async database clients), and ensure distributed tasks scale seamlessly without hitting API limits.
  • Data CI/CD & Infrastructure Automation – You’ll build the "paved road" for data deployments. You’ll use Terraform to provision data resources and write automated tests to validate schemas and data quality before code ever hits our isolated staging or production catalogs.
  • API & Caching Integration – You’ll collaborate with product developers to expose data safely. You’ll help design and optimize the application backend tiers, backend-for-frontend (BFF) layers, and Redis caching structures that protect our core data warehouse from frontend concurrency spikes.
  • On-Call & Observability Triage – You’ll participate in the data platform's incident response rotation. You won't just patch a failing pipeline; you’ll build deep observability, refine alerts to reduce noise, and write programmatic fixes to ensure the issue never happens again.

You’re a great fit if:

  • Data Engineering Fluency – You have solid, production-level experience with Python, advanced SQL, and data transformation frameworks (like dbt or PySpark). You are highly comfortable working with programmatic orchestrators (such as Prefect, Dagster, or Airflow).
  • Warehouse & Catalog Practitioner – You know your way around enterprise data platforms (e.g., Snowflake, Databricks, BigQuery). You understand how to safely navigate environment boundaries, manage access keys securely, and write performant queries that don't balloon the cloud bill.
  • Automation Mindset – You look at a repeated data backfill, a manual schema fix, or an untracked data quality bug and immediately think about how to script a permanent, automated solution.
  • Collaborative Builder – You love working in a team. You write readable code, value thorough documentation and clear data lineage, and enjoy collaborating with application engineers to solve complex data delivery problems.
  • Pragmatic Problem Solver – You know when to write a perfectly optimized distributed processing job and when a simple, well-indexed database table or cached view is the smartest move to keep the business moving.

Bonus Points:

  • You’ve survived (and thrived in) a rapidly scaling startup handling complex, multi-tenant B2B SaaS data.
  • You have strong opinions on data quality testing frameworks (like Great Expectations or Soda) and data-observability patterns.
  • You’ve worked extensively with cloud cost allocation or tracked token-level spend for LLM/AI model integrations.

A list of job experiences and qualification requirements is great, but humility, a performance-driven attitude, and a team-player approach are most important to us. We love to have fun and win in the process. We only hire people who have a passion for building great companies in an environment where a sense of humor is a must. Occasional travel may be required. Applicants must be authorized to work for any employer in the US. We are unable to sponsor or take over sponsorship of an employment visa at this time.

Let’s talk about us! This is all about you, but you want to know a little about us. Jellyfish enables leaders to effectively build AI-integrated engineering teams, align engineering decisions with business initiatives and deliver the right software efficiently and on time. AI tools alone won’t transform your org; Jellyfish shows you what’s working, what’s not, and how to build high-performing teams that know how to use AI the right way.

United States
$190K - $240K / year

Data Engineer

Jellyfish - Orthogonal Networks, Inc.

Jellyfish, also known as Orthogonal Networks, Inc., is self-described as a pioneer in engineering management platforms (EMPs). Founded with the mission to help

Data Engineer | 1 day ago

Data Engineer
Location: Remote - US
Full time
Department: Engineering
Pay: $165K – $205K + Equity

The posted range represents the possible base pay for this role. Actual compensation will depend on your experience, skills, role scope, and alignment with the position. Some postings may include more than one salary band to reflect different levels.

Jellyfish is the backbone for elite engineering organizations, and our data pipelines need to be as high-performing and reliable as the teams we serve. We are looking for a Data Engineer to join our data platform team and help us execute, automate, and maintain the next generation of our Jellyfish data platform. In this role, you’ll be a core builder: fully autonomous, highly proficient, and responsible for translating architectural blueprints into clean, production-grade pipelines. If you view manual data patches and unmonitored workflows as bugs to be squashed and want to write code that directly impacts how the world’s best engineering leaders measure their output, you’re the perfect fit.

What you’ll actually be doing:

  • Core Pipeline Engineering – You’ll write the clean, modular Python and optimized SQL that drives our daily data transformations. You will be responsible for implementing our Medallion-layer data models (Bronze → Silver → Gold), ensuring high performance and data integrity.
  • Modern Orchestration & Tuning – You’ll manage and tune our workflow orchestration engines (like Prefect or Dagster). You’ll hunt down slow execution paths, optimize parameter serialization (e.g., leveraging Pydantic v2), and ensure our distributed processing jobs run efficiently.
  • Infrastructure as Code (IaC) – You won't just write data scripts; you'll own your infrastructure deployment. You will use Terraform to manage and provision data warehouse schemas, permissions, and tables across securely isolated staging and production catalogs.
  • API & Caching Integration – You’ll collaborate with product developers to expose data safely. You’ll help implement and maintain the application backend tiers, backend-for-frontend (BFF) layers, and Redis caching structures that protect our core data warehouse from frontend concurrency spikes.
  • On-Call & Pipeline Observability – You’ll participate in our data platform's incident response rotation. When a pipeline breaks, you won't just fix the data; you’ll refine the Datadog dashboards and alerts to ensure we catch the issue earlier next time.

You’re a great fit if:

  • Data Engineering Fluency – You have solid, hands-on production experience with Python, advanced SQL, and data transformation concepts. You are comfortable building and scheduling workflows using programmatic orchestrators (such as Prefect, Dagster, or Airflow).
  • Warehouse & Catalog Practitioner – You know your way around enterprise data platforms (e.g., Snowflake, Databricks, BigQuery). You understand how to navigate environment boundaries, manage access keys securely, and write performant queries.
  • Automation Mindset – You look at a repeated data backfill, a manual schema fix, or an untracked data quality bug and immediately think about how to script a permanent, automated solution.
  • Collaborative Builder – You love working in a team. You write readable code, value thorough documentation and clear data lineage, and enjoy collaborating with application engineers to solve complex data delivery problems.
  • Pragmatic Problem Solver – You know when to write a perfectly optimized distributed processing job and when a simple, well-indexed database table or cached view is the smartest move to keep the business moving.

Bonus Points:

  • You’ve worked in a rapidly scaling startup handling complex, multi-tenant B2B SaaS data.
  • You have experience with data quality testing frameworks (like Great Expectations or Soda).
  • You’ve interacted with cloud cost allocation tracking or token-level spend for LLM/AI model integrations.

A list of job experiences and qualification requirements is great, but humility, a performance-driven attitude, and a team-player approach are most important to us. We love to have fun and win in the process. We only hire people who have a passion for building great companies in an environment where a sense of humor is a must. Occasional travel may be required. Applicants must be authorized to work for any employer in the US. We are unable to sponsor or take over sponsorship of an employment visa at this time.

Let’s talk about us! This is all about you, but you want to know a little about us. Jellyfish enables leaders to effectively build AI-integrated engineering teams, align engineering decisions with business initiatives and deliver the right software efficiently and on time. AI tools alone won’t transform your org; Jellyfish shows you what’s working, what’s not, and how to build high-performing teams that know how to use AI the right way.

United States
$165K - $205K / year