Lead Data Engineer

Data Engineer | Other | Remote | Senior | Team 201-500 | H1B No Sponsor

Location

United States

Posted

153 days ago

Salary

0

Seniority

Senior

Bachelor Degree | 10 yrs exp | English | Azure | PySpark | SQL

Job Description

Awign Expert

  • Engage in a large-scale data transformation initiative
  • Analyze and modernize complex data transformation logic
  • Architect and implement end-to-end data ingestion frameworks
  • Define and analyze performance metrics
  • Perform advanced performance tuning for Databricks operations
  • Provide technical leadership and mentorship

Job Requirements

  • 10+ years of experience in data engineering
  • Extensive hands-on experience in Databricks and PySpark
  • Strong architectural understanding
  • Ability to lead and influence data engineering practices
  • Proficiency in T-SQL, Azure Data Factory, C# (basic knowledge)
  • Experience in performance metric analysis
  • Competence in data ingestion pipelines optimization
  • Strong motivation and ability to provide guidance

Benefits

  • Professional development opportunities

More Data Engineer Jobs

Director of Data Engineering

AcuityMD

Accelerate access to medical technology.

Data Engineer | 153 days ago
Other | Remote | Team 11-50 | Since 2019 | H1B Sponsor

  • Set and own the technical and organizational north star for data engineering at AcuityMD, defining how we build the world’s most accurate and actionable model of healthcare reality.
  • Lead the design, implementation, and evolution of our core data platform, powering every aspect of our product and agentic experiences, from analytics and workflows to AI-driven insights.
  • Translate company and product strategy into clear data investments, and clearly articulate the customer value of those investments to engineering, product, and leadership partners.
  • Go deep when it matters: drive architectural decisions, review critical designs, debug hard problems, and coach the team through complex technical tradeoffs.
  • Build data systems that are constantly improving, reliable, and trusted, supporting massive scale, complex healthcare data, and customer-facing use cases where correctness truly matters.
  • Partner closely with product, data science, and engineering leaders to ensure tight feedback loops between data generation, modeling, and real-world customer outcomes.
  • Lead, mentor, and grow a high-impact team of data engineers, data scientists, and domain experts, setting a high bar for methodological and technical rigor, ownership, and velocity.
  • Establish strong but pragmatic standards for data quality, testing, observability, lineage, and governance, without slowing the team down.
  • Shape the culture of the data organization: how we plan, how we ship, how we review work, and how we learn from mistakes.
  • Stay close to the evolving state of modern data engineering and analytics engineering, bringing in new ideas when they meaningfully raise our ceiling.

Massachusetts
$250K - $300K / year
Job Closed

Staff Data Engineer

Arine

Arine optimizes medication to ensure each patient is on the safest, most effective therapy for their unique health needs

Data Engineer | 153 days ago
Other | Remote | Team 11-50 | H1B No Sponsor

This description is a summary of our understanding of the job description. Click on 'Apply' button to find out more.

Role Description

As a key technical leader and team architect working in a fast-paced environment, you will drive the design, development, and optimization of scalable data ingestion pipelines within the Arine platform. Leveraging expert-level proficiency in Python and AWS, you will architect solutions that handle diverse file types and large-scale healthcare datasets. You will have a direct impact on building a reusable, configurable toolset that handles data needs for the entire company.

What You'll be Doing:

  • Act as the team architect by leading system design reviews, offering recommendations, conducting comprehensive peer reviews, and demonstrating expert-level proficiency in Python and AWS services.
  • Architect and implement scalable data ingestion pipelines that handle different file types into the Arine platform.
  • Develop reusable components that integrate into data pipelines to increase efficiency and reduce future implementation time.
  • Create configuration-driven, containerized toolsets that are easy to use and maintain across diverse engineering profiles.
  • Work collaboratively with cross-functional teams to meet data requirements through ETL components.
  • Design and maintain data transformation pipelines using dbt, including macros, incremental models, and dbt tests.
  • Implement incremental data ingestion strategies for large-scale healthcare datasets.
  • Build monitoring and alerting systems for data ingestion processes and overall pipeline health.
  • Apply software engineering best practices, including test-driven development and modular design, to data infrastructure.
  • Refactor and rebuild existing data ingestion processes to improve scalability and operational efficiency.
  • Work with containerization technologies (Docker, Kubernetes) to create portable and maintainable data solutions.
  • Identify and escalate inefficiencies within and across teams.
  • Provide technical guidance and mentorship to junior engineers, and promote best practices and coding standards.
  • Author and maintain high-quality technical documentation, and support junior engineers in doing the same.
  • Collaborate with the DE Manager to report on DE contractor performance issues.

Qualifications

  • 10+ years working in data engineering, with a focus on large-scale data ingestion and infrastructure.
  • Deep expertise in Python and modern data engineering tools.
  • A track record of building automated, production-grade ETL processes using Python and dbt SQL.
  • Strong understanding of ETL/ELT frameworks and distributed data processing.
  • Hands-on proficiency with modern data technologies and comfort leveraging AI coding assistants to accelerate development, improve code quality, and enhance productivity.
  • Skilled in data processing, validation, cleaning, and debugging.
  • Strong capability integrating APIs for seamless data exchange between systems.
  • Proven ability to handle and process varied file types and formats, including healthcare standards such as HL7, 834, 837, and NCPDP.
  • Demonstrated success integrating and consolidating data from diverse source systems into a unified repository, including EHR and claims systems, via both file-based and API integrations.
  • Comfort working with large-scale datasets (10 GB+).
  • Strong capability implementing incremental processing and change data capture (CDC) methodologies.
  • Extensive background designing scalable data architectures in AWS environments.
  • Solid grounding in software engineering principles, including test-driven development, loose coupling, single responsibility, and modular design.
  • Hands-on familiarity with containerization (Docker, Kubernetes) and building configuration-driven, maintainable systems.
  • Proven ability to build tools and systems that diverse engineering profiles can operate through configuration rather than code changes.
  • A passion for building new data infrastructure and continuously improving existing systems with robustness, maintainability, and operational excellence.
  • Familiarity with healthcare data and regulatory environments (HIPAA) is a plus.
  • Strong collaboration skills, with comfort partnering across technical and non-technical stakeholders.
  • Excellent written and verbal communication, with the ability to explain technical infrastructure concepts to diverse audiences.

Requirements

  • Ability to pass a background check.
  • Must live in and be eligible to work in the United States.

Benefits

  • Dynamic role with the opportunity to contribute to the company's growth and shape its future.
  • Unparalleled learning and growth prospects, collaborating closely with experienced Clinicians, Engineers, Software Architects, and Digital Health Entrepreneurs.
  • Salary range for this position is: $170,000-185,000/year.

Remote Work Requirements

  • An established private work area that ensures information privacy.
  • A stable high-speed internet connection for remote work.
  • This role is remote, but you will be required to come to on-site meetings multiple times per year.

United States
$170K - $185K / year
Job Closed

Data Manager

Health Services Advisory Group

HSAG is an EEO Employer of Veterans protected under Section 4212. If you have special needs and require assistance completing our employment application process, please feel free to contact us. EOE M/F/Veteran/Disability. We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

Data Engineer | 153 days ago

This description is a summary of our understanding of the job description. Click on 'Apply' button to find out more.

Role Description

HSAG is seeking a Data Manager I to join the Data Science & Advanced Analytics (DSAA) division, focusing on managing claims and encounter data within a Microsoft SQL Server environment. This role emphasizes data integrity, compliance, and quality improvement in healthcare analytics. The Data Manager I position is a key contributor to healthcare data management projects, with a focus on claims and encounter data.

Responsibilities include:

  • Supporting extraction, transformation, and loading (ETL) processes
  • Data validation and reporting in a Microsoft SQL Server environment
  • Developing and maintaining data dictionaries
  • Conducting audits
  • Collaborating with internal teams to ensure accurate data collection and reporting
  • Training team members on data management best practices
  • Ensuring compliance with health data regulations

The Data Manager I will work with a wide array of healthcare data types, including but not limited to:

  • Survey
  • Case review
  • Medical and prescription drug claims and encounters
  • Eligibility
  • Demographic
  • Clinical
  • Electronic health record
  • Registry
  • Vital statistics
  • Operational

Qualifications

  • Bachelor’s degree in computer science, computer information systems, or a quantitative discipline
  • At least three (3) years of experience with ETL processes in a Microsoft SQL Server environment
  • At least three (3) years of data warehousing experience
  • Excellent SQL programming skills
  • Experience gathering and refining requirements, interviewing business users to understand and document data requirements
  • Strong attention to detail, documenting business and technical requirements based on user interviews
  • Database testing experience
  • Experience with Microsoft SQL Server

Requirements

  • Experience working with healthcare data
  • Proficient in Microsoft Word, Excel, and Access
  • Excellent verbal and written communication skills
  • Ability to handle several projects simultaneously and work with multiple teams

Benefits

  • Formal internal training on an assortment of healthcare-related topics during the first year

United States + 171 more
All locations: United States | Canada | Brazil | Colombia | Argentina | Chile | Venezuela | Bolivia | Ecuador | French Guiana | Guyana | Paraguay | Peru | Suriname | Uruguay | Mexico | Costa Rica | El Salvador | Guatemala | Honduras | Nicaragua | Panama | Dominican Republic | Puerto Rico | Bahamas | Guadeloupe | Haiti | Jamaica | Martinique | Montserrat | United Kingdom | Germany | France | Estonia | Portugal | Hungary | Poland | Ukraine | Romania | Bulgaria | Czechia | Slovakia | Belarus | Moldova | Sweden | Greece | Belgium | Italy | Ireland | Switzerland | Netherlands | Finland | Malta | Denmark | Lithuania | Croatia | Spain | Austria | Bosnia and Herzegovina | Iceland | Luxembourg | North Macedonia | Montenegro | Norway | Serbia | Slovenia | Albania | Cyprus | Latvia | Monaco | South Africa | Egypt | Algeria | Angola | Benin | Botswana | Burkina Faso | Burundi | Cameroon | Cabo Verde | Central African Republic | Chad | Congo | Côte d'Ivoire | Democratic Republic of the Congo | Equatorial Guinea | Eritrea | Ethiopia | Gabon | Gambia | Ghana | Guinea | Guinea-Bissau | Kenya | Lesotho | Liberia | Libya | Madagascar | Malawi | Mali | Mauritania | Mauritius | Mayotte | Morocco | Mozambique | Namibia | Niger | Nigeria | Réunion | Rwanda | Senegal | Seychelles | Sierra Leone | Somalia | Sudan | Eswatini | Tanzania | Togo | Tunisia | Uganda | Zambia | Zimbabwe | Georgia | Turkey | Israel | United Arab Emirates | Armenia | Azerbaijan | Bahrain | Iraq | Jordan | Kuwait | Lebanon | Oman | Qatar | Saudi Arabia | Palestine | Yemen | India | Japan | Philippines | Pakistan | Thailand | Singapore | Vietnam | Taiwan | Indonesia | Cambodia | Laos | Malaysia | Myanmar | South Korea | China | Afghanistan | Bangladesh | Bhutan | Kazakhstan | Kyrgyzstan | Maldives | Mongolia | Nepal | Sri Lanka | Tajikistan | Turkmenistan | Uzbekistan | Australia | Papua New Guinea | Kiribati | Palau | French Polynesia | Tuvalu | New Zealand
$75K - $90K / year
Job Closed

Senior Data Engineer

TRACTIAN

Artificial Intelligence Quarterbacking Your Maintenance

Data Engineer | 153 days ago
Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Develop and maintain scalable data pipelines and ETL processes.
  • Design, implement, and optimize existing data extraction and loading processes with appropriate data engineering design patterns.
  • Lead data engineering reliability and observability, making the analytics team aware of problems in the data flow processes before they become issues.
  • Collaborate with backend and analytics engineers in a holistic data engineering process, loading data in accordance with the technical requirements.
  • Ensure data quality and consistency across various sources by implementing data validation and cleansing techniques.
  • Work with cloud-based data warehouses and analytics platforms to manage and store large datasets.
  • Monitor and troubleshoot data pipelines to ensure reliable and timely delivery of data.
  • Document data processes, workflows, and best practices to enhance team knowledge and efficiency.
  • Create dashboards as data products as internal

Brazil