Job Closed
This listing is no longer active.
Patient access at the speed of flight
Data Engineering/AI Internship
Location
United States
Posted
96 days ago
Salary
0
Seniority
Entry Level
Job Description
CareMetx, LLC
- Assist with building and maintaining data pipelines to ensure smooth data integration and processing.
- Support the development and optimization of ETL (Extract, Transform, Load) workflows.
- Help in cleaning, organizing, and validating large datasets for analysis and reporting.
- Collaborate with data engineers to troubleshoot and resolve data-related issues.
- Work with various data storage and processing tools such as databases, cloud platforms, and scripting languages.
- Participate in team meetings to discuss data solutions and contribute ideas for process improvement.
- Perform other ad-hoc data engineering tasks and projects as assigned by senior team members.
Job Requirements
- Currently pursuing a degree in Computer Science, Data Engineering, Information Systems, or a related field.
- Familiarity with SQL and at least one programming language (e.g., Python, Java).
- Basic knowledge of data processing concepts and tools.
- Strong analytical and problem-solving skills.
- Excellent written and oral communication skills.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) or data visualization tools is a plus.
Benefits
- Remote work
- Professional development opportunities
More Data Engineer Jobs
Data Warehouse Engineer
Aspirion
Revenue Cycle Management Services | Advanced Technology, Top Talent, Optimal Revenue Results
- Collaborate with our Business Intelligence team to understand business reporting requirements and define denormalized schema for the warehouse
- Collaborate with our DBA team on capacity planning and deploying infrastructure resources
- Build and maintain ETL solutions that centralize data from disparate sources into a centralized data warehouse
- Collaborate with DBA, Security and Infrastructure teams to establish secure connections between data tiers
- Build and maintain ETL solutions that denormalize OLTP source data into reportable OLAP data
- Collaborate with stakeholders to RCA and remediate data and reporting defects and inconsistencies
- Establish monitoring and alerting solutions to support the data warehouse and related ETL jobs
- Implement robust data infrastructure in AWS, using Spark with Scala
- Evolve our core data pipelines to efficiently scale for our massive growth
- Store data in optimal engines and formats
- Collaborate with our cross-functional teams to design data solutions that meet business needs
- Build out fault-tolerant batch and streaming pipelines
- Leverage and optimize AWS resources while designing for scale
- Collaborate closely with our Data Science and Product teams
- Design and maintain a scalable identity resolution platform
- Build pipelines and services to ingest, normalize, link, and version identity data across multiple sources
- Ensure deterministic and probabilistic matching logic that is transparent, auditable, and measurable
- Partner with product and analytics teams to expose identity data through reliable, well-documented APIs and datasets
- Build and operate batch and streaming pipelines using modern data stack tools
- Create clear documentation, standards, and runbooks for identity and governance systems
- Own data governance foundations including data lineage, quality checks, schema enforcement, and access controls
- Implement privacy-by-design principles (PII handling, consent enforcement, retention policies)
- Collaborate with legal, privacy, and security teams to operationalize regulatory requirements (e.g., GDPR, CCPA)
- Establish monitoring and alerting for data quality, freshness, and integrity
- Model datasets and schemas for consistency and easy access
- Design and implement data transformations and data marts
- Integrate third-party systems and external data sources into the data warehouse
- Build data flows for fetching, aggregation, and data modeling using batch pipelines



