C the Signs is a cancer prediction system that identifies patients at risk of cancer at the earliest, most curable stage.
AI Data Engineer
Location
New Hampshire | New Jersey | New York | Massachusetts | Rhode Island
Posted
29 days ago
Salary
0
Seniority
Senior
Job Description
C the Signs
- Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning.
- Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets.
- Implement robust data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets.
- Implement robust data cleaning, validation, and transformation processes to ensure data quality and integrity.
- Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models.
- Work with the team to identify and acquire new data sources, ensuring compliance with relevant healthcare regulations (e.g., HIPAA).
- Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability.
- Document data engineering processes, data models, and data dictionaries.
- Stay up-to-date with the latest advancements in data engineering, big data technologies, and machine learning.
Job Requirements
Required:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer, with a focus on big data technologies.
- Strong proficiency in programming languages such as Python, Scala, or Java.
- Extensive experience with data warehousing, ETL processes, and data modeling.
- Experience with major cloud providers (e.g., AWS, GCP, Azure) and their data storage and processing services.
- Hands-on experience with big data frameworks like Apache Spark for distributed processing.
- Excellent problem-solving skills and the ability to work independently and as part of a team.
- Strong communication and interpersonal skills.
Preferred:
- Master's degree in a related field.
- Experience with healthcare data and a good understanding of healthcare data standards (e.g., FHIR, HL7).
- Familiarity with machine learning concepts and LLM fine-tuning processes.
- Experience with data orchestration tools (e.g., Apache Airflow).
Work Authorization:
- Must be a US citizen, a Green Card holder, or currently in the US with a valid H-1B visa.
Benefits
Why Join Us?
Joining C the Signs is not just about building AI; it’s about shaping the future of healthcare. If you are a technical leader with an unshakable belief in the power of AI to save lives and the ability to make it happen at scale, this is your opportunity to create a tangible, global impact.
Benefits:
- Competitive salary and benefits package.
- Flexible working arrangements (remote or hybrid options available).
- The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
- Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
- Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.
More Data Engineer Jobs
- Data pipeline design and implementation: Build and optimize ETL/ELT processes in a GCP environment, using BigQuery and orchestration tools, ensuring efficient, scalable, and reliable flows.
- Data migration and transformation: Adapt and migrate existing logic from environments such as Talend/MySQL to native SQL in BigQuery, improving performance and maintainability.
- Cloud data architecture: Design cloud processing and storage solutions (primarily GCP; AWS/Azure a plus), ensuring scalability, security, and availability.
- Data quality and integrity: Apply best practices for validation, cleaning, and monitoring to ensure accurate, consistent, and reliable data.
- Technological innovation: Explore and adopt new tools, frameworks, and methodologies that optimize data ingestion, transformation, and analysis.
- Documentation and best practices: Document pipelines, models, and processes so they can be understood, maintained, and reused within the team.
- Systems integration: Connect different data sources (internal and external) through APIs and other mechanisms, ensuring efficient and interoperable flows.
- Collaboration with analytics and BI teams: Provide structured, high-quality data for use in dashboards, analyses, and advanced models.
Role Description
A purposeful career is what you will find at Q-Centrix. Making a meaningful impact is what we do every day. Quality data abstraction has become critical in identifying positive patient outcomes as the healthcare industry shifts to value-based care. In fact, medical record abstraction is the preferred data collection method for clinical research, quality improvement, performance measurement, disease surveillance, and other secondary data uses. Our dedicated data abstractors, known at Q-Centrix as Senior Clinical Data Specialists (SCDS), use Q-Centrix proprietary technology to contribute to healthcare’s most exciting advancements.
The Data Abstraction Specialist (SCDS - Senior Clinical Data Specialist) delivers quality solutions to hospital partners across the country. They approach each hospital engagement as an opportunity to apply their clinical expertise with precision to advance patient outcomes and research. Find your purpose by joining the Q-Centrix team to make a meaningful impact!
Our MBSAQIP abstractors work with multiple hospital partners to:
- Apply specialized, clinical knowledge of hospital partners: categorize, code, summarize, interpret, and calculate registry/case information from nuanced, patient medical records.
- Ensure quality submission of all data in specified registries or measure data repositories, maintaining a high accuracy threshold.
- Prioritize, organize, and meet tight deadlines for multiple concurrent tasks and team requests; use tact and judgment to manage expectations, flag obstacles, and propose solutions in a timely manner.
- Navigate new technical systems: electronic medical records (EMR) and registry/case entry tools; use team resources to troubleshoot technical issues with systems and applications with a focus on solutions.
- Contribute to team best practices, data dictionaries, abstraction guidelines, and other business rule documents; identify process improvement opportunities to help streamline tasks and processes.
- Keep up to date on mandated regulatory/publicly reported data requirements as specified by federal, state, payer, and other agencies.
- Perform any other additional responsibilities as assigned.
Qualifications
- Must have direct MBSAQIP clinical data abstraction experience within the past year.
- Have an active MBSCR certification.
- Exposure to multiple patient medical record systems (EMRs) and clinical databases.
- Intermediate proficiency with MS Office (Microsoft Excel).
- All applicants for employment with Q-Centrix must have legal authorization to work in the United States now or in the future without sponsorship.
Preferred Qualifications
- Direct clinical experience.
- RN, LPN, RT, or RCIS credentials.
Skills & Abilities
- Strong analytical and critical thinking skills to approach problems in a systematic method using the ability to synthesize data and suggest recommendations.
- Demonstrates high standards for accuracy and attention to detail.
- Demonstrates technical savvy and a strong desire to learn new systems and technology.
- Thrives working independently and takes ownership of projects/patient records.
- Consistently and clearly communicates, adjusting style and tone as needed to collaborate with hospital partners, peers, team leads, and others.
- Demonstrates strong self-organizational and time management skills to manage multiple accounts, adjusting as needed to shifting timelines and priorities.
- Adapts to changes in hospital partner timelines, requirements, and project assignments.
- Maintains high responsibility in keeping PHI secure and confidential.
Benefits
- At Q-Centrix, our purpose (safer, consistent, quality healthcare for all) drives everything we do.
- We provide a compelling, equitable rewards package comprised of:
  - An inclusive culture
  - Flexible work environment
  - Learning and development opportunities
  - Competitive pay that rewards high performance
  - Robust benefits that support health and financial wellness
- The target wage range for this role is $27.00 - $33.00 per hour.
- Individual wage rates within this range are based on multiple factors including but not limited to skills, experiences, licensure, certifications, and other business and organizational considerations.
- Part-time team members enjoy a fully remote work environment with a flexible schedule.
Commitment to Diversity, Equity, Inclusion and Belonging
At Q-Centrix (An MRO Company), we hire people who love learning, value innovation, and believe in our purpose of safer, consistent, quality health care for all. We applaud qualified applicants who are accountable and committed to producing quality work. As an Equal Opportunity Employer, we support and value diversity, dignity, and respect in our work environment, and are committed to creating an inclusive environment in which everyone can thrive. We employ people based on the needs of the business and the job, and their individual professional qualifications. Here’s what does not impact our employment decisions: race, religious creed, religion, color, sex, sexual orientation, pregnancy, parental status, genetic information, gender, gender identity, gender expression, age, national origin, ancestry, citizenship, protected veteran or disability status, health, marital, civil union or domestic partnership status, or any status or characteristic protected by the laws or regulations in locations where we operate. If you are an individual with a qualified disability and you need an accommodation during the interview process, please reach out to your recruiter.
Data Engineer II
Core & Main
Core & Main is a U.S.-based distributor of water, sewer, and fire protection products. It delivers these services to private water companies, municipalities, and professional contr
Title: Data Engineer II
Location: Remote - Ohio
Job Description:
Based in St. Louis, Core & Main is a leader in advancing reliable infrastructure™ with local service, nationwide®. As a specialty distributor with a focus on water, wastewater, storm drainage and fire protection products and related services, Core & Main provides solutions to municipalities, private water companies and professional contractors across municipal, non-residential and residential end markets, nationwide. With over 370 locations across the U.S., the company provides its customers local expertise backed by a national supply chain. Core & Main’s 5,700 associates are committed to helping their communities thrive with safe and reliable infrastructure.
Job Summary
Design, build, test, and maintain reliable data pipelines and data solutions that support analytics, reporting, and operational use cases. This role owns end-to-end data pipelines or data flows within defined domains and works independently on moderately complex data engineering tasks. Partner closely with senior engineers, architects, and stakeholders to deliver high-quality, secure, and well-documented data solutions that meet business and technical requirements.
Major Tasks, Responsibilities and Key Accountabilities
- Design, develop, and maintain production-ready data pipelines, data transformations, and data models.
- Collaborate with business and technical stakeholders to understand data requirements and translate them into scalable solutions.
- Implement and maintain data warehouse, data lake, or lakehouse solutions aligned with architectural standards.
- Perform unit testing and support QA, regression, and user acceptance testing for data solutions.
- Troubleshoot, debug, and resolve data pipeline failures, performance issues, and data quality defects.
- Contribute to technical documentation, including design artifacts, data mappings, and operational runbooks.
- Participate in peer code reviews and apply feedback to improve code quality and maintainability.
- Support enhancements, patches, and upgrades to existing data platforms and tooling.
Preferred Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or related field.
- 4 years of hands-on development experience in SQL and/or Python for data warehouse management, data integration, and data lake management.
- Deep working knowledge in SQL development using T-SQL code to design, implement, and optimize complex database objects, such as tables, views, stored procedures, indexes, and functions.
- Experience working with Azure data architecture, including a solid understanding of tools for building data pipelines on cloud-based data platforms, such as Delta Lakehouse Medallion architecture and data warehousing solutions.
- Exposure to modern Spark-based data platforms like Databricks or Microsoft Fabric for data engineering tasks, including leveraging their capabilities for scalable data processing, analytics, and machine learning workflows in a cloud-based environment.
- Understanding of ELT vs ETL and how to build efficient data pipelines with modern Change Data Capture processes.
- Hands-on experience with CI/CD pipelines in Azure DevOps and understanding of Agile development methodologies.
- Familiarity with common data mapping and transformation techniques for Dynamics 365 Data Entities and Data Management Framework for the Finance and Operations modules.
- Familiarity with Power BI and its integration with Microsoft Fabric for end-to-end analytics.
- Strong communication skills with the ability to translate complex technical concepts into business-friendly language.
Career Level Dimensions
Typical Training/Experience
- Typically requires BS/BA in a related discipline. Generally, 3-5 years of experience in a related field; certification is required in some areas OR MS/MA and generally 2+ years of experience in a related field.
Problem Complexity
- Applies established problem-solving skills to moderately complex situations. Identifies root causes for common and recurring data issues and escalates more complex or ambiguous problems appropriately. Troubleshoots and resolves issues within defined data pipelines, systems, or domains, using documented patterns and guidance from senior team members.
Autonomy
- Performs assignments with moderate independence, operating within established practices and architectural standards.
- Determines appropriate approaches to solutions for well-defined problems.
- Receives regular technical guidance on complex problems, design decisions, or unfamiliar technologies.
Collaboration
- Works closely with other engineers, analysts, and business partners to deliver reliable, high-quality data solutions.
- Actively participates in knowledge sharing, documentation, and peer reviews.
- May provide informal guidance to less experienced engineers but does not have formal leadership or people management responsibilities.
Core & Main is an Equal Employment Opportunity employer. Employment at Core & Main is based solely on a person’s merit and qualifications directly related to professional competence. Core & Main does not discriminate against any employee or applicant on the basis of race, creed, color, religion, national origin, nationality, ancestry, age, disability, veteran status, pregnancy or related condition (including breastfeeding), affectional or sexual orientation, gender identity or expression, marital status, status with regard to public assistance, citizenship, or any other basis protected by law. None of the questions in this application are intended to elicit information regarding any protected characteristics, nor imply any limitation, illegal preferences or discrimination based upon non-job-related information or protected characteristics.
Data Engineer
G2i
G2i serves enterprises with remote staff augmentation for developer teams. The company provides talented web and mobile developers to help companies grow and re
- Own the data stack end-to-end: ingestion → transformation → modeling → serving → monitoring
- Build and maintain ETL/ELT pipelines from APIs, webhooks, and operational systems
- Design resilient data models that handle evolving and imperfect source systems
- Implement monitoring and alerting for data quality, freshness, and pipeline failures
- Ensure high reliability and observability across the data layer
- Lead improvements, migrations, and infrastructure decisions
- Collaborate with engineering leadership on architecture, including modern data access patterns




