Veridix AI is the technology, data, and AI arm of the Emmes Group, a leading full-service contract research organization (CRO) with over 47 years of experience in supporting clinical research across more than 70 countries. With industry-leading capabilities in cell and gene therapy, vaccines, infectious diseases, and ophthalmology, Emmes is one of the top clinical service providers to the U.S. government and is rapidly expanding its presence in biopharma. Veridix AI develops advanced eClinical solutions, powering clinical trials through patient data collection, randomization, biospecimen tracking, and data quality monitoring. Our cutting-edge AI innovations, including Generative AI (GenAI) capabilities, are transforming clinical trial timelines by streamlining processes from document authoring to automating study builds. Our “Character Achieves Results” culture is driven by five key values that guide our actions in the way we conduct research and distinguish us as an organization: Integrity, Agility, Passion for Excellence, Collaborative Partnerships, and Intellectual Curiosity. If you share our motivations and passion in research, come join us!
AI Data Engineer
Location
United States
Posted
1 day ago
Seniority
Mid Level
Job Description
AI Data Engineer
Emmes Group
Role Description
The Data Engineer will have a strong background in data engineering and extensive experience with AWS Cloud services. As a Data Engineer, they are responsible for designing, building, and maintaining scalable data pipelines and infrastructure to support our data analytics and business intelligence initiatives.
- Design, develop, and maintain robust data pipelines and ETL processes to ingest, transform, and store data from various sources.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements, design data models, and deliver solutions that meet business needs.
- Automate data workflows and implement monitoring and logging to ensure the health and performance of the data infrastructure.
- Conduct data profiling, cleansing, and validation to ensure high data quality standards.
- Optimize data storage and retrieval performance, ensuring data quality and integrity.
- Implement and manage data architecture on AWS, ensuring scalability, reliability, and security.
- Stay up to date with the latest trends and best practices in data engineering and AWS cloud technologies.
Qualifications
- Bachelor’s or master’s degree in Computer Science, Information Technology, or a related field.
- 3 or more years of related professional experience.
- Experience in data engineering with a strong focus on AWS cloud services.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, Redshift).
- Experience with AWS services such as S3, Lambda, Glue, EMR, Kinesis, and Redshift.
- Strong programming skills in languages such as Python, Java, or Scala.
- Knowledge of data modeling, ETL concepts, and data warehousing.
- Familiarity with version control systems (e.g., Git) and CI/CD pipelines.
- Excellent problem-solving skills and attention to detail.
- Knowledge of machine learning frameworks and data science workflows.
- Familiarity with data visualization tools (e.g., QuickSight, Qlik).
- Familiarity with NoSQL databases (e.g., DynamoDB, MongoDB).
- Strong collaboration skills with cross-functional teams to establish best design and user flows for applications.
- Strong multitasking, problem solving, and organizational skills.
- Proven ability to work independently and in a team environment.
- Satisfactory background check required.
Benefits
- Flexible Approved Time Off
- Tuition Reimbursement
- 401k Retirement Plan
- Work From Home Anywhere in the US
- Maternal/Paternal Leave
- Casual Dress Code & Work Environment
More Data Engineer Jobs
• Design and manage scalable data platforms powering real-time analytics, batch processing, and exploratory analysis, using AI-assisted development as the default workflow, not an afterthought.
• Own the full data lifecycle: ingestion, ETL, storage, and serving, building and iterating on pipelines with AI pair-programming tools (Claude Code) to accelerate delivery.
• Ingest data from diverse sources via both streaming (Kafka, Kinesis) and batch pipelines, unifying them into a consistent, queryable platform.
• Architect medallion-layer data models (Bronze/Silver/Gold) in Databricks, ensuring business needs are met with clean, well-documented schemas.
• Automate, test, and harden data workflows, writing AI-augmented tests, data quality checks, and CI/CD pipelines that catch issues before production.
• Build and maintain AI-ready tooling: craft prompts, custom slash commands, and agent workflows that let the entire team scaffold pipelines, generate documentation, and validate data quality faster.
• Build and improve Databricks Genie chatbots that allow non-technical users to query data using natural language.
• Collaborate with product analytics and data science, applying engineering rigor to messy, unstructured data and transforming it into reliable, production-ready datasets.
• Contribute to infrastructure-as-code (Terraform/Atmos) for provisioning and managing cloud data infrastructure.
Lead Data Engineer – Databricks
OZ
A leading consulting company whose Intelligent Automation expertise accelerates the way you do business.
• Serve as the technical lead for the project, owning solution design decisions, guiding implementation standards, and mentoring other engineers through coaching, reviews, and knowledge sharing.
• Design, develop, and maintain complex ETL pipelines using Databricks, ensuring scalable, high-performance data integration across multiple source systems.
• Implement and optimize medallion architecture within Databricks, establishing clear data zones (raw, curated, trusted) to support governed, enterprise-wide reporting.
• Develop and refine dimensional data models that enable unified, analytics-ready views of business domains and support automated dashboarding and KPI frameworks.
• Collaborate closely with cross-functional teams (data stewards, IT, business stakeholders) to translate operational requirements into technical solutions, proactively clarifying dependencies and driving alignment.
• Contribute to architectural decisions, leveraging your expertise to recommend best practices, challenge assumptions, and ensure data platform durability and flexibility.
• Identify and address integration challenges, data quality issues, and process bottlenecks early, providing actionable insights and thoughtfully pushing back when project risks or inefficiencies arise.
• Support knowledge transfer and documentation, empowering colleagues and clients to maintain and evolve data solutions independently.
Sr. Director, Data & Integrations Engineering
Cedar
Cedar is the AI-powered healthcare financial experience platform, built for the rising cost and complexity of healthcare payments. We help millions of people every year understand and resolve their medical bills with clarity and compassion, while helping healthcare organizations operate more efficiently. We’re combining AI, smart design, and empathy to fix one of healthcare’s most urgent crises.
Our healthcare system is the leading cause of personal bankruptcy in the U.S. Every year, over 50 million Americans suffer adverse financial consequences as a result of seeking care, from lower credit scores to garnished wages. The challenge is only getting worse, as high deductible health plans are the fastest growing plan design in the U.S. Cedar’s mission is to leverage data science, smart product design and personalization to make healthcare more affordable and accessible. Today, healthcare providers still engage with their consumers in a “one-size-fits-all” approach, and Cedar is excited to leverage consumer best practices to deliver a superior experience.
The Role
Cedar is at a meaningful inflection point. We're building a new enterprise data foundation to power the next generation of AI-driven products and client-facing analytics while simultaneously expanding our integration footprint to support new product lines and go-to-market motions. The complexity is real, the stakes are high, and the environment is dynamic. Transparently, this role comes with challenges to navigate. These teams sit at the intersection of client delivery, platform evolution, and operational reliability, which means ad-hoc demands land here regularly and the roadmap can evolve mid-flight. You're not walking into a well-oiled machine; you're walking into an opportunity to build one. A big part of this role is establishing the processes, relationships, and stakeholder alignment that reduce that thrash over time, working closely with internal partners and customers to create more predictability for your teams. The right person for this role doesn't just tolerate the current environment. They see it clearly, stay steady within it, and consistently work to improve it.
As Sr. Director of Data & Integrations Engineering, you'll report directly to the CTO and head an organization of ~35 engineers across three teams (Data Engineering, Integration Engineering, and Integration Support Engineering) through three experienced engineering managers. You'll also be backed by Principal Engineers who own the deep technical vision and architecture for the org. Your job isn't to be the smartest technical person in the room; your principals have that covered. Your job is to make the whole machine run: deliver with consistency, develop your people, reduce the organizational thrash that slows everyone down, and ensure the operational quality that our clients, partners, and internal teams depend on.
Who You'll Lead
Integration Engineering is accountable for bi-directional data integrations between Cedar and healthcare provider EHR systems: Epic, Cerner, Athena, and more. The team services healthcare provider clients, delivers new implementations, enhances live integrations, and is building scalable integration frameworks to reduce go-live timelines and enable an expanding product surface. They are the technical engine behind Cedar's ability to grow.
Integration Support Engineering is Cedar's integration reliability team: on-call response, incident management, root cause analysis, and escalation resolution. They own the SLAs that matter most to our clients and are evolving from a reactive response function toward a proactive, systems-thinking discipline, driving down incident volume through better observability, tooling, and upstream framework improvements.
Data Engineering builds and evolves Cedar's enterprise data platform: ELT/ETL pipelines, data governance, Medallion architecture, and the real-time data foundations that power Cedar's AI-driven products. The team is mid-transformation, migrating from legacy infrastructure to a modern stack while simultaneously building net-new foundations for a next-generation analytics and intelligence platform that will drive all future product offerings. The stakes are high and the work is consequential.
What You'll Own
Delivery predictability. Establish and maintain the processes, rituals, and accountability structures that keep three parallel workstreams (greenfield platform work, client-facing implementations, and live production operations) shipping on time and with high quality. You know how to right-size agile practices for the work at hand. Process pays rent; you believe that.
People and manager development. Grow an effective layer of engineering managers and help them build high-performing, engaged teams. This is your primary lever. Your principals own the technical roadmap; your teams' health, growth, and culture live with you. You have a track record of developing first-time and experienced managers alike.
Operational excellence. Own the reliability and SLA performance of Cedar's integration systems. Reduce MTTI, MTTR, and systemic incident rates. Build a culture where operational quality (observability, on-call discipline, postmortem rigor) is a first-class engineering value, not an afterthought.
Data platform evolution. Partner with the Principal Data Architect to keep Cedar's data platform transition on track: migration from legacy systems, adoption of new architecture across the organization, and ensuring that downstream consumers (product engineering, data science, analytics, and AI/ML) have trusted, accessible data when they need it.
Integration scalability. Reduce the time and cost of onboarding new clients and enabling new product lines. Advance the integration frameworks and patterns that let your teams scale their impact without scaling headcount proportionally.
Organizational clarity and thrash reduction. Build the structures (intake processes, prioritization frameworks, escalation paths) that protect your teams from the constant pull of ad-hoc requests and shifting priorities. You'll also coach your managers to run their teams with increasing autonomy: to hold a line, uphold a roadmap, and absorb pressure without needing escalation assistance. You'll build an org that operates with discipline even when the environment around it is noisy.
Cross-functional partnership. Build effective relationships with Product, Data Science, Commercial, and Delivery Engineering. Represent your teams' capacity and priorities clearly; negotiate trade-offs constructively; keep issues surfaced and resolved before they become crises. You're a peer other leaders want to work with.
What You Bring
- 12+ years in software engineering or technical implementation, with meaningful time in data-intensive, integration-heavy, or platform domains.
- 8+ years in engineering leadership, including 4+ years managing managers, ideally with direct experience overseeing an org of 30+ engineers.
- Solid technical grounding in integration systems and data. You engage credibly with principals and managers on architecture trade-offs, operational risk, and build-vs-buy decisions. You don't need to be the deepest expert, but you need to know when to push back and when to trust.
- High tolerance for ambiguity, with a demonstrated talent for reducing it. Cedar moves fast and the environment is genuinely noisy. You don't need a clean brief to start moving. You've thrived in orgs where priorities shift, asks are imperfect, and the path forward requires synthesis rather than instruction. You create structure from chaos without waiting for someone else to hand it to you.
- Operational instincts. You've owned SLAs, on-call programs, incident management, or client-facing reliability commitments, and you've built the systems and culture to sustain them.
- Effective cross-functional stakeholder management, including executive presence. You can translate between engineering realities and business priorities without losing either side. You've navigated trade-off conversations with product, finance, and executive audiences, and you're comfortable representing your teams' work, priorities, and risks clearly in front of C-suite leadership.
Preferred
- Healthcare domain experience, particularly in revenue cycle management, EHR data models, or healthcare interoperability (HL7/FHIR). The integration complexity at Cedar is significant, and domain context accelerates both credibility and impact.
- AI-forward mindset. You believe agentic developer tooling is a genuine force accelerator, not a passing trend, and you set expectations from the front on adoption. You actively model AI-native workflows for your managers and teams, and you create the space and expectation for engineers to integrate these tools into how they build, debug, and ship. You've seen what happens when a team gets this right, and you want to make it happen again.
Compensation Range and Benefits
- Salary/Hourly Rate Range*: $280,500 - $330,000
- This role is equity eligible
- This role offers a competitive benefits and wellness package
*Subject to location, experience, and education
What do we offer to the ideal candidate?
- A chance to improve the U.S. healthcare system at a high-growth company! Our leading healthcare financial platform is scaling rapidly, helping millions of patients per year
- Unless stated otherwise, most roles have flexibility to work from home or in the office, depending on what works best for you
- For exempt employees: Unlimited PTO for vacation, sick and mental health days–we encourage everyone to take at least 20 days of vacation per year to ensure dedicated time to spend with loved ones, explore, rest and recharge
- 16 weeks paid parental leave with health benefits for all parents, plus flexible re-entry schedules for returning to work
- Diversity initiatives that encourage Cedarians to bring their whole selves to work, including three employee resource groups: be@cedar (for BIPOC-identifying Cedarians and their allies), Pridecones (for LGBTQIA+ Cedarians and their allies) and Cedar Women+ (for female-identifying Cedarians)
- Competitive pay, equity (for qualifying roles), and health benefits, including fertility & adoption assistance, that start on the first of the month following your start date (or on your start date if your start date coincides with the first of the month)
- Cedar matches 100% of your 401(k) contributions, up to 3% of your annual compensation
- Access to hands-on mentorship, employee and management coaching, and a team discretionary budget for learning and development resources to help you grow both professionally and personally
About us
Cedar was co-founded by Florian Otto and Arel Lidow in 2016 after a negative medical billing experience inspired them to help improve our healthcare system. With a commitment to solving billing and patient experience issues, Cedar has become a leading healthcare technology company fueled by remarkable growth. Over the past several years, we've raised more than $350 million in funding & have the active support of Thrive and Andreessen Horowitz (a16z). As of November 2024, Cedar is engaging with 26 million patients annually and is on target to process $3.5 billion in patient payments annually. Cedar partners with more than 55 leading healthcare providers and payers including Highmark Inc., Allegheny Health Network, Novant Health, Allina Health and Providence.
Role Description
We are looking for a skilled and motivated Data Engineer Specialist to join our team. The responsibility of this role is to design, build, and maintain a robust, self-service, scalable, and secure data platform and end-to-end data pipelines that empower Data Analysts and Data Scientists to deliver insights and drive strategic decision-making.
Responsibilities of a Data Engineer at QuintoAndar:
- Build and maintain a high-performance data platform that meets the company's needs, connects with product solutions, and leads analytical innovation.
- Create and edit data pipelines, considering business logic, choosing levels of aggregation, grouping and transforming fields, checking data quality, and cleaning the data.
- Create data modeling and transformation workflows, enabling the creation of clear and accessible data abstractions.
- Take responsibility for the entire code development lifecycle (monitoring deployment, documentation, performance, security, adding metrics and alarms, ensuring SLO budget compliance, and more).
- Investigate inconsistencies and trace the source of differences (data troubleshooting).
- Enable teams across the company to access and use data more effectively through self-service tools and well-modeled datasets.
- Align with stakeholders to understand their primary needs while proposing extensible, scalable, and incremental solutions.
- Conduct PoCs and benchmarks to determine the best tool for a given problem.
- Contribute to defining the strategic vision, crossing team and service boundaries to solve problems.
- Advocate for the value of data analytics and engineering within the organization and foster a data-driven culture.
- Be a reference within the chapter on technical concepts, tools, and/or best coding practices.
Qualifications
- 7 or more years of experience in Data Engineering roles.
- Specialist in technologies, solutions, and concepts of Big Data (Spark, Hadoop, Hive, MapReduce) and multiple languages (YAML, Python).
- Experience with Airflow, Spark, AWS, and Databricks.
- Strong foundation in software engineering principles, with experience working on data-centric systems.
- Experience with columnar storage solutions and/or data lakehouse concepts.
- Proficiency in Python or one of the main programming languages, with a passion for writing clean and maintainable code.
- Strong knowledge of optimizing SQL query performance.
- Experience in building multidimensional data models (Star and/or Snowflake schema).
- Understanding of the data lifecycle and concepts such as lineage, governance, privacy, retention, anonymization, etc.
- Knowledge of infrastructure areas such as containers and orchestration (Kubernetes, ECS), CI/CD strategies, infrastructure as code (Terraform), observability (Prometheus, Grafana), among others.
- Proficiency in English - our code, documentation, tools, and materials are often structured in English.
- Excellent communication skills, proactively sharing and collaborating with both technical and non-technical stakeholders.
- Experience as a tech/project lead or similar.
- Curiosity, attention to detail, and the ability to thrive in a fast-paced, data-driven environment.
Requirements
- You will stand out if you have participated in building large-scale data platforms for big data sets and teams using Big Data technologies such as Spark, Trino, Hive, Atlas, Ranger, etc.
- Experience in building semantic layers.
Benefits
- Competitive salary
- Profit sharing
- Meal allowance
- Health insurance
- Dental plan
- Life insurance
- Childcare subsidy and Atypical Parenthood subsidy
- Wellhub
- Home office allowance
- Employee assistance program (mental health, social, legal, and financial support)
- Extended parental leave
- Day off on birthday, Mother’s Day, and Father’s Day
- Benefits Club (discounts on everyday services)
- Discounts at educational institutions
- Reading kit for children – PlayKids



