Compass

Data Engineer – Specialist

Data Engineer, Full Time, Remote, Senior, Team 10,001+, H1B Sponsor

Location

Brazil

Posted

11 days ago

Salary

Seniority

Senior

Bachelor Degree

Portuguese

PySpark Python Spark SQL

Job Description

• Define and implement Artificial Intelligence solutions applied to data modernization and legacy systems.
• Develop mechanisms for analyzing, interpreting, and extracting technical information from legacy artifacts.
• Build Generative AI–based solutions to accelerate documentation, transformation, and migration processes.
• Create intermediate metadata models to represent flows, business rules, dependencies, entities, and transformations.
• Develop accelerators and reusable components aligned with the enterprise data architecture.
• Support the definition of templates, technical standards, and declarative structures for modern pipelines.
• Develop and evolve pipelines using Databricks, PySpark, Lakeflow Jobs, and Declarative Pipelines.
• Work with batch loads, incremental ingestions, CDC (Change Data Capture), and enterprise integrations.
• Support the advancement of data governance, traceability, and data quality during migration.
• Collaborate with architects, data engineers, and platform specialists to define scalable and secure solutions.

Job Requirements

Experience with Artificial Intelligence applied to Data Engineering, Software Engineering, or Technical Automation.
Hands-on knowledge of Generative AI and Large Language Models (LLMs).
Experience with:
  • AI agents
  • RAG (Retrieval-Augmented Generation)
  • Prompt engineering
  • Function calling
  • Tool use
  • Automated workflows
Experience with AI orchestration frameworks:
  • LangChain
  • LangGraph
  • Semantic Kernel
  • CrewAI
  • Or equivalent frameworks
Experience with Python.
Experience with PySpark and Spark SQL.
Development of data pipelines in distributed environments.
Advanced SQL knowledge.
Knowledge of XML, JSON, and YAML.
Experience with data modeling and Data Warehousing.
Experience working with Databricks environments.

Benefits

Position open to candidates with disabilities (PcD)

More Data Engineer Jobs

Enterprise Data Warehouse Developer – Microsoft Fabric

LCMC Health

Eight hospitals + dozens of New Orleans area clinics and practices, all focused on keeping you well.

Data Engineer, 11 days ago

Full Time, Remote, Team 10,001+, H1B Sponsor

• Develop, design, and implement data warehousing solutions.
• Collaborate with stakeholders to gather requirements.
• Perform data analysis and reporting to support decision-making.
• Ensure data integrity and quality in data solutions.

ETL

Louisiana

Job Closed

External Data Transfer Lead

ICON plc

ICON plc, or simply ICON, is a global provider of outsourced development services to companies in industries like biotechnology, medical devices, and pharmaceut

Data Engineer, 11 days ago

Full Time, Remote

Role Description

As a Senior Lead Clinical Data Science Programmer at ICON, you will be instrumental in leading the development and implementation of advanced data science solutions for clinical trials.

- Manage day-to-day clinical data science activities, supporting your team to deliver quality outcomes.
- Lead the development of Data Transfer Specifications (DTS) documents to align external data providers and research partners on the required structure for new data, including:
  - Authoring the DTS and responding to external data provider and internal stakeholder queries to ensure data will be delivered in the correct format and structure.
- Track DTS status with external data providers across different data types and programs and ensure data completion dates are met for study timelines and deliverables.
- Ensure data structure based on the type(s) of data being used is consistent and compliant with appropriate data templates.
- Ensure data structure is consistent across each data provider and complies with appropriate data templates.
- Support data reconciliation and data structure inquiry resolution.
- Liaise cross-functionally to facilitate the creation of new test codes.
- Participate in the Clinical Study Team as an extended team member.
- Oversee and train in the use of the DTS and other supplemental documents.
- Contribute to improvement initiatives related to the external data process.
- Ensure study teams adhere to CDISC standards as they relate to external data.
- Comply with all pertinent regulatory agency requirements (understand clinical protocols and requirements for Biomarkers/Imaging/eCOA data, blinding, and analysis expectations).
- Process change requests to update existing DTS.
- Improve templates for existing DTS to ensure data harmonization and downstream analytics.
- Provide external data management oversight to vendors, providing a pathway for functional discussions, partnership level processes & standards, portfolio status, communication, and escalation.
- Review and contribute to eCRF and EDC builds as they relate to external data requirements.

Qualifications

- Expertise in biomarker data types and/or imaging data for oncology and non-oncology studies is preferred.
- Experience working with multiple data types/formats.
- Experience in managing clinical, biomarker, eCOA, and imaging data.
- Demonstrates broad knowledge of all applicable regulations including 21 CFR Part 11, ICH-GCP Guidelines, and CDISC standards for data collections.
- Demonstrates advanced knowledge of Data Management processes and industry best practices.
- Advanced knowledge and experience with extracting data into SAS, CSV, and XML formats is required.

Requirements

- Employment with ICON is contingent upon having the legal right to work in the country where the role is based.

Benefits

- Competitive base salary and performance-related incentives.
- Health and wellbeing programmes including medical, dental, and vision coverage where applicable.
- Retirement and pension plans.
- Life assurance and disability coverage.
- Employee assistance programmes and wellbeing resources.
- Learning and development opportunities through structured training and career pathways.
- Benefits may vary depending on role and location.

Company Description

ICON is a global healthcare intelligence and clinical research organisation united by a mission to bring new medicines and treatments to patients faster. As a values-driven organisation, integrity, collaboration, agility, and inclusion are at the heart of how we work and interact with each other, customers, patients, and suppliers.

GCP SAS

United States

Job Closed

Senior/Lead Data Engineer – AI-Native Aftermarket Platform

Truelogic Software

Premium boutique software development company that helps brands with big ideas to make a difference in people’s lives.

Data Engineer, 11 days ago

Full Time, Remote, Team 501-1,000, Since 2004, H1B No Sponsor

• Design and build robust, idempotent data pipelines from scratch utilizing a modern data stack.
• Design star and snowflake schemas, writing precise, grain-aware SQL to construct scalable data marts.
• Write production-grade, unit-tested Python code at the module level, adhering to strong engineering disciplines such as type hinting and testing.
• Build and test dbt models across staging, intermediate, and mart layers while managing overall project structure.
• Author and deploy jobs using Databricks Asset Bundles (DAB) following documented architectural patterns.
• Implement rigorous data quality checks at source, intermediate, and destination layers to prevent silent drops of nulls or duplicates.
• Maintain data governance through comprehensive dbt tests and strict documentation-at-merge-time discipline.
• Operate securely within a multi-repository architecture, utilizing service principals and ensuring zero personal credentials in production deployments.
• Run cross-repository exposure checks prior to merging schema-breaking changes.
• Own data pipelines end-to-end, making key technical design decisions and mentoring mid-level engineers through substantive code reviews.
• Define overarching technical direction across core data systems, including modeling standards, branching strategies, observability thresholds, and secret management policies.
• Act as a technical leader to unblock the team and actively participate in hiring panels to scale the engineering organization.

Azure PySpark Python Spark SQL Unity Vault

Brazil
