Data Engineer – Specialist
Location
Brazil
Posted
11 days ago
Salary
0
Seniority
Senior
Job Description
Data Engineer – Specialist
Compass
• Define and implement Artificial Intelligence solutions applied to data modernization and legacy systems.
• Develop mechanisms for analyzing, interpreting, and extracting technical information from legacy artifacts.
• Build Generative AI–based solutions to accelerate documentation, transformation, and migration processes.
• Create intermediate metadata models to represent flows, business rules, dependencies, entities, and transformations.
• Develop accelerators and reusable components aligned with the enterprise data architecture.
• Support the definition of templates, technical standards, and declarative structures for modern pipelines.
• Develop and evolve pipelines using Databricks, PySpark, Lakeflow Jobs, and Declarative Pipelines.
• Work with batch loads, incremental ingestions, CDC (Change Data Capture), and enterprise integrations.
• Support the advancement of data governance, traceability, and data quality during migration.
• Collaborate with architects, data engineers, and platform specialists to define scalable and secure solutions.
Job Requirements
- Experience with Artificial Intelligence applied to Data Engineering, Software Engineering, or Technical Automation.
- Hands-on knowledge of Generative AI and Large Language Models (LLMs).
- Experience with:
  - AI agents
  - RAG (Retrieval-Augmented Generation)
  - Prompt engineering
  - Function calling
  - Tool use
  - Automated workflows
- Experience with AI orchestration frameworks:
  - LangChain
  - LangGraph
  - Semantic Kernel
  - CrewAI
  - Or equivalent frameworks
- Experience with Python.
- Experience with PySpark and Spark SQL.
- Development of data pipelines in distributed environments.
- Advanced SQL knowledge.
- Knowledge of XML, JSON, and YAML.
- Experience with data modeling and Data Warehousing.
- Experience working with Databricks environments.
Benefits
- Position open to candidates with disabilities (PcD)
More Data Engineer Jobs
Enterprise Data Warehouse Developer – Microsoft Fabric
LCMC Health
Eight hospitals + dozens of New Orleans area clinics and practices, all focused on keeping you well.
• Develop, design, and implement data warehousing solutions.
• Collaborate with stakeholders to gather requirements.
• Perform data analysis and reporting to support decision-making.
• Ensure data integrity and quality in data solutions.
External Data Transfer Lead
ICON plc
ICON plc, or simply ICON, is a global provider of outsourced development services to companies in industries like biotechnology, medical devices, and pharmaceut
Role Description
As a Senior Lead Clinical Data Science Programmer at ICON, you will be instrumental in leading the development and implementation of advanced data science solutions for clinical trials.
- Manage day-to-day clinical data science activities, supporting your team to deliver quality outcomes.
- Lead the development of Data Transfer Specifications (DTS) documents to align external data providers and research partners on the required structure for new data, including:
  - Authoring the DTS and responding to queries from external data providers and internal stakeholders to ensure data will be delivered in the correct format and structure.
- Track DTS status with external data providers across different data types and programs and ensure data completion dates are met for study timelines and deliverables.
- Ensure data structure based on the type(s) of data being used is consistent and compliant with appropriate data templates.
- Ensure data structure is consistent across each data provider and complies with appropriate data templates.
- Support data reconciliation and data structure inquiry resolution.
- Liaise cross-functionally to facilitate the creation of new test codes.
- Participate in the Clinical Study Team as an extended team member.
- Oversee and train in the use of the DTS and other supplemental documents.
- Contribute to improvement initiatives as they relate to the external data process.
- Ensure study teams adhere to CDISC standards as they relate to external data.
- Comply with all pertinent regulatory agency requirements (understand clinical protocols and requirements for Biomarkers/Imaging/eCOA data, blinding and analysis expectations).
- Process change requests to update existing DTS.
- Improve templates for existing DTS to ensure data harmonization and downstream analytics.
- Provide external data management oversight to vendors, providing a pathway for functional discussions, partnership level processes & standards, portfolio status, communication, and escalation.
- Review and contribute to eCRF and EDC builds as they relate to external data requirements.
Qualifications
- Expertise in biomarker data types and/or Imaging data for oncology and non-oncology studies is preferred.
- Experience working with multiple data types/formats.
- Experience in managing clinical, biomarker, eCOA, and imaging data.
- Demonstrates broad knowledge of all applicable regulations including 21 CFR Part 11, ICH-GCP Guidelines, and CDISC standards for data collections.
- Demonstrates advanced knowledge of Data Management processes and industry best practices.
- Advanced knowledge and experience with extracting data into SAS, CSV, and XML formats is required.
Requirements
- Employment with ICON is contingent upon having the legal right to work in the country where the role is based.
Benefits
- Competitive base salary and performance-related incentives.
- Health and wellbeing programmes including medical, dental, and vision coverage where applicable.
- Retirement and pension plans.
- Life assurance and disability coverage.
- Employee assistance programmes and wellbeing resources.
- Learning and development opportunities through structured training and career pathways.
- Benefits may vary depending on role and location.
Company Description
ICON is a global healthcare intelligence and clinical research organisation united by a mission to bring new medicines and treatments to patients faster. As a values-driven organisation, integrity, collaboration, agility, and inclusion are at the heart of how we work and interact with each other, customers, patients, and suppliers.
Senior/Lead Data Engineer – AI-Native Aftermarket Platform
Truelogic Software
Premium boutique software development company that helps brands with big ideas to make a difference in people’s lives.
• Design and build robust, idempotent data pipelines from scratch utilizing a modern data stack.
• Design star and snowflake schemas, writing precise, grain-aware SQL to construct scalable data marts.
• Write production-grade, unit-tested Python code at the module level, adhering to strong engineering disciplines such as type hinting and testing.
• Build and test dbt models across staging, intermediate, and mart layers while managing overall project structure.
• Author and deploy jobs using Databricks Asset Bundles (DAB) following documented architectural patterns.
• Implement rigorous data quality checks at source, intermediate, and destination layers to prevent silent drops of nulls or duplicates.
• Maintain data governance through comprehensive dbt tests and strict documentation-at-merge-time discipline.
• Operate securely within a multi-repository architecture, utilizing service principals and ensuring zero personal credentials in production deployments.
• Run cross-repository exposure checks prior to merging schema-breaking changes.
• Own data pipelines end-to-end, making key technical design decisions and mentoring mid-level engineers through substantive code reviews.
• Define overarching technical direction across core data systems, including modeling standards, branching strategies, observability thresholds, and secret management policies.
• Act as a technical leader to unblock the team and actively participate in hiring panels to scale the engineering organization.



