Senior Data Engineer – AWS, LLMs

Data Engineer · Full Time · Remote · Senior · Team 10,001+ · H1B Sponsor

Location

Brazil

Posted

13 days ago

Salary

0

Seniority

Senior

Job Description

Compass

  • Define and evolve a cloud-native data architecture on AWS;
  • Design data models optimized for operational querying (APIs) and for consumption by the conversational agent;
  • Lead the migration of historical data from the mainframe;
  • Ensure consistency, integrity, and traceability;
  • Work with large data volumes and complex historical data;
  • Build and maintain ingestion and transformation pipelines;
  • Integrate the existing data flow (PeopleSoft, mainframe, RDS);
  • Optimize PostgreSQL queries;
  • Implement partitioning, indexing, and caching (Redis) strategies;
  • Reduce latency for frequent queries;
  • Ensure compliance with LGPD and data retention policies;
  • Implement data access control, masking/anonymization where needed, auditing, and lineage;
  • Structure data for consumption by LLMs;
  • Support the creation of semantic layers;
  • Work with embeddings / semantic indexing (a plus);
  • Implement data quality validations;
  • Monitor pipelines and data consistency.

Job Requirements

  • Experience with data engineering on AWS;
  • Building and maintaining data pipelines (ingestion and transformation);
  • Integrating data from legacy and enterprise systems (mainframe, PeopleSoft, RDS);
  • Experience with data modeling for different consumers (APIs and analytics/AI);
  • Strong PostgreSQL background (query optimization, performance, indexing, and partitioning);
  • Knowledge of caching (Redis);
  • Experience with large data volumes and legacy system migration;
  • Data governance practices (LGPD, auditing, lineage, access control, and quality);
  • Structuring data for consumption by LLMs (semantic layers and embeddings; a plus).

More Data Engineer Jobs

Data Engineer – ML Specialist

Stefanini Brasil

Co-creating Solutions for a Better Future

Data Engineer · 13 days ago
Full Time · Remote · Team 10,001+ · Since 1987 · H1B No Sponsor

  • A specialist who will lead the creation and management of the data pipelines that feed the project's analytical models.
  • Responsible for ensuring the quality, integrity, and availability of data for training and inference of Machine Learning models in production.
  • Expected to mentor other team members and define data engineering and MLOps strategies.

Brazil

IT Data Platform Engineer 2

Evergen

Evergen is a global industry-leading contract development and manufacturing organization (CDMO) in regenerative medicine. As the only regenerative medicine company that offers a differentiated portfolio of allograft and xenograft biomaterials at scale, Evergen is headquartered in Alachua, FL, and has manufacturing facilities in West Lafayette, IN., Eden Prairie and Glencoe, MN., Neunkirchen, DE., Glasgow, UK., and Marton, NZ.

Data Engineer · 13 days ago

Role Description

We are looking for a hands-on Data Platform Engineer to own, operate, and evolve our modern cloud data stack. You will be the primary technical owner of our data infrastructure, responsible for keeping data flowing reliably from source systems into Snowflake and ensuring clean, trusted data reaches our business teams through Power BI. This is a high-impact role on a small, focused team where your work will be directly visible to the business.

What You Will Work On

DATA INGESTION

  • Own and manage Fivetran connectors across all source systems including NetSuite, HubSpot, ADP, SQL Server, SAP HANA, and SharePoint.
  • Configure and monitor sync schedules, column exclusions, and incremental load strategies to control cost and reliability.
  • Troubleshoot connector failures and proactively manage schema drift from upstream sources.

DATA TRANSFORMATION

  • Maintain and extend our dbt project across three layers: staging (L1), core dimensions and facts (L2), and business-ready marts (L3).
  • Write and optimize SQL models using incremental merge strategies and watermark patterns.
  • Author and maintain dbt tests, model documentation, and source freshness checks to ensure data quality.
  • Support the buildout of our finance EDW including GL activity, planning data from Workday Adaptive, and NetSuite financials.

ORCHESTRATION & PIPELINE OPERATIONS

  • Manage end-to-end pipeline scheduling and monitoring, ensuring daily refreshes complete reliably before business hours.
  • Maintain the integration between Fivetran, dbt, and Power BI dataset refresh triggers.
  • Build and maintain alerting so pipeline failures are caught and communicated before the business is impacted.

DATA WAREHOUSE & GOVERNANCE

  • Manage Snowflake environments including databases, schemas, roles, warehouses, and cost controls.
  • Implement and maintain access controls and role-based permissions across the data platform.
  • Contribute to data catalog and lineage documentation to support a growing team and reduce knowledge concentration risk.

COLLABORATION

  • Partner with Finance, Sales, and Operations teams to understand reporting requirements and translate them into reliable data models.
  • Support and mentor the junior member of the data team as they develop their skills.
  • Work closely with the incoming NetSuite implementation team to ensure clean data integration into the warehouse.

Qualifications

  • 3 to 5 years of experience in data engineering, analytics engineering, or a closely related role.
  • Hands-on experience with dbt (Core or Cloud) including incremental models, tests, macros, and documentation.
  • Proficiency with Snowflake including schema design, query optimization, warehouses, and role-based access control.
  • Experience with a managed ingestion tool such as Fivetran and dlt including connector configuration and monitoring.
  • Strong SQL skills with the ability to write and debug complex analytical queries.
  • Familiarity with ELT pipeline patterns and medallion-style data warehouse architecture.
  • Experience troubleshooting pipeline failures independently and communicating issues clearly to non-technical stakeholders.
  • Comfort working autonomously in a small team environment with limited oversight.

Requirements

  • Experience connecting to on-premises source systems (SQL Server, SAP HANA, Oracle) via ODBC or CDC tooling.
  • Familiarity with ERP financial data in NetSuite, SAP, or similar, particularly GL structures and chart of accounts.
  • Exposure to Power BI including dataset refresh management and understanding of how semantic models consume warehouse data.
  • Experience with Git-based workflows and basic CI/CD practices for data projects.
  • Prior involvement in an EDW build or dimensional modeling project (star schema, slowly changing dimensions).

What Success Looks Like

In your first 30 days:

  • You have completed a full walkthrough of the existing stack with our departing team member.
  • You can independently run, monitor, and troubleshoot the daily pipeline end to end.
  • You have documented any gaps or risks you have identified in the current setup.

In your first 90 days:

  • All Fivetran connectors are live and the dlt migration is complete.
  • dbt is running in dbt Cloud or Snowflake Workspaces with jobs, alerting, and documentation in place.
  • The Finance team has improved confidence in the GL data flowing through the warehouse.

In your first year:

  • The data platform is running reliably with minimal intervention and strong business trust.
  • The NetSuite data integration is live and the finance EDW is serving reporting needs.
  • You have grown the junior team member's capability and reduced single-person dependency on yourself.

Our Stack

  • Ingestion: Fivetran, dlt
  • Warehouse: Snowflake
  • Transformation: dbt (Cloud or Snowflake-native)
  • Reporting: Power BI
  • Sources: NetSuite, HubSpot, ADP, SQL Server, SAP HANA, Workday Adaptive, SharePoint, Oracle DB

United States
$130K - $145K / year

Conversational Data Collection Associate

Volga Partners

Smart and Reliable Technology Solutions

Data Engineer · 13 days ago
Full Time · Remote · Team 1,001-5,000 · Since 2020 · H1B Sponsor

Role Description

We are looking for individuals to record natural conversations using provided scripts. You will work with a partner and play a specific role (such as Customer or Support Agent). The goal is to create clear, natural-sounding audio recordings by following simple instructions.

The conversations are based on customer support situations, where one person helps solve a problem. These may include:

  • Verifying account details
  • Solving basic technical issues
  • Explaining simple policies

Each assignment takes approximately 30 to 90 minutes and must be completed in one continuous session. You must record in a quiet, noise-free environment.

Qualifications

  • Fluent in Brazilian Portuguese and English
  • Good communication skills in the required language(s)
  • Ability to follow instructions
  • Attention to detail
  • Willingness to work with a partner
  • Ability to meet deadlines

Requirements

  • Speak naturally (like a real conversation, not robotic)
  • Follow all instructions carefully
  • Ensure clear audio with no background noise
  • Be available to re-record if needed
  • Computer or laptop with internet access
  • Working microphone
  • Basic knowledge of Google Sheets, Google Drive, and Zencastr
  • Ability to download, rename, and upload files

Payment

  • You will be paid for up to 1.5 hours per assignment, based on approved work.
  • Payment depends on audio quality and following instructions.
  • Work that does not meet quality standards may be rejected or require re-recording.
  • Payment is processed after successful quality review.
  • $4.00 USD per hour.

Agreement & Acknowledgement

Selected participants will be required to sign an agreement and acknowledgement form confirming:

  • Understanding of the task and instructions
  • Acceptance of quality and payment terms
  • Consent to use of recorded data for the project

South Africa
$4 / hour

Senior Cloud Data Engineer, AWS, GCP, Databricks

Future Processing

Great software... because we put people first

Data Engineer · 13 days ago
Full Time · Remote · Team 1,001-5,000 · Since 2000 · H1B No Sponsor

  • Taking responsibility for the entirety of the solutions co-created with the team
  • Creating or modifying cloud data processing solutions
  • Creating and modifying documentation
  • Analyzing and optimizing solutions within an existing or planned system
  • Analyzing client requirements in order to deliver the optimal solution for their business need
  • Analyzing potential risks
  • Adapting solutions to business requirements
  • Testing solutions

Poland
zł133 - zł196 / hour