GFT Technologies

As a pioneer in digital transformation, GFT develops sustainable solutions across new technologies.

Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1987 | H1B No Sponsor

Location

Colombia

Posted

6 days ago

Salary

0

Seniority

Senior

Job Description

  • Develop Python scripts for data automation, cleaning, and validation.
  • Work with the business analyst (BA) to understand the accounting business rules.
  • Monitor and optimize the performance of Spark jobs in production.
  • Document data flows, models, and data dictionaries.

Job Requirements

  • Specialist in designing and building pipelines for transforming and converting accounting data.
  • Process, transform, and load large volumes of financial information.
  • Design, develop, and maintain ETL/ELT pipelines using Apache Spark and AWS Glue.
  • Implement transformation and conversion logic for accounting data.
  • Ensure data quality, traceability, and lineage.

Benefits

  • Flexibility: Here, balance is everything!
  • Collaboration: Collaboration is fundamental.
  • Multiculturalism: We have a diverse global team.
  • Development: We offer a personalized career plan.
  • Relevance: We work with industry-leading clients.

More Data Engineer Jobs

Forward Deployed Data Engineer

ZoomInfo Technologies LLC

ZoomInfo (NASDAQ: GTM) is the Go-To-Market Intelligence Platform that empowers businesses to grow faster with AI-ready insights, trusted data, and advanced automation. Its solutions provide more than 35,000 companies worldwide with a complete view of their customers, making every seller their best seller.

Data Engineer | 6 days ago
Full Time | Remote | Team 1,001-5,000

Role Description

ZoomInfo is building its Forward Deployed Engineering function. You will define how it operates: how engagements run, what the deliverables are, how the playbook works, what scales and what doesn't. The support structure exists (data team, product team, infrastructure, executive sponsorship); what's missing is the person who brings it to life in front of customers and turns early wins into a repeatable model.

As an FDE, you embed directly with ZoomInfo's strategic accounts: large enterprises with complex data needs, often in financial services, insurance, technology, and other globally distributed industries. You work alongside their teams to understand their go-to-market challenges, then design and deliver bespoke intelligence applications that combine ZoomInfo's third-party data with the customer's first-party data to drive real business outcomes. You own the engagement end-to-end: from discovery through deployment, from executive presentation through production code.

You'll work closely with our data and product teams to bring the full breadth of ZoomInfo's data foundation to bear (company intelligence, contact data, buying signals, intent data, and specialized vertical datasets), assembled into purpose-built applications tailored to each customer's specific personas and workflows.

ZoomInfo has one of the deepest go-to-market data foundations in the world: 500M+ professional profiles, 100M+ company records, intent signals, vertical datasets spanning financial filings, insurance, commercial fleet, and more. You'll have access to incredible data, powerful infrastructure, and our most important customer relationships. What you build with them, and how you build it, will define the model going forward.

What Engagements Typically Look Like

  • Entity resolution at scale: reconciling legal entity hierarchies (D&B, tax IDs, company-house registrations) with how customers actually go to market.
  • Hierarchy management: enforcing one-to-one matching across regions, fixing parent-child linkage gaps, dispositioning orphaned accounts, surfacing white space on top of clean parent IDs.
  • Location-level precision: moving customers off monolithic HQ-level enrichment so geo-based sales teams see local firmographics instead of global rollups.
  • Automated, no-human-in-the-loop logic: entity suppression, disposition-based matching, orchestration rules for inactive entities, parent linkages, and white space alerts.
  • Data warehouse as the operating layer: moving sophisticated analysis out of CRM into Snowflake or BigQuery.
  • Buying group filtering: applying persona-density criteria across hierarchies to turn a customer's 5,600 Disney legal entities into 31 actionable targets.

What You'll Build

Every engagement comprises a consistent service architecture: three pillars built on top of each other, and five capability areas assembled into the actual deliverable.

Three pillars:

  • Data Foundation: golden reference matching, persistent IDs, unified entity profiles across the customer's first-party systems.
  • Data Management: business-specific logic that turns the foundation into something the customer can actually go to market with.
  • Activation: TAM to SAM to SOM, fit scoring, in-market signals.

Five capability areas:

  • Data Foundation Development: Match every record across the customer's CRM, ERP, billing, and marketing systems to a golden reference dataset.
  • Account Architecture & Entity Resolution: Define what an account means for the customer's business and build automated logic that enforces that structure.
  • TAM Development & White Space Discovery: Build the customer's complete addressable market against ICP criteria.
  • Account Fit Scoring & In-Market Signals: Build custom fit models from historical win/loss patterns.
  • Ongoing Governance & Automation: Match orchestration rules, enrichment segmentation, CRM field locking, and warehouse integration.

What You'll Do

  • Own Strategic Customer Engagements End-to-End: Serve as the primary technical point of contact for assigned strategic accounts.
  • Bridge Technical and Business Audiences: Present to executive leadership and synthesize complex go-to-market data needs into clear, actionable proposals.
  • Build the FDE Playbook: Document what works: discovery frameworks, engagement phases, integration patterns, deliverable templates, success metrics.
  • Drive Stickiness and Expansion: Identify expansion opportunities as they emerge.

Qualifications

  • High Ownership, High Ambiguity Tolerance: Comfortable making judgment calls with incomplete information.
  • Strong Software and Data Engineering Fundamentals: Proficient in Python and SQL, and comfortable working in cloud data warehouses.
  • Customer-Facing Communication: Experience working directly with customers in a technical capacity.
  • Go-to-Market Data Familiarity (Preferred): Experience with B2B data, CRM systems, enrichment and orchestration tooling.

Benefits

  • Comprehensive benefits including holistic mind, body and lifestyle programs designed for overall well-being.
  • Actual compensation based on factors such as work location, qualifications, skills, experience, and/or training.
  • Base salary range: $171,500 to $269,500 USD.

United States
$171.5K - $269.5K / year

Data Science Lead

Prove

We're the modern way of proving identity.

Data Engineer | 6 days ago
Full Time | Remote | Team 201-500 | H1B Sponsor

Role Description

The Data Science Lead will serve as the strategic architect and research pioneer for the organization’s data ecosystem. This role is responsible for designing robust data architectures, leading research and development (R&D) for novel data sources, establishing rigorous analytical methodologies, and ensuring the seamless, scalable ingestion of high-quality data into downstream production solutions.

Core Pillars of Responsibility

Data Architecture & Scalable Engineering

  • Blueprint Design: Design and oversee the evolution of scalable data architectures that support advanced analytics, machine learning (ML) modeling, and real-time processing.

R&D & Novel Data Source Evaluation

  • Exploratory Research: Scout, evaluate, and pressure-test new internal, external, and alternative data sources (e.g., synthetic data, IoT streams, third-party APIs) for predictive power and commercial viability.
  • Proof of Concepts (PoCs): Lead rapid prototyping and PoCs to validate new technologies, algorithms, and data structures before scaling them to production.
  • Vendor & Partner Assessment: Technically vet data vendors and partners to ensure data quality, density, and seamless integration capabilities.

Methodology & Analytical Rigor

  • Framework Standardization: Define and document the organization's gold-standard methodologies for statistical analysis, experimental design (A/B testing), and ML modeling.
  • Evaluation Metrics: Establish rigorous validation protocols and evaluation metrics (e.g., precision/recall, drift detection, bias/fairness audits) to ensure model and data integrity.
  • Continuous Improvement: Keep the organization at the cutting edge of data science by translating academic research and emerging industry trends into practical business methodologies.

Ingestion & Solution Integration

  • Productionalization Bridge: Serve as the critical bridge between R&D and Production, ensuring that complex analytical models and data sources are seamlessly ingested into core business products and solutions.
  • API & Interface Design: Oversee data delivery contracts between the DS ecosystem and downstream software applications to ensure the creation of clean, well-documented APIs.

Key Deliverables (First 12 Months)

  • Data Source Playbook: A formalized framework for scoring, vetting, and onboarding new data assets.
  • Methodology Registry: A centralized repository of approved statistical models, evaluation metrics, and ingestion protocols to ensure team-wide consistency.
  • Feature Importance Registry & Feature Engineering Roadmap: A centralized repository connecting current data sources to their product value and impact of removal and/or possible substitutes to the roadmap of how Prove can leverage the signals in new and differentiated ways.
  • Architectural Roadmap: A 12-month to 3-year vision aligning data science infrastructure with corporate scaling goals.

Qualifications

  • 6+ years in Data Science/Data Engineering, with at least 2 years in a technical leadership or architectural role.
  • Technical Stack: Python, R, SQL, Cloud Platforms (AWS/GCP/Azure), Big Data tech (Spark, Kafka), Orchestration (Airflow), and MLOps tools.
  • Expertise: Deep understanding of data modeling, schema design (SQL/NoSQL), statistical evaluation, and MLOps deployment patterns, especially in R&D functions that bridge research with production.
  • Soft Skills: Exceptional ability to translate complex technical architectures into strategic business value for non-technical stakeholders.

Benefits

  • Competitive salaries & Bonus Plan (for eligible roles) and Equity Plan
  • Modern Health for financial, mental, and physical wellness
  • 401(k) Retirement Plan & Match (US Offices) and Local Country Pension (International Offices)
  • Unlimited Vacation and Flexible hours
  • Comprehensive medical benefits for you and your family ❤️
  • Emotional & Physical Wellness: Access to wellness services (EAP & Prove Well-Being Reimbursement)
  • Bottomless snacks & beverages for certain office locations
  • Daily GrubHub stipend for lunch if coming into the office (US Offices)
  • A great place to work and connect with other talented Provers like yourself!

United States
$179K - $200K / year

Senior GCP Data Engineer

Leega

Intelligence, Innovation, and Technology.

Data Engineer | 6 days ago
Full Time | Remote | Team 201-500 | Since 2010 | H1B No Sponsor

  • Hands-on professional focused on implementing the proposed architecture, ensuring that the agent understands natural language, accesses the necessary data, and keeps the conversation coherent.
  • Orchestration Implementation: Develop the agent's orchestration layer using LangGraph, building the state graphs and conversation flows.
  • Tool Development (Tool Calling): Implement and integrate structured calls (tool calling) to the APIs and systems provided by the client, ensuring retry and fallback mechanisms.
  • RAG and Knowledge Base: Build RAG (Retrieval-Augmented Generation) pipelines so the agent can query the N1 knowledge base and retrieve context efficiently.
  • State Management: Implement state persistence (checkpointing) to support multi-turn conversations smoothly.
  • Deployment on GCP: Handle deployment, continuous integration, and configuration of the required resources on Google Cloud Platform (e.g., Cloud Run, vector databases on GCP, Cloud Storage).
  • Knowledge Transfer: Actively support the daily transfer of technical knowledge to the client's team.

Brazil

Head of Data Engineering, Real-World Data

Natera

We are a global leader in cell-free DNA (cfDNA) testing, dedicated to oncology, women’s health, and organ health.

Data Engineer | 6 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2004 | H1B Sponsor

Natera is seeking a leader to head Data Engineering and Platform for Real-World Data (RWD). This role will, in partnership with Product, drive the strategy and roadmap for the RWD platform and lead the development, operation, and delivery of data products that support clinical, research, and business priorities. The ideal candidate brings strong engineering leadership, deep technical expertise, and experience building reliable, scalable data platforms in healthcare or life sciences environments. This leader will oversee a team focused on data engineering, data platform development, and data delivery, and will partner closely with product, bioinformatics, analytics, and AI teams to build data solutions that are useful, reliable, and easy to consume across the organization.

United States