Innodata Inc

Innodata (NASDAQ: INOD) is a leading data engineering company. With more than 2,000 customers and operations in 13 cities around the world, we are an AI technology solutions provider-of-choice for 4 out of 5 of the world’s biggest technology companies, as well as leading companies across financial services, insurance, technology, law, and medicine. By combining advanced machine learning and artificial intelligence (ML/AI) technologies, a global workforce of subject matter experts, and a high-security infrastructure, we’re helping usher in the promise of AI. Our global workforce includes over 7,000 employees in the United States, Canada, United Kingdom, the Philippines, India, Sri Lanka, Israel and Germany. We’re poised for a period of explosive growth over the next few years.

Data Engineer

Data Engineer | Full Time | Remote | Mid Level | Team 5,001-10,000

Location

United States

Posted

3 days ago

Salary

$100K - $120K / year

Seniority

Mid Level

Job Description

Role Description

We are seeking a Data Engineer to design and build enterprise data warehouses, data lakes, and pipelines that power data-driven decision-making for data center supply chain and real estate operations. This role is responsible for creating scalable, secure, and optimized ETL infrastructure on GCP/AWS, while enabling advanced AI/ML use cases such as RAG, copilots, and agentic AI for predictive analytics and workflow automation.

What You’ll Own

- Design and implement data-driven solutions on GCP including BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Looker/BI.
- Build ETL scripts using SQL and Python to extract, clean, and transform structured and unstructured data from ERP, procurement, logistics, and facility management systems.
- Develop and optimize data pipelines for ingestion, transformation, and loading into enterprise data lakes and warehouses.
- Build and extend end-to-end data and BI solutions, spanning extraction, storage, transformation, and visualization layers.
- Partner with supply chain, real estate, and AI/ML teams to provide pipelines for AI solutions (e.g., RAG ingestion, Copilot integration, multi-agent workflows).
- Ensure data governance, lineage, and compliance across supply chain datasets.
- Continuously optimize query performance, ETL processes, and pipeline reliability.

Qualifications

- Advanced proficiency in SQL (complex queries, optimization) and Python (data engineering, scripting, APIs).
- Experience building ETL/ELT pipelines operating on structured and unstructured data sources.
- Knowledge of enterprise data warehouse and data lake architectures.
- Exposure to data pipelines for AI/ML (vector DB ingestion, embeddings, RAG pipelines, copilots, agents).
- Familiarity with supply chain or data center operations data is a strong plus.
- Bonus: experience with ML Engineering, data visualization tools (Looker, Tableau, Power BI) and MLOps practices.
- Strong hands-on expertise with GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Looker/BI (or similar, preferred).

Requirements

- The expected salary range for this position is $100,000 – $120,000 USD per year, based on experience, skills, and qualifications.

Company Description

Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy delivering the highest quality data and outstanding outcomes for our customers.

More Data Engineer Jobs

Full Time | Remote | Team 5,001-10,000 | Since 1995 | H1B No Sponsor

• Design and Develop Data Pipelines: Architect, implement, and maintain scalable, high-performance ETL/ELT pipelines to process large volumes of structured and unstructured data.
• Data Integration and Processing: Build event-driven architectures to enable real-time data processing and seamless integration between systems.
• Cloud Infrastructure Management: Leverage AWS cloud services (e.g., S3, Redshift, Glue, Lambda, Kinesis, Pentaho) to design and deploy robust data infrastructure.
• Code Development and Optimization: Write clean, efficient, maintainable Python code to support data ingestion, transformation, and orchestration.
• Collaboration in Agile Environments: Work closely with cross-functional teams, including data scientists, analysts, and software engineers, following agile methodologies to deliver iterative solutions.
• Data Quality and Governance: Implement best practices for data quality, monitoring, and compliance to ensure data integrity, consistency, and security.
• Performance Optimization: Optimize data pipelines and queries in cloud environments for performance, scalability, and cost efficiency.
• Mentoring and Leadership: Provide technical guidance to junior team members, fostering a culture of continuous learning and improvement.
• Documentation and Knowledge Sharing: Document technical designs, processes, and workflows to ensure maintainability and knowledge transfer.

Brazil

Senior Data Engineer

Versapay

The first Collaborative Accounts Receivable Network. Accomplish more, get paid faster, and deliver better experiences.

Data Engineer | 3 days ago
Full Time | Remote | Team 201-500 | Since 2006 | H1B No Sponsor

• Optimize our existing Snowflake architecture, establishing strict environmental isolation and scalable structures that prepare our data for eventual downstream commercialization and product offerings.
• Leverage tools like Snowflake Cortex, Cursor, and UiPath to automate workflows, build semantic models, and deploy agents that accelerate time-to-value.
• Implement and manage robust data quality and observability frameworks to ensure pipeline reliability and proactive issue resolution.
• Design and maintain MLOps pipelines to support the seamless rollout, monitoring, and lifecycle management of ML models directly within Snowflake.
• Partner closely with your peers under the Data Engineering Manager to share responsibilities across pipeline management, MLOps, and architecture, avoiding siloed knowledge and ensuring comprehensive team coverage.
• Synthesize disparate operational entities into a unified, enterprise-wide semantic model that supports both internal analytics and future data monetization efforts.

Canada
$130K - $150K / year

Senior Data Engineer

Versapay

The first Collaborative Accounts Receivable Network. Accomplish more, get paid faster, and deliver better experiences.

Data Engineer | 3 days ago
Full Time | Remote | Team 201-500 | Since 2006 | H1B No Sponsor

• Architect for the Future: Optimize our existing Snowflake architecture, establishing strict environmental isolation and scalable structures that prepare our data for eventual downstream commercialization and product offerings.
• Drive Agentic Engineering: Leverage tools like Snowflake Cortex, Cursor, and UiPath to automate workflows, build semantic models, and deploy agents that accelerate time-to-value.
• Establish Data Observability: Implement and manage robust data quality and observability frameworks to ensure pipeline reliability and proactive issue resolution.
• Operationalize Machine Learning: Design and maintain MLOps pipelines to support the seamless rollout, monitoring, and lifecycle management of ML models directly within Snowflake.
• Execute Shared Ownership: Partner closely with your peers under the Data Engineering Manager to share responsibilities across pipeline management, MLOps, and architecture, avoiding siloed knowledge and ensuring comprehensive team coverage.
• Model for Enterprise Utility: Synthesize disparate operational entities into a unified, enterprise-wide semantic model that supports both internal analytics and future data monetization efforts.

United States
$110K - $130K / year

Mid-Level Data Engineer

RPE

🟠 We are the force behind the payments that move Brazilian retail.

Data Engineer | 3 days ago
Full Time | Remote | Team 201-500 | H1B No Sponsor

• Pipeline Development: Design, build, and maintain data flows (ETL/ELT) using Pentaho (PDI) for legacy systems and AWS Glue/Airflow for modern cloud architectures.
• Orchestration and Automation: Manage the execution of jobs and operational routines via Rundeck and Airflow, ensuring data observability.
• Data Modeling: Write complex SQL queries to transform raw data from different processors into analytical views for the business.
• Coding and Version Control: Use Python for automation and custom transformations, managing the full code lifecycle via Git.
• Multi-Processor Integration: Handle the ingestion of data from various partners (RPE, Dock, etc.), standardizing information from different sources.

Brazil