3M

Here, we innovate with purpose & use science every day to create real impact in every life around the world. #LifeWith3M

Lead Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1902 | H1B Sponsor

Location

Minnesota

Posted

2 days ago

Salary

$188.3K - $230.1K / year

Seniority

Senior

Bachelor Degree | 7 yrs exp | English | Airflow | Azure | Cloud | Kafka | Python | Spark | SQL

Job Description

  • Lead the architecture and development of scalable, secure data pipelines supporting AI/ML workloads
  • Own end-to-end data engineering processes: ingestion, transformation, storage, quality, and monitoring
  • Collaborate with data scientists and ML engineers on model features, training pipelines, and deployment
  • Drive best practices in data modeling, orchestration, versioning, and performance optimization
  • Mentor and guide junior engineers
  • Ensure data governance, lineage, and compliance standards are met across platforms
  • Support real-time and batch processing frameworks in production environments

Job Requirements

  • Bachelor's degree or higher in computer science from an accredited institution
  • Seven (7) or more years of data engineering experience
  • Strong expertise in Python, SQL, and distributed data systems (e.g., Spark, Databricks, Synapse)
  • Experience building AI/ML-ready data pipelines
  • Hands-on experience with cloud platforms (Azure preferred)
  • Strong understanding of ML concepts
  • Proven experience with workflow orchestration (Airflow, Data Factory, Synapse Pipelines, etc.)
  • Experience with streaming platforms (Kafka, Event Hub)
  • Background in DevOps, CI/CD, or infrastructure as code

Benefits

  • Medical, Dental & Vision
  • Health Savings Accounts
  • Health Care & Dependent Care Flexible Spending Accounts
  • Disability Benefits
  • Life Insurance
  • Voluntary Benefits
  • Paid Absences
  • Retirement Benefits

More Data Engineer Jobs

Full Time | Remote | Team 1,001-5,000 | Since 1938 | H1B Sponsor

  • Manage the assignment, workflow, and oversight process to ensure quality, consistency, and timeliness of deliverables from data engineers
  • Create and maintain standards for data design, modeling, and deployment in Snowflake
  • Provide direction and training to data engineers regarding foundational applications used in the development lifecycle
  • Assess new data ingestion patterns to improve efficiency and timeliness of data availability
  • Define data delivery strategy and monitor data release schedule
  • Provide guidance to the data analyst community on best practices for leveraging and querying new data assets
  • Interpret business needs from requests, and rapidly implement effective technical solutions that meet the long-term objective of providing consistent data delivery across the enterprise
  • Create and maintain ETL source-to-target mapping documents, data playbooks, and data flow inventory documents

Pennsylvania
$98.9K - $186.3K / year

Data Engineer II

BlueLabs

BlueLabs is an analytics and technology solutioning firm. We solve your toughest challenges with data-driven insights.

Data Engineer | 2 days ago
Full Time | Remote | Team 51-200 | Since 2013 | H1B No Sponsor

  • Partner directly with the Director of Data to translate platform vision into engineering execution
  • Lead the design and implementation of Kimball-style dimensional models and star schema architectures, migrating legacy data systems to a modern, scalable foundation
  • Own and enforce data hygiene standards across the platform by identifying, documenting, and remediating data quality issues proactively
  • Design, build, and maintain production-grade data pipelines with a focus on reliability, observability, and maintainability
  • Develop and champion best practices for data modeling, transformation, and testing across the engineering team
  • Drive the extraction and codification of business logic from existing systems into well-documented, portable, cloud-agnostic layers
  • Collaborate with analysts, engineers, and stakeholders to ensure data models meet both current reporting needs and future platform requirements
  • Provide technical guidance and mentorship to Data Engineer I team members
  • Contribute to architectural decisions and participate in design reviews, bringing a principled, opinionated perspective on platform direction
  • Support and improve documentation standards to ensure the codebase and data platform remain well-understood across the team

United States
$95K - $125K / year

Data Engineer

Innodata Inc

Innodata (NASDAQ: INOD) is a leading data engineering company. With more than 2,000 customers and operations in 13 cities around the world, we are an AI technology solutions provider-of-choice for 4 out of 5 of the world’s biggest technology companies, as well as leading companies across financial services, insurance, technology, law, and medicine. By combining advanced machine learning and artificial intelligence (ML/AI) technologies, a global workforce of subject matter experts, and a high-security infrastructure, we’re helping usher in the promise of AI. Our global workforce includes over 7,000 employees in the United States, Canada, United Kingdom, the Philippines, India, Sri Lanka, Israel and Germany. We’re poised for a period of explosive growth over the next few years.

Data Engineer | 2 days ago
Full Time | Remote | Team 5,001-10,000

Role Description

We are seeking a Data Engineer to design and build enterprise data warehouses, data lakes, and pipelines that power data-driven decision-making for data center supply chain and real estate operations. This role is responsible for creating scalable, secure, and optimized ETL infrastructure on GCP/AWS, while enabling advanced AI/ML use cases such as RAG, copilots, and agentic AI for predictive analytics and workflow automation.

What You’ll Own

  • Design and implement data-driven solutions on GCP including BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Looker/BI.
  • Build ETL scripts using SQL and Python to extract, clean, and transform structured and unstructured data from ERP, procurement, logistics, and facility management systems.
  • Develop and optimize data pipelines for ingestion, transformation, and loading into enterprise data lakes and warehouses.
  • Build and extend end-to-end data and BI solutions, spanning extraction, storage, transformation, and visualization layers.
  • Partner with supply chain, real estate, and AI/ML teams to provide pipelines for AI solutions (e.g., RAG ingestion, Copilot integration, multi-agent workflows).
  • Ensure data governance, lineage, and compliance across supply chain datasets.
  • Continuously optimize query performance, ETL processes, and pipeline reliability.

Qualifications

  • Advanced proficiency in SQL (complex queries, optimization) and Python (data engineering, scripting, APIs).
  • Experience building ETL/ELT pipelines operating on structured and unstructured data sources.
  • Knowledge of enterprise data warehouse and data lake architectures.
  • Exposure to data pipelines for AI/ML (vector DB ingestion, embeddings, RAG pipelines, copilots, agents).
  • Familiarity with supply chain or data center operations data is a strong plus.
  • Bonus: experience with ML Engineering, data visualization tools (Looker, Tableau, Power BI) and MLOps practices.
  • Strong hands-on expertise with GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Looker/BI (or similar, preferred).

Requirements

  • The expected salary range for this position is $100,000 – $120,000 USD per year, based on experience, skills, and qualifications.

Company Description

Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy delivering the highest quality data and outstanding outcomes for our customers.

United States
$100K - $120K / year
Full Time | Remote | Team 5,001-10,000 | Since 1995 | H1B No Sponsor

  • Design and Develop Data Pipelines: Architect, implement, and maintain scalable, high-performance ETL/ELT pipelines to process large volumes of structured and unstructured data.
  • Data Integration and Processing: Build event-driven architectures to enable real-time data processing and seamless integration between systems.
  • Cloud Infrastructure Management: Leverage AWS cloud services (e.g., S3, Redshift, Glue, Lambda, Kinesis, Pentaho) to design and deploy robust data infrastructure.
  • Code Development and Optimization: Write clean, efficient, and maintainable Python code to support data ingestion, transformation, and orchestration.
  • Collaboration in Agile Environments: Work closely with cross-functional teams, including data scientists, analysts, and software engineers, following agile methodologies to deliver iterative solutions.
  • Data Quality and Governance: Implement best practices for data quality, monitoring, and compliance to ensure data integrity, consistency, and security.
  • Performance Optimization: Optimize data pipelines and queries in cloud environments for performance, scalability, and cost efficiency.
  • Mentorship and Leadership: Provide technical guidance to junior team members, fostering a culture of continuous learning and improvement.
  • Documentation and Knowledge Sharing: Document technical designs, processes, and workflows to ensure maintainability and knowledge transfer.

Brazil