Robots & Pencils is an applied AI engineering firm building the next frontier of business architecture. We design and ship AI co-workers that integrate into enterprise operations and deliver measurable results for our clients. Founded in 2009, we are smaller, faster, and more senior by design, with teams averaging 15+ years of experience.

Data Engineer

Full Time, Remote, Mid Level, Team 51-200

Location

United States

Posted

3 days ago

Seniority

Mid Level

AI, Observability/Monitoring, CI/CD, AI/ML, Data Engineering, Python, Scala, SQL, Apache Kafka, Amazon Kinesis, AWS

Job Description

Role Description

We’re looking for a Staff Data Engineer to join a multi-disciplinary engineering team building modern, enterprise-grade data platforms. This role is ideal for an experienced engineer who can define data strategy, own platform decisions end-to-end, and contribute to technical leadership across the team.

In this role, you will:
- Design scalable data lakes, warehouses, and pipelines.
- Define governance and quality standards.
- Drive data platform modernization across real, in-flight work where performance, reliability, and security are critical.
- Mentor more junior engineers.
- Partner with leadership on data strategy.
- Bring an AI-forward mindset.

What You’ll Do

Craft & Delivery
- Define data architecture and platform strategy, leading design across pipelines, warehouses, and data lakes.
- Build and optimize scalable data pipelines supporting batch and real-time processing.
- Define and enforce data governance, quality standards, and compliance frameworks across the platform.
- Build monitoring, logging, and alerting for data pipelines and services, and contribute to CI/CD workflows for data deployment and automation.
- Drive data platform modernization, optimizing for performance, cost, and scalability.
- Bring an AI-forward mindset to your daily work, using tools like Claude, Cursor, and other modern AI assistants to ship higher-quality work at pace.
- Design and implement data contracts and event flows in collaboration with backend, platform, and engineering teams.
- Lead the design and implementation of data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows.
- Integrate data services with APIs, middleware, and third-party systems to support end-to-end data consumption.

Collaboration & Communication
- Partner with leadership on data strategy, translating technical depth into decisions others can act on.
- Collaborate closely with engineering, analytics, AI, and product teams to align data platforms with broader goals.
- Advocate for data quality, governance, and platform best practices across teams and engagements.

Leadership & Influence
- Establish data engineering standards that lift the quality and consistency of work across the team.
- Mentor junior and mid-level engineers, helping them grow their craft, confidence, and impact.
- Make high-stakes architectural decisions with clear ownership and consideration of long-term tradeoffs.

Qualifications
- 7+ years of professional data engineering experience, with experience leading complex data platform initiatives.
- Strong system architecture background with expertise in distributed data systems.
- Expert proficiency in Python, Scala, and SQL.
- Deep expertise with cloud-native data platforms and enterprise data warehousing.
- Strong expertise in data pipeline orchestration and processing.
- Strong experience with streaming platforms and real-time data processing (e.g., Kafka, Kinesis, Pub/Sub).
- Strong data modeling expertise and experience with data transformation.
- Strong experience with data quality, governance, and compliance frameworks.
- Strong experience with container orchestration and CI/CD for data systems.
- Strong experience building data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows.
- Demonstrated leadership and technical mentoring experience across a team or organization.
- Strong stakeholder communication skills, with the ability to translate technical depth across audiences.
- Demonstrable, day-to-day usage and expert knowledge of AI-forward coding tools such as Claude and Cursor.
- Excellent problem-solving skills and the ability to navigate highly ambiguous technical and business challenges with sound judgment.
- Experience with data mesh or data fabric concepts, lakehouse architectures, or governance framework implementation is a plus.

Helpful Extras and Unique Skills
- Experience with handling and modeling data in the healthcare industry is a plus.
- AWS certifications, like Certified Data Engineer – Associate, strongly preferred.

Benefits
- A doer who sees something broken and fixes it.
- A fast learner who embraces the changing AI landscape.
- Direct in a way that improves the work.
- Obsessed with craft and detail.
- Built for ownership and accountability.
- All in for clients' businesses.
- Resourceful under constraints.
- Glad to collaborate with dedicated team members.

More Data Engineer Jobs

Data Tech Lead

Five Acts

Inspiring people through data.

Data Engineer, 3 days ago

Full Time, Remote, Team 51-200, Since 2005, H1B No Sponsor

Role Description
- Architect scalable, secure solutions following Databricks architecture best practices.
- Develop and implement effective project management strategies, using tools such as Jira and Git to ensure transparency and efficiency.
- Collaborate with product and design teams to deliver solutions aligned with customer expectations and business objectives.
- Identify and resolve complex technical problems while maintaining an agile, collaborative development environment.
- Lead and guide multidisciplinary teams in developing data products, ensuring quality, scalability, and on-time delivery.

Qualifications
- Solid knowledge of and hands-on experience with Databricks architecture.
- Proven experience integrating and implementing data engineering solutions.
- Experience with Jira and Git for version control and project management.
- Strong understanding of security and governance best practices in Databricks, including Unity Catalog.
- Knowledge of relational and non-relational databases such as PostgreSQL, DynamoDB, and Redis.

Requirements
- Résumé screening.
- Conversation with the People Management team.
- Technical interview with team leads.
- Final interview with the client.

Benefits
- Food and meal vouchers (Swile).
- Up to 100% coverage of health and dental plans (Amil).
- Group life insurance.
- Remote work.
- Mental health program with online and in-person psychotherapy.
- Support for certifications and courses.
- Partnership for postgraduate and MBA courses (Esalq/USP).
- Partnerships with language schools.
- Partnerships with gyms and wellness apps (Wellhub).
- Internal talks and discussion circles.
- Referral bonus.
- Happy hours.
- Gifts on commemorative dates.

Company Description

We believe data is the fuel for transforming companies and people. With that in mind, we help our clients create analytical solutions that inspire them to carry out organizational transformations with a direct impact on their business models. At Five Acts, we value diversity, equity, and inclusion. We believe diversity is a driving force for innovation and growth. We therefore do not discriminate on the basis of gender, sexual orientation, religion, age, ethnicity, or any other characteristic. Our opportunities are open to everyone!

Databricks, JIRA, Git, Unity, PostgreSQL, DynamoDB, Redis

Brazil

AI Data Engineer

Emmes Group

Veridix AI is the technology, data, and AI arm of the Emmes Group, a leading full-service contract research organization (CRO) with over 47 years of experience in supporting clinical research across more than 70 countries. With industry-leading capabilities in cell and gene therapy, vaccines, infectious diseases, and ophthalmology, Emmes is one of the top clinical service providers to the U.S. government and is rapidly expanding its presence in biopharma. Veridix AI develops advanced eClinical solutions, powering clinical trials through patient data collection, randomization, biospecimen tracking, and data quality monitoring. Our cutting-edge AI innovations, including Generative AI (GenAI) capabilities, are transforming clinical trial timelines by streamlining processes from document authoring to automating study builds. Our “Character Achieves Results” culture is driven by five key values that guide our actions in the way we conduct research and distinguish us as an organization: Integrity, Agility, Passion for Excellence, Collaborative Partnerships, and Intellectual Curiosity. If you share our motivations and passion in research, come join us!

Data Engineer, 3 days ago

Full Time, Remote

Role Description

The Data Engineer will have a strong background in data engineering and extensive experience with AWS Cloud services. As a Data Engineer, they are responsible for designing, building, and maintaining scalable data pipelines and infrastructure to support our data analytics and business intelligence initiatives.
- Design, develop, and maintain robust data pipelines and ETL processes to ingest, transform, and store data from various sources.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements, design data models, and deliver solutions that meet business needs.
- Automate data workflows and implement monitoring and logging to ensure the health and performance of the data infrastructure.
- Conduct data profiling, cleansing, and validation to ensure high data quality standards.
- Optimize data storage and retrieval performance, ensuring data quality and integrity.
- Implement and manage data architecture on AWS, ensuring scalability, reliability, and security.
- Stay up to date with the latest trends and best practices in data engineering and AWS cloud technologies.

Qualifications
- Bachelor’s or master’s degree in Computer Science, Information Technology, or a related field.
- 3 or more years of related professional experience.
- Experience in data engineering with a strong focus on AWS cloud services.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL, Redshift).
- Experience with AWS services such as S3, Lambda, Glue, EMR, Kinesis, and Redshift.
- Strong programming skills in languages such as Python, Java, or Scala.
- Knowledge of data modeling, ETL concepts, and data warehousing.
- Familiarity with version control systems (e.g., Git) and CI/CD pipelines.
- Excellent problem-solving skills and attention to detail.
- Knowledge of machine learning frameworks and data science workflows.
- Familiarity with data visualization tools (e.g., QuickSight, Qlik).
- Familiarity with NoSQL databases (e.g., DynamoDB, MongoDB).
- Strong collaboration skills with cross-functional teams to establish best design and user flows for applications.
- Strong multitasking, problem-solving, and organizational skills.
- Proven ability to work independently and in a team environment.
- Satisfactory background check required.

Benefits
- Flexible Approved Time Off
- Tuition Reimbursement
- 401k Retirement Plan
- Work From Home Anywhere in the US
- Maternal/Paternal Leave
- Casual Dress Code & Work Environment

United States

Senior Data Engineer – AI Native

Life360

The #1 family safety app 📱

Data Engineer, 3 days ago

Full Time, Remote, Team 201-500, Since 2008, H1B Sponsor

• Design and manage scalable data platforms powering real-time analytics, batch processing, and exploratory analysis, using AI-assisted development as the default workflow, not an afterthought.
• Own the full data lifecycle: ingestion, ETL, storage, and serving, building and iterating on pipelines with AI pair-programming tools (Claude Code) to accelerate delivery.
• Ingest data from diverse sources via both streaming (Kafka, Kinesis) and batch pipelines, unifying them into a consistent, queryable platform.
• Architect medallion-layer data models (Bronze/Silver/Gold) in Databricks, ensuring business needs are met with clean, well-documented schemas.
• Automate, test, and harden data workflows, writing AI-augmented tests, data quality checks, and CI/CD pipelines that catch issues before production.
• Build and maintain AI-ready tooling: craft prompts, custom slash commands, and agent workflows that let the entire team scaffold pipelines, generate documentation, and validate data quality faster.
• Build and improve Databricks Genie chatbots that allow non-technical users to query data using natural language.
• Collaborate with product analytics and data science, applying engineering rigor to messy, unstructured data and transforming it into reliable, production-ready datasets.
• Contribute to infrastructure-as-code (Terraform/Atmos) for provisioning and managing cloud data infrastructure.

Airflow, Apache, AWS, Cloud, ETL, Java, Kafka, Python, Scala, Spark, SQL, Terraform

United States

$103.5K - $192K / year

Lead Data Engineer – Databricks

A leading consulting company whose Intelligent Automation expertise accelerates the way you do business.

Data Engineer, 3 days ago

Full Time, Remote, Team 201-500, H1B Sponsor

• Serve as the technical lead for the project, owning solution design decisions, guiding implementation standards, and mentoring other engineers through coaching, reviews, and knowledge sharing.
• Design, develop, and maintain complex ETL pipelines using Databricks, ensuring scalable, high-performance data integration across multiple source systems.
• Implement and optimize medallion architecture within Databricks, establishing clear data zones (raw, curated, trusted) to support governed, enterprise-wide reporting.
• Develop and refine dimensional data models that enable unified, analytics-ready views of business domains and support automated dashboarding and KPI frameworks.
• Collaborate closely with cross-functional teams (data stewards, IT, business stakeholders) to translate operational requirements into technical solutions, proactively clarifying dependencies and driving alignment.
• Contribute to architectural decisions, leveraging your expertise to recommend best practices, challenge assumptions, and ensure data platform durability and flexibility.
• Identify and address integration challenges, data quality issues, and process bottlenecks early, providing actionable insights and thoughtfully pushing back when project risks or inefficiencies arise.
• Support knowledge transfer and documentation, empowering colleagues and clients to maintain and evolve data solutions independently.

Azure, ETL, Tableau

Argentina
