Senior AI Data Engineer

Location

United States

Posted

2 days ago

Salary

$140K - $180K / year

Seniority

Senior

Job Description

Senior AI Data Engineer

Accertify, Inc.

Role Description

We are seeking an experienced Senior AI Engineer to lead the design, development, and deployment of intelligent systems that drive business impact. You will own end-to-end AI solutions, set technical direction, and mentor other engineers, working across machine learning, LLMs, retrieval systems, agentic workflows, and the data layers that feed them. This role requires deep technical expertise and a track record of shipping production AI systems at scale.

- AI Architecture: Architect AI systems end-to-end. Set technical direction. Review and mentor others’ designs.
- LLM Strategy: Define how we use LLMs: routing, agentic orchestration, prompt patterns, fine-tune vs. prompt trade-offs, and model selection.
- Production Ownership: Own deployment patterns across on-prem and cloud (Kubernetes, container infrastructure, secure integration). Drive operational excellence.
- Data Architecture: Consult on the data layer: medallion design (bronze/silver/gold), RAG retrieval stores, schema evolution, drift detection, and query performance at scale.
- Evaluation Framework: Define how we measure AI quality, use golden datasets, conduct judge-based eval, enable production observability (OpenTelemetry, Phoenix, or similar), and perform behavior testing.
- Technical Leadership: Mentor AI Engineers, lead code and design reviews, set engineering standards for the AI team, and drive cross-team initiatives with Data Science and Data Engineering teams.
- Security & PII: Set the bar for handling sensitive customer data, secure system design, authentication, PII safety, and compliance-aware patterns.
- Collaboration: Partner with product managers, data scientists, engineers, and stakeholders to align AI solutions with business needs.

Qualifications

- Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field (or equivalent experience).
- 5+ years of production software experience.
- Strong proficiency with AI/ML frameworks (OpenAI SDK, Hugging Face transformers, agentic orchestration frameworks).
- Strong Python AND strong SQL; both are first-class requirements. Comfortable with query optimization, indexing strategy, and data modeling.
- Demonstrated experience designing and operating medallion architectures (bronze/silver/gold) at scale.
- Experience handling data at scale: partitioning, batch pipeline design, query performance, and schema evolution.
- Production LLM experience: prompt engineering, retrieval (RAG), agentic orchestration, and fine-tuning trade-offs.
- Experience with vector databases (pgvector, Pinecone, Weaviate, etc.) and production RAG patterns.
- Experience with AI observability and evaluation frameworks: OpenTelemetry, Phoenix or equivalent, golden-set eval, and judge-based eval methodology.
- Strong understanding of secure system design, authentication, API integrations, and PII-sensitive data handling.
- Track record of mentoring engineers and shipping cross-team initiatives.
- Strong problem-solving skills and ability to explain technical solutions to non-technical audiences.

Preferred Skills

- Hands-on experience with Model Context Protocol (MCP), building tools, OAuth flows, or MCP server integrations.
- Experience leading data engineering or data platform initiatives end-to-end.
- Experience with agentic AI frameworks (e.g., the OpenAI Agents SDK) in production.
- Experience deploying production systems across both cloud (AWS, GCP, Azure) and on-prem environments.
- Experience in fraud detection, financial services, or other PII-sensitive domains.
- Knowledge of MLOps tools and CI/CD for AI.
- Open-source contributions, conference talks, or technical writing in the AI/ML space.

Benefits

- Health & Wellness: Medical, dental, and vision coverage for you and your family.
- Time Off: Paid time off, holidays, and personal days to maintain work-life balance.
- Financial Growth: Competitive compensation, performance-based rewards, and local retirement or savings plans where applicable, along with financial education resources.
- Career Development: Training programs, mentorship opportunities, and growth potential within the company.
- Wellness Support: Mental health resources, fitness perks, and wellness programs.
- Extras & Perks: Commuter benefits, employee discounts, and company-sponsored events.

Additional Details

- Potential Salary Range: $140-180k. The range displayed on each job posting reflects the minimum and maximum target salaries for the position across all US locations.
- This role is eligible for remote US-based (work-from-home) locations. Chicago-based candidates would work in a hybrid capacity from the Accertify HQ located in Itasca, IL.
- Visa Sponsorship: Employment eligibility to work for Accertify in the U.S. is required, as Accertify will not pursue visa sponsorship for this position.

More Data Engineer Jobs

Staff Data Engineer

Jellyfish

Your Platform to Perform

Data Engineer | 2 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2017 | H1B No Sponsor

About Jellyfish

Jellyfish is rewriting the manual on how high-performing engineering teams actually work within an AI world. We're seeking a Staff Data Engineer who's passionate about building and scaling robust data infrastructure, defining integration patterns, and driving technical strategy across large-scale systems. Ideally, you combine deep hands-on expertise with the ability to influence architecture and direction across teams. This is a high-agency role where you'll help shape not just what gets built, but how we build it, and raise the bar for the engineers around you.

What You'll Do

- Define the architecture and long-term technical direction, design, and implementation of our data pipelines and integrations platform for storage, transformation, and export at scale.
- Drive integrations with third-party tools and establish the standards that make future integrations faster and more reliable.
- Own our infrastructure as code strategy using Terraform, and contribute to how we evolve our cloud data infrastructure.
- Architect workflow orchestration systems for near-real-time and batch data processing, with a focus on reliability and scalability.
- Lead the design and implementation of data export pipelines that serve diverse customer needs.
- Spearhead development of internal tooling and agentic workflows that meaningfully accelerate engineering velocity across the org.
- Serve as a technical anchor: leading design reviews, mentoring engineers, and elevating how the team approaches complex problems.
- Partner with engineering leadership on roadmap prioritization, cross-team dependencies, and org-wide technical strategy.
- Communicate clearly about technical decisions, tradeoffs, and project status to both engineers and non-technical stakeholders.

About You

- You have deep expertise building and scaling data pipelines, and a track record of owning complex data infrastructure end-to-end, not just contributing to it.
- AI-driven developer who dives deep into AI-first workflows.
- Expert Python engineer with strong opinions about how to build reliable, maintainable systems at scale.
- You have designed and led large-scale third-party API integration work, and you know what separates a maintainable integration from a brittle one.
- Fluent in infrastructure as code (Terraform, CloudFormation, or similar) and think about infrastructure as a product, not an afterthought.
- Experienced with AI/LLM tool integrations (Claude, Copilot, etc.) and understand the unique infrastructure demands they create.
- You have production experience with workflow orchestration tools (Prefect, Airflow, Dagster) and can make the hard architectural calls.
- A natural technical leader: you build consensus, unblock others, and make the engineers around you better.
- You think in systems: you consider user needs, business impact, and long-term maintainability when designing solutions.
- Exceptional communicator who can write a design doc, lead a review, and distill a complex tradeoff for any audience.
- Located in EST or CST time zone.

Bonus points if:

- You've thrived at a rapidly scaling startup and know how to balance speed with technical rigor.
- You have hands-on experience with Delta Lake and lakehouse architectures.
- You've built deep integrations with developer tools (Jira, GitHub, GitLab, CI/CD systems) and have strong opinions about how to do it well.
- You bring strong perspectives on how engineering teams work best, and the tools that help or hurt.

A list of job experiences and qualification requirements is great, but humility, a performance-driven attitude, and a team-player approach are most important to us. We love to have fun and win in the process. We only hire people who have a passion for building great companies in an environment where a sense of humor is a must.

Occasional travel may be required. Applicants must be authorized to work for any employer in the US. We are unable to sponsor or take over sponsorship of an employment visa at this time.

Let’s talk about us!

This is all about you, but you want to know a little about us. Jellyfish enables leaders to effectively build AI-integrated engineering teams, align engineering decisions with business initiatives, and deliver the right software efficiently and on time. AI tools alone won’t transform your org; Jellyfish shows you what’s working, what’s not, and how to build high-performing teams that know how to use AI the right way.

United States
$200K - $260K / year

Data Intern

Zymewire

The leading sales intelligence tool for biopharma service providers

Data Engineer | 2 days ago
Contract | Remote | Team 11-50 | H1B No Sponsor

Role Description

Data lovers, unite! Lumerate is searching for an awesome post-secondary student to join us for an 8-month internship on our Zebricks Data Team! In this role, your main responsibilities will be to:

- Review and analyze news content from patient advocacy groups.
- Extract insights and bring them into Zebricks to be made actionable for our users through AI, data, and machine learning processes.
- Work closely with the rest of the Zebricks team on a number of tasks and projects.
- Help shape data best practices to position the brand for long-term success.

Qualifications

- You enjoy hunting for information on the web.
- You're detail-oriented.
- You have an interest in data, both qualitative and quantitative.
- You're highly organized and comfortable working independently.
- You're very comfortable using Excel/Google Sheets.
- You're super curious about everything; social media listening experience is a bonus.
- You consider yourself to be tech-savvy and excited about working for a software company; Python experience is a bonus.
- Your enthusiasm is contagious!

Requirements

- Registered or enrolled at a Canadian post-secondary education institution and can provide proof of full-time or part-time enrolment during placement.
- A Canadian citizen, permanent resident, or protected person.
- Legally entitled to work in Canada, including specific province legislation and regulations.

Benefits

- Salary: $3,200 - $3,500 per month.
- Working Hours: 9am - 5pm EST, Mon-Fri.
- Location: Remote, anywhere in Canada.
- Start Date: August 31st, 2026.
- End Date: April 30th, 2027.

Company Description

Lumerate helps our customers achieve the full picture of their industries. Our mission is to deliver industry awareness to an ever-increasing number of people, in whatever way helps them to make the most informed decisions, take the most immediate action, and be the most awesome at their unique jobs.

Zebricks delivers patient advocacy intelligence to life-science industry professionals, helping them access the full picture of the patient group universe to drive mutually beneficial co-creation opportunities, relevant stakeholder mapping, and informed decision-making.

Canada
$3.2K - $3.5K / month

Role Description

Be Neoage's leading technical authority on Artificial Intelligence, connecting data, models, automations, and products to turn AI into a competitive advantage for the business. As the first person dedicated to this area, you will be responsible for identifying high-impact opportunities, building intelligent applications, agents, and AI-based automations, and structuring the architecture needed to support the company's evolution. Your challenge will be to turn data into strategic assets, accelerating innovation, increasing operational efficiency, and contributing to the creation of intelligent products that drive Neoage's growth. Working in partnership with the Product, Technology, Growth, and Monetization teams, you will help build the technical foundation that will allow Neoage to scale its use of Artificial Intelligence across the entire organization.

Responsibilities

- Act as the leading technical authority on Artificial Intelligence applied to the business.
- Identify opportunities to use AI with meaningful impact on revenue, productivity, user experience, and operational efficiency.
- Develop intelligent agents, automations, and LLM-based applications for internal and external use.
- Design and implement RAG (Retrieval-Augmented Generation), embedding, and semantic search architectures.
- Integrate AI models with internal systems, APIs, databases, and business tools.
- Build and evolve data pipelines that support AI applications, advanced analytics, and automations.
- Structure and evolve the company's data platform, ensuring quality, scalability, and reliability.
- Develop integrations between multiple data sources and corporate systems.
- Implement monitoring, observability, and evaluation mechanisms for AI applications.
- Ensure the quality, security, governance, and performance of the data used by intelligent solutions.
- Collaborate with Product, Technology, and Business teams to turn complex problems into scalable solutions.
- Document processes, share knowledge, and contribute to the growing maturity of AI and Data in the organization.
- Support the development of the long-term vision for AI at Neoage.

Qualifications

- Hands-on experience with Generative AI, LLM-based applications, intelligent agents, RAG, embeddings, or AI-based automations.
- Experience in Data Engineering, Software Engineering, Analytics Engineering, AI Engineering, or related fields, with a track record of building scalable, data-driven solutions.
- Strong foundation in AI applied to data: an understanding of how data is consumed by AI applications, automations, and decision systems, and the ability to structure data for those flows.
- Hands-on experience with Generative AI: LLMs, embeddings, vectorization, or modern architectures for serving data to AI.
- Experience developing applications, agents, or automations integrated with AI models, using LLM APIs and modern frameworks.
- Advanced SQL and Python, with experience in ETL/ELT, data modeling, and Data Lake/Lakehouse architectures.
- Orchestration of pipelines and automation flows (Airflow, Dagster, n8n, Make, or similar), with a focus on monitoring, observability, and reliability.
- Ingestion via APIs (REST/Webhooks) and integration with multiple sources and platforms.
- Cloud (preferably AWS: S3, etc.) and distributed query engines (Trino, Presto, Athena, or similar).
- Familiarity with analytics and data visualization tools (Looker Studio, Metabase, Power BI, or similar) and, ideally, martech/adtech stacks (Google Ads, Meta, CleverTap, Braze, GA4, etc.).
- Advanced English for technical reading, working with documentation, and interacting in international environments.

Differentials

- Experience building intelligent agents and multi-agent flows.
- Experience with LangChain, LangGraph, CrewAI, LlamaIndex, or similar frameworks.
- Experience with vector databases and semantic search engines.
- Knowledge of MLOps, model evaluation, and monitoring of AI applications.
- Experience with advanced automations using n8n, Flowise, Langflow, or similar.
- Knowledge of dbt, Airbyte, Apache Iceberg, OpenMetadata, Databricks, or Kubernetes/EKS.
- Experience building and evolving Customer Data Platforms (CDP).
- Experience at martech, adtech, digital media, CRM, or performance marketing companies.
- Experience in high-autonomy, fast-growth environments and in building products from scratch.
- Track record of implementing AI solutions that generated measurable business impact.

Work Model

- Remote

Contract Type

- Independent contractor via a legal entity (Pessoa Jurídica, PJ)

Brazil

Data Engineer

VTEX

VTEX (NYSE: VTEX) is the composable and complete commerce platform that delivers more efficiency and less maintenance to organizations seeking to make smarter IT investments and modernize their tech stack. Through our pragmatic composability approach, we empower brands, distributors, and retailers with unparalleled flexibility and comprehensive solutions. VTEX is trusted by 2,400 global B2C and B2B customers, including Carrefour, Colgate, Motorola, Sony, Stanley Black & Decker, and Whirlpool, having 3,400 active online stores across 43 countries (as of FY ended on December 31, 2024). Founded in the year 2000, VTEX has a history of being unstoppable, leading a high-tech industry and positioned above market giants. We are building an extraordinary future with more than 1,300 employees across 25 locations in 16 countries in Latin America, North America, Europe, and Asia. At VTEX, you will work in a challenge-driven environment and collaborate with amazing peers. If you are powerful individually, join us, and we will be unstoppable together.

Data Engineer | 2 days ago
Full Time | Remote | Team 1,001-5,000

Role Description

As a Data Engineer on the Data Platform team, you'll help design, build, and evolve the data infrastructure that powers analytics, AI, and machine learning across VTEX. This is a hands-on, mid-level role: you'll own features end-to-end, from ingestion and processing to storage and consumption, taking on problems that come with real ambiguity and delivering them with growing independence.

We're not looking for someone who arrives knowing everything. We're looking for someone with a strong engineering foundation and a high ceiling: a fast learner with sharp problem-solving instincts who is energized by a data platform going through a deep transformation of its architecture and the way it's built.

Qualifications

- Think like a platform builder.
- Comfortable with modern data architectures: data warehouses, data lakes, and data lakehouses.
- Proficient in SQL and in Python and its data-processing ecosystem.
- Experience with cloud data platforms (AWS preferred; GCP/Azure welcome).
- Data-driven mindset.
- Proficient with AI assistants and code-generation tools.
- Clear communication in writing: specs, docs, and design notes.

Requirements

- Own a piece of the platform from design through to production with minimal supervision.
- Understand the trade-offs that shape a platform others depend on.
- Produce work that is idempotent, reproducible, and documented.
- Define how the impact of your work will be measured before you build it.
- Design solutions that thoughtfully consider when and how AI should be used.
- Collaborate well across engineering and non-engineering peers.

Benefits

- Annual profit-sharing program and equity eligibility.
- Health, dental, and life insurance with national coverage provided by VTEX.
- Annual budget for professional development in Tech.
- Language development incentive program (English, Spanish, Portuguese).
- Flexible meal allowance.
- Extended parental leaves.
- Child-care assistance.
- Flexible work schedule and remote-first culture.
- Financial assistance to build your work-from-home setup.
- Wellness program.
- Free shipping on 1000+ VTEX stores.

Worldwide