Job Closed

This listing is no longer active.

Senior Data Engineer

Location

United States

Posted

33 days ago

Salary

0

Seniority

Senior

Job Description

Senior Data Engineer

Growth Protocol

Role Description

We are searching for an ambitious go-getter who welcomes the challenge of meeting the needs of a hyper-growth startup. As a Sr. Data Engineer, you will be at the heart of Growth Protocol’s data infrastructure, playing a foundational role in building the systems that power our AI platform. Your work will directly influence product features, client outcomes, and strategic business decisions. You will collaborate with Data Scientists, Backend Engineers, Client IT, and business stakeholders to build and maintain scalable pipelines that serve billions of rows of structured and unstructured data weekly, enabling high-impact insights across multiple industries.

Objectives of the Role

- Collaboration
  - Work closely with Data Scientists to translate business and ML requirements into robust data workflows.
  - Ensure timely delivery of clean, reliable data to support model development and production features.
- Technical Development
  - Engineer and manage scalable ETL architecture using Airflow, Snowpark, Cloud Run, and Apache Beam.
  - Design and implement a high-performance data infrastructure for seamless processing and integration.
  - Extract data from diverse online platforms.
  - Operationalize machine learning models, focusing on deployment, reliability, and performance.
- Data Connectivity
  - Partner with client IT teams to identify the most efficient and secure methods for data ingestion including Snowflake Sharing, Databricks Delta Sharing, Private Link, and VPN.
  - Work alongside the Platform Engineering team to define requirements for secure networking paths that support high-performance data transfers.
  - Perform end-to-end testing of client connections to ensure data integrity and connectivity.
  - Integrate customer databases with our platform.
- Monitoring and Reliability
  - Create and manage real-time monitoring systems for data ingestion and transformation pipelines.
  - Proactively identify and resolve issues to maintain high levels of system reliability and data integrity.

Qualifications

- 5+ years of experience in Data Engineering.
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- Experience building data pipelines with robust unit and integration testing.
- Proficiency in distributed computing frameworks including Apache Beam and Spark.
- Functional understanding of enterprise networking including VPC peering, Private Link, and VPNs, with the ability to troubleshoot connectivity in a cloud environment.
- Hands-on experience operationalizing ML models in production.
- Familiarity with ML/AI, NLP, and Data Science workflows including MLFlow.
- Deep understanding of ETL workflows, data modeling, and data architecture.
- Strong debugging and problem-solving skills.
- Excellent communication skills and experience collaborating across teams.

Preferred Qualifications

- Experience working on enterprise products serving Fortune 500 clients across Financial Services, Industrials, and Consumer Products.
- Prior startup experience.
- Interest in current events, market dynamics, and emerging technologies.
- Experience creating Agent Skills.
- Familiarity with APIs and web scraping for data collection.
- Familiarity with Graph Databases.

Tech Stack

- Languages: Python, TypeScript
- Frameworks: Apache Beam, Spark, FastAPI, Airflow
- Cloud: Google Cloud Platform
- Data: Elasticsearch, Snowflake, Databricks, Neo4J, PostgreSQL, MongoDB, GCS
- Infrastructure and DevOps: Docker, Terraform, GitHub Actions, Cloud Run
- Frontend: Next.js

Benefits

- Competitive compensation and equity in a rapidly growing company.
- 100% company-paid health, dental, and vision insurance plus 401(k).
- Pet-friendly office.

More Data Engineer Jobs

Role Description

We are Corposostenibile, the leading online Integrative Nutrition center in Italy, growing rapidly in the health & wellness sector, with a team of more than 150 professionals. We are building a proprietary technology platform with a strong data-driven component: collecting, normalizing, and analyzing data from multiple systems (CRM, marketing, product, AI). We are looking for a Senior Data Engineer responsible for designing and building our data platform on GCP. The focus is not only analysis: the role involves building the system end to end, from data acquisition to generating the dashboards used across the company. You will have direct ownership of how data is collected, structured, and made available to the business.

Working arrangement

- Fully remote (Europe)
- Asynchronous collaboration plus periodic alignment meetings
- Required working-hours overlap: GMT-1 / GMT+3

Responsibilities

- Design and develop a data lake on GCP (Cloud Storage + BigQuery)
- Build data ingestion pipelines from heterogeneous sources:
  - APIs (CRM, marketing tools, external SaaS)
  - Internal databases
  - Application events
- Develop scalable, monitorable ETL/ELT pipelines
- Model data for analytics (star schema, semantic layer)
- Ensure data quality, consistency, and reliability
- Expose data through a query layer, APIs, or BI tools
- Develop dashboards for business and operations
- Optimize cost and performance on BigQuery
- Define standards and best practices for data management

Technology Stack (GCP)

- Storage: Google Cloud Storage (data lake)
- Data Warehouse: BigQuery
- Processing: Python (batch jobs), SQL
- Orchestration: Cloud Composer (Airflow) or equivalent
- Ingestion/Events: Pub/Sub (nice to have)
- BI: Looker / Metabase / Superset
- Infra: Docker, CI/CD

Qualifications

- 5+ years of experience in data engineering or backend engineering with a strong data component
- Experience designing data lakes or data warehouses
- Excellent knowledge of Python and SQL
- Experience with ETL/ELT pipelines and integrating data from external APIs
- Data modeling for analytics (fact tables, dimensions, etc.)
- Experience with relational databases (PostgreSQL or similar)

Requirements

- Experience with distributed or scalable systems
- Code versioning (Git)
- Experience with Docker and CI/CD pipelines
- Familiarity with workflow orchestration (Airflow or equivalent)

Nice to have

- Experience with streaming/event-based systems
- Experience with BI tools (Metabase, Superset, Looker, etc.)
- Experience with tracking systems (event analytics)
- Experience in startup / high-growth environments

Soft Skills

- Strong autonomy and sense of ownership
- Ability to work on loosely defined problems
- Pragmatic, results-oriented approach
- Clear communication with technical and non-technical teams

Benefits

- Full ownership of the data platform
- Direct impact on business decisions
- Fast-moving environment with no bureaucracy
- Direct collaboration with technical leadership
- Fully remote within Europe
- Compensation commensurate with the candidate's experience and skills, open to discussion during the interview.

Europe
Job Closed

Role Description

Clear Fracture is building AI-driven data integration systems that enable organizations to connect, transform, and reason over complex data using agentic workflows. Our platform operates across cloud and on-prem environments and is designed to support multi-tenant, production-scale use cases. We are looking for a Data Engineer who operates as a software engineer first, with strong experience in data modeling and data systems. You will play a key role in building the core data layer that powers our agentic platform: designing schemas, implementing data services, and enabling reliable, scalable data flows. In addition to building core data infrastructure, you will also develop real use cases on the platform itself, helping shape how users interact with data. This includes designing data interfaces, abstractions, and tooling that make it easier to understand, model, and work with data across the system. This is not a traditional ETL-only role. You will write production code, design systems, and help define how data is represented, accessed, and understood across the platform.

Qualifications

- Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent practical experience.
- 6+ years of professional experience in software engineering and/or data engineering roles.
- Due to the nature of the work, U.S. Citizenship and the ability to obtain a Secret Clearance are required.
- Strong programming skills in Python (or similar backend language).
- Experience designing and implementing data models for production systems, with advanced knowledge of dimensional modeling topics like slowly changing dimensions and entity relationship diagrams.
- Proficiency in SQL and experience with relational databases (e.g., PostgreSQL).
- Experience building backend services or APIs that interact with data systems.
- Experience designing and operating data pipelines (ETL/ELT).
- Familiarity with NoSQL databases and different data storage paradigms.
- Experience working with large datasets and performance optimization.
- Experience with Docker and containerized development workflows.
- Familiarity with Kubernetes-based environments.
- Strong understanding of software engineering fundamentals (testing, version control, system design).

Requirements

- Design and implement logical and physical data models for complex, evolving datasets.
- Define schemas and access patterns that support multi-tenant usage and application-level workflows.
- Balance normalization, performance, and flexibility across different storage systems.
- Partner with product and engineering teams to translate requirements into scalable data designs.
- Develop real-world data use cases on top of the platform to validate and extend its capabilities.
- Design and build data interfaces and abstractions that help users understand and work with data.
- Contribute to systems such as data glossaries, semantic layers, and metadata and schema discovery tools.
- Help define how users explore, model, and interact with data within the platform.
- Translate complex data structures into intuitive, usable representations.
- Build backend services and APIs that expose and operate on data models.
- Implement data access layers that are reliable, maintainable, and performant.
- Contribute to core application architecture where data and services intersect.
- Write clean, testable, production-grade code.
- Design and implement pipelines for ingesting, transforming, and validating data.
- Support both batch and near-real-time processing workflows.
- Build systems that handle structured, semi-structured, and unstructured data.
- Enable data flows that support AI-driven and agent-based workflows.
- Work with embeddings, context retrieval, and data representations used in modern AI systems.
- Help design systems that make data accessible and useful for autonomous agents.
- Implement validation, monitoring, and testing for data systems.
- Ensure correctness, consistency, and observability of data pipelines and services.
- Diagnose and resolve data-related issues in production environments.

Benefits

- Engineering mindset: You approach data systems as software systems, not just pipelines.
- Data intuition: You understand how to model real-world complexity into clear, usable structures.
- Product thinking: You care about how users interact with and understand data, not just how it is stored.
- Systems thinking: You see how data flows through services, APIs, and AI systems.
- Ownership: You take responsibility for the reliability and usability of what you build.
- Pragmatism: You balance ideal design with real-world constraints.
- Collaboration: You work effectively across engineering disciplines.

United States
$120K - $160K / year
Director, Data Engineering – Automation

Empower

We are an equal opportunity employer with a commitment to diversity. All individuals, regardless of personal characteristics, are encouraged to apply. All qualified applicants will receive consideration for employment without regard to age, race, color, national origin, ancestry, sex, sexual orientation, gender, gender identity, gender expression, marital status, pregnancy, religion, physical or mental disability, military or veteran status, genetic information, or any other status protected by applicable state or local law.

Data Engineer | 34 days ago
Full Time | Remote | Team 10,001+ | H1B Sponsor

• Lead a team of data engineers transforming data from disparate systems to enable insights and analytics for business stakeholders.
• Create technical roadmaps and recommend strategies for data pipelines and integration.
• Leverage cloud-based infrastructure to implement scalable, resilient, and efficient data engineering solutions.
• Collaborate with data analysts, data scientists, database administrators, cross-functional teams, and business stakeholders to solve problems.
• Influence architectural decisions and design patterns across the data platform.
• Provide technical leadership across the software development lifecycle, from design to deployment, including hands-on contribution.
• Develop project plans, facilitate prioritization timelines, allocate resources, and take ownership of assigned technical projects in a fast-paced environment.
• Perform code reviews and ensure data engineers follow best-practice coding standards.
• Define and validate test cases to ensure data quality, reliability, and a high level of confidence.
• Continuously improve quality, efficiency, and scalability of data pipelines, reducing gaps and inconsistencies.

United States
$138K - $200.1K / year
Job Closed
Full Time | Remote | Team 1,001-5,000 | Since 1966 | H1B No Sponsor

• Own and deliver impactful data products within WSI’s medallion architecture.
• Transform raw and conformed data into governed, high-quality datasets for analytics, AI, and operational use.
• Design, build, and optimize data solutions on Microsoft Fabric, including pipelines, Lakehouse/Warehouse structures, PySpark notebooks, and semantic models.
• Evolve and implement data architecture patterns (medallion, SCD, CDC, orchestration, CI/CD), adapting them to real-world scale, performance, and business needs.
• Ensure data quality, observability, and performance at scale.
• Implement validation frameworks, monitoring, SLAs, and cost-optimized storage and compute strategies.
• Partner with stakeholders to translate requirements into reusable, scalable data models and curated data products.
• Drive consolidation of legacy reporting and BI tools into a unified, governed analytics platform.
• Embed security, governance, and best practices, including role-based access, cataloging, and release management.
• Act as a technical leader, contributing to standards, mentoring peers, and elevating overall engineering quality.
• Leverage modern dev tools (e.g., GitHub Copilot or similar) to accelerate delivery and engineering efficiency.

Wisconsin
$125K - $200K / year