Dyson

Data Engineer III

Data Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1991 | H1B Sponsor

Location

California

Posted

61 days ago

Salary

$104K - $153K / year

Seniority

Senior

Bachelor Degree | 5 yrs exp | English

AWS Cloud Docker ETL Heroku Kubernetes Python SDLC Spark SQL Terraform Unity

Job Description

• Lead architecture and design of complex data pipelines on Databricks lakehouse architecture (Unity Catalog, Delta Lake, Structured Streaming)
• Define technical approach for data engineering initiatives, mentor less-senior engineers, and set standards for code quality through leadership and code reviews
• Design and build data foundations that enable AI/ML capabilities: feature stores, embedding pipelines, vector search indexes, and model training datasets
• Align data engineering solutions with business strategy, including support for Agentic AI workloads
• Own health, scalability, and modernization of data infrastructure with Databricks as the strategic platform, including workload migration, compute optimization, and Unity Catalog adoption
• Optimize pipeline performance (Delta Lake table layouts, clustering, Z-ordering) and establish monitoring/alerting best practices with clear SLAs
• Build data infrastructure supporting Agentic AI systems: real-time data access layers, context retrieval pipelines, and agent-accessible data services
• Collaborate cross-functionally with DevOps, Platform Engineering, and MLOps roles to integrate data solutions into the broader technology environment and shared AI infrastructure: MLflow registries, feature stores, and agent orchestration layers
• Provide consultation to Senior Leadership on complex projects and drive continuous improvement initiatives
• Champion data governance at all layers for data, models, and AI assets
• Implement data quality strategies (master data management, validation rules, Delta Live Tables expectations) to ensure trust in enterprise data
• Serve as liaison across data engineering, AI engineering, and business teams; promote data literacy and stewardship

Job Requirements

Bachelor's in Computer Science, Engineering, or related field (Master's preferred)
5+ years with Python and SQL in data engineering for big data ML/analytics workloads
5+ years designing, building, and troubleshooting scalable ETL/ELT pipelines for business-critical production systems
3+ years with cloud data services (AWS), container orchestration (Docker, Kubernetes), and IaC (Terraform, CloudFormation)
3+ years architecting ML workflows and data platforms with CI/CD, automated testing, and distributed processing (Spark)
3+ years collaborating cross-functionally with Data Science, MLOps, Platform Engineering, and DevOps teams
3+ years implementing data quality testing and optimizing SQL/Python for cost/performance in the cloud
Understanding of the full Data Science SDLC, and experience mentoring engineers

Strongly Preferred: Databricks & AI Platform

2+ years hands-on with Databricks (Delta Lake, Unity Catalog, Databricks SQL)
Experience with MLflow experiment tracking and model registry workflows
Experience designing pipelines that serve AI/ML inference — real-time feature engineering, embedding generation, and context retrieval for LLM-based systems
Understanding of how data engineering supports Agentic AI: agent-accessible data services, low-latency retrieval, and pipelines enabling autonomous multi-step workflows
Familiarity with Databricks Mosaic AI, Vector Search, and/or Feature Store
FinOps awareness — compute cluster optimization, cost attribution by workload
Familiarity with Salesforce/Heroku data infrastructures
Experience with data virtualization (e.g., Dremio)
Understanding of Platform Engineering concepts and internal developer platforms
Experience migrating from legacy data warehouse/lake to unified lakehouse architecture
Familiarity with Odaseva data security and management

Benefits

group health insurance benefits (medical, vision, dental)
FSA and HSA healthcare accounts
life and accident insurance
adoption and fertility assistance
paid parental leave of up to 6 weeks
short/long term disability
paid time off for vacation, personal needs, and sick time
up to 17 days of Choice Time Off (CTO) per calendar year
up to 11 paid holidays per calendar year
opportunity to contribute to company's 401(k) savings and investment plan or deferred compensation plan with an employer match of 100% on the first 3% of contributions

More Data Engineer Jobs

Senior Azure Data Engineer

DATAMAXIS, Inc

Datamaxis is a WMBE corporation committed to providing IT services to commercial and government organizations.

Data Engineer | 61 days ago

Contract | Remote | Team 51-200 | H1B No Sponsor

• Design and build robust, reusable, parameter-driven ingestion and transformation pipelines using Azure Data Factory
• Implement medallion architecture on Azure Data Lake Storage Gen2
• Build performant ELT workflows that leverage pushdown to source systems
• Develop and optimize PySpark notebooks and jobs on Azure Databricks or Synapse Spark
• Design dimensional models and data vault patterns for analytics consumption
• Implement Slowly Changing Dimensions and Change Data Capture
• Tune distributed SQL workloads in Synapse Dedicated SQL Pool
• Implement CI/CD for data pipelines using Azure DevOps
• Instrument pipelines with robust logging and monitoring
• Lead or contribute to legacy-to-cloud migrations

Azure Cloud Pandas PySpark Python Spark SQL Vault

Illinois

Senior Data Engineer – Financial Transactions, Automation

NVIDIA

Data Engineer | 61 days ago

Full Time | Remote | Team 10,001+ | Since 1993 | H1B Sponsor

• Architect event-driven pipelines (Kafka) and develop new data models that ensure transactional integrity (ACID) for commercial events like invoices, payments, and adjustments
• Automate scalable ETL processes and refactor next-generation data architectures to improve quality, security, and coverage for rapidly growing business demands
• Collaborate across teams to codify business processes into self-measuring systems, debugging complex challenges to ensure the reliability of financial operations

Apache AWS Azure Cloud Distributed Systems Docker ETL Google Cloud Platform Kafka Kubernetes Linux Node.js PostgreSQL Python Scala Go

California

$184K - $287.5K / year

Senior Data Engineer

OpenTeams

OpenTeams is your single source for everything open source.

Data Engineer | 61 days ago

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Design and build ETL processes in collaboration with software and model development teams.
• Create and maintain scalable data infrastructure.
• Own full pipeline and infrastructure lifecycle including performance monitoring and optimization.
• Maintain and improve existing pipelines, ensuring stability over existing requirements and adapting to new needs.

AWS Cloud ETL Google Cloud Platform Python SQL

Texas

$145K - $200K / year

Senior Oracle GoldenGate Data Engineer

Compass

Data Engineer | 61 days ago

Full Time | Remote | Team 10,001+ | H1B Sponsor

• Design, develop, and maintain scalable data pipelines focused on ingestion via CDC (Change Data Capture) using Oracle GoldenGate;
• Configure and manage real-time and near-real-time data replication between source systems and cloud environments;
• Ensure data consistency, integrity, and synchronization between source and target systems;
• Monitor ingestion pipelines, perform troubleshooting, and optimize CDC process performance;
• Support full and incremental (delta) load strategies;
• Develop and maintain data processing pipelines using Azure and Databricks (Spark);
• Implement transformations following modern data architecture patterns using Bronze, Silver, and Gold layers;
• Optimize pipelines for performance, scalability, and cost efficiency;
• Work with structured and semi-structured data for analytical consumption, reporting, and AI/ML initiatives;
• Collaborate with data architects to define modern Lakehouse architectures;
• Support data governance, data catalog, lineage, and compliance initiatives;
• Ensure data availability, reliability, security, and quality for downstream consumption.

Azure Cloud Oracle Spark

Brazil

Job Closed
