Job Closed

This listing is no longer active.

Blend360

Optimizing business performance through people, data, tech & analytics

Data Engineer

Full Time, Remote, Senior, Team 501-1,000, H1B Sponsor

Location

Argentina

Posted

70 days ago

Salary

Seniority

Senior

Bachelor Degree, 4 yrs exp, English

Airflow, Apache HTTP Server, PySpark, Apache Spark, SQL, Unity

Job Description

• Design and build production-grade data pipelines in Databricks using Spark/PySpark and SQL.
• Develop and maintain an Analytics ID stitching pipeline using deterministic and probabilistic matching techniques across multiple customer data sources.
• Build and manage modular data marts (Identity, Behavior, Demographics) with independent refresh cadences.
• Implement and maintain a scalable feature store supporting downstream analytics and data science use cases.
• Own the end-to-end data lifecycle: ingestion, transformation, validation, deployment, monitoring, and optimization.
• Develop data quality frameworks including schema drift detection, anomaly monitoring, match-rate validation, and automated deduplication audits.
• Implement CI/CD processes for multi-environment promotion (dev/staging/prod) in Databricks environments.
• Coordinate orchestration workflows and manage dependencies using Databricks Workflows or similar tools.
• Collaborate closely with Data Architects and Client stakeholders to translate business rules into scalable technical solutions.
• Produce comprehensive technical documentation including data contracts, lineage maps, architecture diagrams, and operational runbooks.

Job Requirements

4+ years of experience in Data Engineering building production-grade data pipelines at scale.
Strong hands-on experience with Databricks and Apache Spark (PySpark preferred).
Advanced SQL skills (complex joins, CTEs, window functions, performance tuning).
Experience developing identity resolution or entity matching pipelines (deterministic and/or probabilistic).
Experience designing and implementing data marts or dimensional models (Kimball or similar).
Familiarity with data quality frameworks (schema drift detection, validation, anomaly monitoring).
Experience implementing CI/CD for data pipelines and managing multi-environment deployments.
Strong communication skills and ability to present technical concepts to non-technical stakeholders.
Experience using Jira for ticket tracking and Confluence for documentation.
Nice to Have:
Experience with third-party data providers (Epsilon, LiveRamp, Neustar).
Experience with feature stores (Databricks Feature Store, Feast, or similar).
Knowledge of Databricks Unity Catalog.
Experience managing large-scale customer data (transactions, loyalty, retail/QSR data).
Experience with Delta Lake / Lakehouse architecture.
Familiarity with orchestration tools such as Airflow.
Experience working in consulting or embedded enterprise client environments.
Advanced English level (written and spoken) required for client-facing collaboration and technical presentations.

Benefits

📚 Learning Opportunities:
Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
Access to AI learning paths to stay up to date with the latest technologies.
Study plans, courses, and additional certifications tailored to your role.
Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
English lessons to support your professional communication.
👨🏽‍💻 Travel opportunities to attend industry conferences and meet clients.
👩‍🏫 Mentoring and Development: Career development plans and mentorship programs to help shape your path.
🎁 Celebrations & Support: Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
Company-provided equipment.
⚖️ Flexible working options to help you strike the right balance.
Other benefits may vary according to your location in LATAM.


More Data Engineer Jobs

Engineer IV, Data Engineering

Omnicell

A leader in transforming the pharmacy care delivery model

Data Engineer, 70 days ago

Full Time, Remote, Team 1,001-5,000, H1B Sponsor

Responsibilities:
- Translate business needs and architectural guidance into detailed designs, data contracts, and implementation plans that break down large initiatives into actionable engineering tasks with reliable estimates
- Create detailed pipeline designs covering schemas, transformations, partitioning, DLT configurations, orchestration, error handling, and observability that align with the platform architecture through close collaboration with the Data Architect
- Lead implementation and guide junior engineers on design, coding standards, and best practices
- Develop metadata-driven and configuration-driven pipeline patterns that reduce custom code and improve consistency
- Make technical decisions that ensure reliability, performance, maintainability, and scalability. Ensure production readiness with monitoring, lineage, alerting, observability, CI/CD and documentation
- Define and enforce engineering design patterns, coding standards, testing practices, and operational best practices
- Evaluate and incorporate new technologies and Databricks capabilities that improve reliability, performance, or developer productivity
- Validate new technologies with the Data Architect and operationalize them through documentation, examples, and enablement
- Implement automated data quality checks, rule enforcement, and exception handling
- Provide production support for both an existing and a new platform, including job optimization, incident tracking, and other analysis required for production
- Lead resolution of complex production issues and deliver durable root cause fixes
- Maintain SLAs for reliability, recovery, idempotency, performance, and cost efficiency
- Mentor Level 2–3 engineers through pairing, design guidance, code reviews, and technical coaching

Basic Skills:
• Bachelor’s degree preferred; equivalent experience accepted
• 10+ years in data engineering (12+ without a degree)
• 4+ years building production-grade batch/streaming pipelines using PySpark, Spark Structured Streaming, Python, and SQL
• Proven experience with data governance, schema evolution, data lineage, and secure access patterns
• 2 years’ proven experience maintaining and sustaining data pipelines

Preferred Skills:
• 3+ years hands-on with Databricks (Delta Lake, DLT, Unity Catalog, workflow jobs) within the last 6 years
• Experience building metadata-driven or configuration-driven pipelines
• Experience with data quality frameworks (DQX, Great Expectations, or equivalent)
• Experience with observability, metrics and query performance analysis
• Strong Spark optimization skills

Work Conditions:
• Team collaborative hours between 8am and 4pm EST
• Corporate office/lab environment
• Ability to travel 10% of the time

Python, PySpark, Apache Spark, SQL, Databricks, Unity, CI/CD

United States

Job Closed

Data Engineer Intern

Atlassian

Atlassian is a publicly-traded computer software business specializing in collaboration, development, and issue-tracking software for teams. As an employer, Atlassian maintains a t

Data Engineer, 70 days ago

Internship, Remote, Team 11,000, Since 2012

• influence product teams
• inform Data Science and Analytics Platform teams
• partner with data consumers and products to ensure quality and usefulness of data assets
• help strategize measurement, collecting data, and generating insights
• understand and improve product experience and engagement
• improve efficiency and costs
• guide strategy
• report to a Data Engineering Manager
• learn from your team mentors, who are some of the best Data Engineers in the business.

Python, SQL

Australia

Distinguished Architect, Data Platform

CloudZero

CloudZero is the only cloud cost intelligence platform that connects technical decisions to business results.

Data Engineer, 71 days ago

Full Time, Remote, Team 11-50, Since 2017, H1B No Sponsor

• Define the Data Platform Architecture
• Lead end-to-end technical design for CloudZero's next-generation data platform, from event ingestion and stream processing through hot/cold storage and the query layer to the API surface
• Document architectural decisions, tradeoffs, and migration strategies with the rigor of an RFC-driven process
• Shape and drive every layer of the new architecture: event ingestion, stream processing and enrichment, real-time serving, analytical storage, query layer, and API
• Design and deliver CloudZero's real-time data pipeline from ingestion through enrichment to serving
• Establish SLOs for throughput, latency, and correctness, and build the operational playbooks that make this system trustworthy enough to replace the batch pipelines our entire product currently depends on
• Tackle real-time streaming at scale across thousands of customers simultaneously, with fault tolerance, backpressure awareness, and correctness as non-negotiables
• Redesign CloudZero's dimensional cost model to support high-cardinality, multi-dimensional cost attribution without runaway materialization costs
• Drive incremental, delta-based materialization strategies using modern open table formats, dramatically reducing expensive full-rebuild jobs and unlocking millions in annual infrastructure savings
• Assess CloudZero's current query infrastructure, drive in-flight migrations to completion, and lead the evolution of the query engine layer going forward
• Own performance optimization across partition pruning, predicate pushdown, and query planning, and set the vision for how the query layer grows as data volumes scale 10x
• Evolve CloudZero's proprietary cost attribution engine from a batch-oriented model to one that assigns complex cost dimensions by team, feature, and customer within seconds of resource usage
• Rethink enrichment, data lineage, and correctness guarantees in a streaming context
• Partner with product, infrastructure, and analytics engineering to define a multi-year data platform roadmap
• Build consensus across engineering leadership on foundational investments including table formats, streaming frameworks, query engines, and schema management
• Participate in architecture reviews, contribute to design patterns and best practices, and mentor senior and staff engineers through code review, pairing, and structured feedback
• Make everyone around you better, not by directing, but by raising the collective craft

Apache HTTP Server, Apache Kafka, Apache Spark

California

$275K - $330K / year

Data Engineer

Arrow Components

Data Engineer, 71 days ago

Full Time, Remote, Team 10,001+, H1B No Sponsor

• Design, develop, and maintain scalable data pipelines, ETL/ELT processes, and data integration solutions with moderate independence
• Translate business requirements into technical designs and contribute to solution architecture discussions
• Build and enhance business-to-business (B2B) integrations and internal data processing workflows
• Diagnose and resolve data quality issues, pipeline failures, and performance bottlenecks; propose optimizations
• Develop clear documentation for data models, workflows, and engineering solutions
• Collaborate with cross-functional teams including data architects, analysts, application developers, and business stakeholders to ensure effective data delivery
• Contribute to project workstreams, ensuring tasks are delivered with quality and within timelines
• Adhere to and promote engineering best practices, coding standards, and data governance guidelines

ETL, Java, Python, SQL

Mexico

$69.4K - $80K / year
