Job Closed

This listing is no longer active.

Cloudera

At Cloudera, we believe that data can make what is impossible today, possible tomorrow.

Senior Data Architect

Data Engineer | Full Time | Remote | Senior | Team 1,001-5,000 | Since 2008 | H1B Sponsor

Location

Costa Rica

Posted

58 days ago

Salary

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | Hadoop HDFS PostgreSQL Python SQL

Job Description

• Design and implement scalable data warehouse and lakehouse architectures on the Cloudera platform.
• Define enterprise data models, governance frameworks, security standards, and data quality practices.
• Architect and optimize analytics solutions across SQL engines including Impala, Hive, and Iceberg.
• Design AI-powered analytics solutions leveraging LLMs, Retrieval-Augmented Generation (RAG), vector databases (such as PostgreSQL, Qdrant, Milvus), and NLQ capabilities.
• Lead the integration of AI/ML capabilities into enterprise data platforms and data pipelines.
• Leverage vibe coding / AI-assisted development tools to accelerate development and improve productivity.
• Build and optimize batch and near real-time data pipelines.
• Collaborate with business stakeholders to translate business requirements into scalable data products and analytics solutions.
• Establish best practices for performance optimization, data architecture, and AI-assisted development.
• Mentor teams on modern data architecture and AI-enabled development methodologies.
• Ensure data security, governance, and compliance within enterprise data platforms.

Job Requirements

Bachelor’s degree in Computer Science or equivalent and 5-6 years of related experience; OR Master’s degree and 3-5 years of related experience; OR PhD and 0-3 years of related experience
Deep expertise in enterprise data warehousing, lakehouse architectures, and Cloudera-based data platforms.
Strong experience with CDP, including HDFS, Hive, Impala, Kudu, and Cloudera data ingestion and processing frameworks.
Strong understanding of distributed data systems and Hadoop-based architectures.
Advanced SQL skills, including performance tuning and query optimization.
Proficiency in Python and data engineering frameworks.
Experience with dimensional and normalized data modeling.
Strong understanding of data governance, lineage, metadata management, and enterprise security.
Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production environments.
Familiarity with AI-assisted development tools (e.g., GitHub Copilot and LLM-powered workflows).
Strong communication, stakeholder management, and problem-solving skills.
Ability to align enterprise data architecture with business objectives in Finance, Sales, and Revenue Operations.
Ability to bridge traditional data platforms with modern AI capabilities.

Benefits

Generous PTO Policy
Support work-life balance with Unplugged Days
Flexible WFH Policy
Mental & Physical Wellness programs
Phone and Internet Reimbursement program
Access to Continued Career Development
Comprehensive Benefits and Competitive Packages
Paid Volunteer Time
Employee Resource Groups

More Data Engineer Jobs

Senior Data Engineer

OneSignal

Data Engineer | 58 days ago

Full Time | Remote | Team 51-200

Role Description

The operations team is a highly strategic and analytical team that helps guide and implement strategic initiatives across the company. We are looking for a Data & Analytics specialist to connect and organize our data across the company to drive visibility into performance and strategy across our sales, marketing, product, operations, finance, and engineering efforts. The ideal candidate will have some background in data engineering.

- Design, build, and maintain business-critical data and distributed systems that will provide real-time and reliable data to all of our go-to-market tools and internal users.
- Connect our production backend/data to business systems including Salesforce, Marketo, Intercom, Google Analytics, Metabase, etc. This can include working with a data warehouse/data lake, organizing large-scale data (we send 10 billion notifications a day), and building ETLs to business systems.
- Evaluate ways to increase the efficiency of internal data flows and centralize sources of truth.
- Innovate, design, and build data systems, services, and tools using GCP (Google Cloud Platform) that scale with OneSignal’s products and business requirements.
- Work cross-functionally, including with the backend engineering team as well as business teams including operations, product, marketing, sales, customer success, support, finance, etc.
- Analyze the data and create data and business insights to help move the business, and assist and empower teams across the company in making data-related decisions.
- Build data science/machine learning models using internal and external data sources to identify potential new customers, those who are at risk of churn, or those with potential upsell opportunities.
- Work with Airflow, DBT, Presto, and Hightouch, and introduce the latest tools into our technology stack. Potentially, figure out how to incorporate artificial intelligence into our technology stack.

Qualifications

- 6+ years of professional experience in a technical area at a high-growth startup is preferred.
- Proficiency with Python; experience with DBT and Airflow is a plus.
- Self-driven, with the ability to identify problems and implement solutions.
- A combination of technical and business acumen. The ideal candidate would have an understanding of SaaS metrics and growth company infrastructure scaling challenges.
- Strong interpersonal and communication skills and experience working cross-functionally.
- The ideal candidate has had experience growing and managing a smaller but high-functioning team.

Requirements

- The New York and California base salary for this full-time position is between $170,000 and $190,000. Your exact starting salary is determined by a number of factors such as your experience, skills, and qualifications.
- In addition to base salary, we also offer a competitive equity program and comprehensive and inclusive benefits.

Data Engineering Distributed Systems Salesforce Marketo Metabase GCP AI/ML Airflow dbt Presto AI Python

United States

$170K - $190K / year

Senior Data Engineer

Imagemaker

Let’s co-create awesome digital experiences!

Data Engineer | 58 days ago

Full Time | Remote | Team 201-500 | Since 2003 | H1B No Sponsor

• Business Context: Support the LATAM Airlines Sustainability team in evolving its data platform. The project consists of integrating and modeling various corporate sources together with the Sustainability Database, enabling reliable, traceable, and scalable information for analysis, reporting, and decision-making.

Cloud Google Cloud Platform Grafana Python SQL

Colombia

Job Closed

Data Lead (Defense)

Air Space Intelligence

A software-first aerospace company

Data Engineer | 58 days ago

Full Time | Remote | Team 51-200 | Since 2018 | H1B No Sponsor

About Air Space Intelligence

ASI's mission-critical technology powers decision-making across aviation, defense, energy, and other critical infrastructure domains. Backed by top-tier investors including Andreessen Horowitz, Spark Capital, and Renegade Partners, ASI delivers operational decision superiority—compressing days of analysis into seconds of action. ASI is leading the way and pushing the boundaries of what’s possible.

What You Will Do:

You will own the reality layer that powers ASI's defense products. You will operate, improve, and expand the data flows that move operational data from mission systems into usable, trusted datasets for decision-making. This role is equal parts data engineering, data forensics, and technical communication. You will dig into real-world data, determine what it actually means, and translate findings into clear, actionable pipelines ready to drive mission-critical decisions for end users.

What We Value:

- Strong fluency with modern data engineering tooling and patterns: streaming and batch pipelines, schema evolution, data contracts, and lineage.
- Demonstrated ability to debug data: profiling, anomaly detection, reconciling sources, and separating signal from noise while cross-validating.
- Strong technical communication skills: you can explain what the data is doing, what it means, what is broken, why it matters, and what engineering should change.
- Comfort with distributed messaging and processing (Kafka, Flink, Spark, or equivalents) and modern orchestration (Airflow, Dagster, Temporal, or similar).
- Strong grasp of API design and integration patterns (REST, gRPC, GraphQL) and experience working across a range of data formats and wire-level protocols (JSON, XML, Protobuf, and binary protocols like JREAP-C or CMF-B).
- Working knowledge of modern network protocols, firewalls, system level connections, cross-system authentication, and an enthusiasm to get data flowing.
- Familiarity with defense operational data and mission systems, with an appreciation for delayed reporting, inconsistent identifiers, changing semantics, and obscure edge cases.
- Comfort operating in classified network environments (e.g., SIPR, JWICS) and working within accreditation boundaries (IL5/IL6, ATO processes).
- Experience deploying data infrastructure on Kubernetes and across hybrid cloud and on-prem environments.
- A bias for action and distinct aptitude for problem solving in ambiguous environments.
- Active SECRET or TOP SECRET U.S. Security Clearance.

How We Hire:

We look at the interview process not as a screening or test, but rather as an opportunity to simulate what it would look like working together. We build the interview process around you.

Colorado

SAP Data Engineer – Freelancer

Mactores

Mactores is a trusted leader among businesses in providing modern data platform solutions.

Data Engineer | 58 days ago

Contract | Remote | Team 51-200 | Since 2008 | H1B No Sponsor

• Build extraction pipelines from SAP HANA to AWS S3 using SLT, ODP, CDS views, SDI, and native HANA SQLScript, picking the right tool per source and per latency requirement.
• Model raw SAP tables across FI/CO, MM, SD, and adjacent modules into clean, semantically meaningful datasets that the downstream Spark layer and business users can actually use.
• Design and operate delta and CDC patterns so incremental loads stay correct, idempotent, and replayable.
• Write ABAP extractors where standard SAP tooling falls short, and document them so future engineers can change them safely.
• Own the write-back path: load curated data from S3 into SAP BW / BW4HANA and model it for end-user reporting and analytical querying.
• Land data in S3 as Parquet with sane partitioning, schemas, and IAM scoping, and define the contract with the PySpark engineer at the ingestion-to-transformation boundary.
• Embed with a customer team, ship the pipeline to production, and stay close enough through cutover to know it actually runs.

AWS Cloud PySpark Spark

India
