Bluelight is a leading software consultancy dedicated to designing and developing innovative technology that enhances users' lives. With a steadfast commitment to delivering exceptional service to our clients, Bluelight excels in its focus on quality and customer satisfaction. Our mission is not only to create cutting-edge applications but also to foster a collaborative and enriching work environment where each team member can grow and thrive. With a presence across the United States and Central/South America, Bluelight is in an exciting phase of expansion, continually seeking exceptional talent to join its dynamic and diverse community.

Data Engineer (Azure)

Data Engineer, Full Time, Remote, Mid Level, Team 201-500

Location

Latin America and the Caribbean

Posted

7 days ago

Seniority

Mid Level

ETL Data Engineering Python PySpark Azure REST API SQL Azure Key Vault Microsoft SQL Server Git Azure DevOps AI/ML Power BI Tableau

Job Description

Role Description

As an ETL Data Engineer, you will play a critical role in our client’s expanding data engineering team, designing, developing, and maintaining data integration processes primarily using Python (PySpark) and Azure Synapse Analytics to ensure the accuracy and availability of analytical data.

- Develop and maintain ETL data engineering processes using Python (PySpark) within Azure Synapse Analytics Notebooks and/or Azure Synapse Analytics Pipelines.
- Apply your expertise in data warehousing, including star schemas, facts, and dimensions, to design and build effective data storage structures in a Massively Parallel Processing (MPP) SQL Pool.
- Extract data from various sources, including REST APIs, SQL database tables, and CSV files.
- Utilize your deep knowledge of Azure Synapse Analytics to design and optimize data notebooks/pipelines for scalability and performance.
- Contribute to the implementation and understanding of other Data Fabric concepts, such as data lakes, lakehouses, delta lakes, and data cataloging.
- Collaborate with data architects to create data models and schemas that align with business requirements.
- Implement data quality checks and validation processes to maintain data accuracy and consistency.
- Identify and resolve performance bottlenecks and optimize ETL data notebooks/pipelines to meet SLAs.
- Monitor ETL jobs, diagnose issues, and implement solutions to ensure data pipeline reliability.
- Maintain comprehensive documentation of ETL data engineering processes, data flows, and data transformations.
- Work closely with cross-functional teams to understand data requirements and provide support for data-related initiatives.
- Ensure data security and compliance with data governance and privacy standards.

Qualifications

- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent work experience. Certifications related to data engineering or data science (e.g. Azure Data Engineer) are a plus.
- Proven experience in ETL data engineering with significant expertise in using Python (PySpark) to perform data extraction, transformation, and loading from REST APIs, SQL database tables, and CSV files.
- Proficiency in using Azure Synapse Analytics resources including Notebooks, Pipelines, Linked Services, and Azure Key Vault.
- Demonstrated ability to write complex SQL queries, optimize query performance, and work with both SparkSQL and MS SQL.
- Knowledge of data integration best practices and tools.
- Experience with version control systems, such as Git (Azure DevOps).
- Strong problem-solving and analytical skills, with keen attention to detail.
- Excellent communication skills, both verbal and written, with the ability to work collaboratively in a team environment with shifting priorities.
- Familiarity with big data technologies, machine learning, and data analysis preferred.
- Experience with data visualization tools (e.g. Power BI, Tableau) and Agile methodologies a plus.

Benefits

- Your contributions are highly valued by clients, and the work you do often has a direct and significant impact on their business.
- You will have the opportunity to work on a variety of projects for our incredible clients, which will accelerate your career growth.
- You’ll work with modern technologies alongside some of the best professionals in the industry!

More Data Engineer Jobs

Unit Lead – Data Engineer

evoila

We get IT done, no bulls#!t.

Data Engineer, 7 days ago

Full Time, Remote, Team 201-500, Since 2009, H1B No Sponsor

• Functional and disciplinary leadership of the data engineering team
• Building, designing, and further developing modern data platforms and data lakehouse architectures (e.g., based on Databricks)
• Strategic consulting for our customers on introducing and scaling data-driven platforms across a range of industries
• Developing and optimizing ETL/ELT pipelines for processing large data volumes in batch and streaming scenarios
• Ensuring quality, best practices, and technical excellence within the team, from code reviews and architecture decisions to standards for DataOps and CI/CD
• Actively shaping the unit strategy and portfolio, and supporting go-to-market and pre-sales activities

AWS Azure Cloud ETL Google Cloud Platform Python Scala Spark SQL Terraform Go

Germany

Job Closed

Data Engineer

AIS (Applied Information Sciences)

A Partner That Brings Enterprise Cloud Transformation Full Circle

Data Engineer, 7 days ago

Full Time, Remote, Team 501-1,000, Since 1982, H1B No Sponsor

• Design, build, and maintain scalable batch and near-real-time data pipelines using cloud-native services
• Develop and optimize data ingestion, transformation, and orchestration workflows across diverse data sources
• Build and maintain ELT/ETL frameworks to support analytics, reporting, and data science use cases
• Prepare, transform, and curate raw data into analytics-ready datasets for both technical and non-technical stakeholders
• Develop, deploy, and operate data products within Azure-based analytics platforms (e.g., Databricks, Synapse, Fabric)
• Implement data quality checks, monitoring, and observability to ensure data accuracy, reliability, and integrity
• Apply data governance, security, and privacy controls aligned with enterprise and regulatory standards
• Monitor data platform performance and proactively implement cost and performance optimizations
• Partner with data scientists, analysts, and analytics engineers to ensure trusted and timely access to data
• Design data solutions that are scalable, reusable, automated, and well-governed by default

Airflow Apache Azure Cloud ETL PySpark Python Spark SQL

Virginia

$111K - $204K / year

Job Closed

Software Data Engineer, Data Platform

Augury

Founded in 2012, Augury is a computer software and technology company that connects smartphones with ultrasonic sensors and vibrations to detect machine malfunctions before they oc

Data Engineer, 7 days ago

Full Time, Remote, Team 203, Since 2011

Our mission is to transform how people and machines work together to push the boundaries of human productivity. A leader in Industrial AI, Augury helps the world’s manufacturers leverage real-time production insights to drive new levels of efficiency. Combining predictive and prescriptive AI technology with industry expertise, production teams can proactively address alerts, minimize downtime, reduce asset costs, and maximize yield and capacity. Our customers achieve payback in six months or less, enabling global scale. We're looking for team members excited to partner with the world's manufacturers and build the future of production together.

You are a Software Data Engineer with deep experience building data-intensive systems, not a traditional ETL or BI-focused Data Engineer. In this role, you will design and build production-grade data services, platforms, and pipelines that power DIH and our AI-driven products. You will combine strong software engineering fundamentals with modern data engineering practices, with a focus on clean architecture, reliability, scalability, observability, and testing.

As a Software Data Engineer, Data Platform, you will:

- Build and evolve Python-based services and pipelines that ingest raw industrial events, store them reliably, and expose clean, well-modeled tables and APIs for downstream consumers, including Digital Twin, Smart Canvas, AI agents, and analytics.
- Design systems that handle duplicates, invalid data, late-arriving events, and reprocessing in a principled, incremental, and reproducible manner.
- Collaborate with platform, machine learning, and product teams across Israel and globally to transform complex data challenges into robust, observable, and scalable software solutions.

A Day in Your Life

Production Data Systems & Pipelines

- Design and implement end-to-end data flows, from raw event ingestion into durable storage to modeled datasets and aggregates that power products, Digital Twin capabilities, analytics, and AI agents.
- Build idempotent pipelines that can safely re-run without corrupting data, using deterministic keys and clearly defined contracts between raw, curated, and modeled datasets.
- Implement incremental aggregations (e.g., machine signal summaries, production metrics, and operational KPIs) that correctly account for late-arriving data, watermarking strategies, and reproducibility requirements.
- Model relationships and context across machines, lines, factories, sensors, work orders, and operational events to support context-aware applications, knowledge graphs, and AI agents.
- Partner with platform teams to define how datasets are stored within our lakehouse, Digital Twin, and context graph architectures and exposed through well-defined APIs and tools.

Software Engineering & Data Quality

- Write clean, maintainable Python services with clear separation of concerns across ingestion, validation, transformation, persistence, aggregation, and orchestration layers.
- Apply strong data modeling and SQL fundamentals, including schema design, indexing strategies, event-time semantics, and scalable aggregation patterns.
- Drive testing discipline across the data platform, including unit tests, data-quality tests, integration tests, and validation frameworks.
- Design for observability through metrics, logging, tracing, and monitoring that simplify debugging, improve data quality visibility, and support production operations.
- Troubleshoot and resolve production data issues, including incorrect aggregations, missing data, duplicate records, schema evolution challenges, and backfill operations.

Streaming, Lakehouse & Scalability

- Build and evolve systems that scale from local development environments to cloud-scale lakehouse architectures using technologies such as Databricks, Delta Lake, and Spark.
- Design and implement data pipelines following modern lakehouse patterns, including Bronze, Silver, and Gold layers, partitioning strategies, and cost-efficient compute utilization.
- Work with streaming and messaging platforms (Kafka, Pub/Sub, or similar) to build reliable, idempotent consumers, replay capabilities, and reprocessing workflows.
- Contribute to multi-tenant data architectures, data contracts, and governance practices that enable secure and efficient access to customer data at scale.

Collaboration & AI-Native Experiences

- Work closely with DIH, Smart Canvas, and AI teams to define how agents interact with structured data, context graphs, APIs, and tools in deterministic and reliable ways.
- Translate product requirements and user needs into technical designs that balance correctness, performance, latency, cost, and long-term maintainability.
- Participate in architecture reviews, design discussions, code reviews, and collaborative development practices that raise the overall engineering bar across the organization.
- Help shape the future of AI-native experiences by building the data foundations that power intelligent applications and agentic workflows.

What You Bring

- Bachelor's degree in Computer Science, Software Engineering, Data Engineering, Information Systems, or a related engineering discipline, or equivalent practical experience.
- 5+ years of professional software engineering experience, including substantial experience building backend systems, distributed systems, or data-intensive applications in production environments.
- Strong Python engineering skills, including modular architecture, dependency management, testing practices, observability, and production-grade code quality.
- Strong SQL and data modeling expertise, including schema design, indexing strategies, event-driven data models, and scalable analytical aggregations.
- Hands-on experience building incremental and idempotent data pipelines that handle duplicate, invalid, and late-arriving events without impacting downstream consumers.
- Experience with at least one major cloud platform (Azure, GCP, or AWS) and modern lakehouse technologies such as Databricks, Delta Lake, Spark, or equivalent architectures.
- Experience with streaming or messaging technologies such as Kafka, Pub/Sub, Event Hubs, or similar event-driven systems.
- Proven ability to diagnose and resolve production data issues, including data quality problems, schema evolution, backfills, replay scenarios, and performance bottlenecks.
- Strong written and verbal communication skills in English and experience collaborating effectively with globally distributed teams.

Nice to Have

- Experience building industrial, IoT, manufacturing, or operational data platforms.
- Familiarity with Digital Twin architectures and industrial data models.
- Experience with graph databases, context graphs, knowledge graphs, or relationship-centric data modeling.
- Exposure to AI/LLM-powered applications, including retrieval-augmented generation (RAG), agents, tool calling, or evaluation frameworks.
- Experience working with Databricks or similar lakehouse platforms from both application and platform perspectives.
- Experience building data products that directly support AI agents, intelligent applications, or machine learning workflows.

Perks

- Stock options
- Paid parental leave
- Flex PTO

Augury is a people-first organization. We believe in fostering an inclusive environment in which employees feel encouraged to share their unique perspectives, leverage their strengths, and act authentically. We know that diverse teams are strong teams, and we welcome those from all backgrounds and varying experiences.

We are committed to providing employees with a work environment free of discrimination and harassment. We believe that diversity is more than just good intentions, and we are committed to creating an inclusive environment for all employees. Augury is a proud equal opportunity employer. We strive to create a work environment in which everyone (applicants, employees, customers, guests, and vendors) feels safe and comfortable. We are committed to maintaining a workplace that is free of any type of harassment, and we do not tolerate anyone intimidating, humiliating, or hurting others. We prohibit willful discrimination based on age, gender, ethnicity, race, color, religion, political opinions, sexual orientation, sexual identity or expression, military or veteran status, disability, or any other characteristic protected by law.

Israel

AI & Data Engineer

Enroute

We deliver IT services and solutions through a team of passionate, highly skilled problem solvers.

Data Engineer, 7 days ago

Full Time, Remote, Team 51-200, H1B Sponsor

Role Description

We are seeking a data-driven AI Engineer to join our team at a high-growth advertising technology company. This role focuses on scaling our reporting infrastructure for advertising performance and billing reconciliation, ensuring that financial and operational data is accurate, automated, and actionable.

- Develop robust data pipelines, ensuring data quality and reliability.
- Enable efficient data consumption across the organization.
- Collaborate closely with cross-functional teams including Product, Engineering, Analytics, and Business stakeholders to deliver high-impact data platforms.

The ideal candidate is a proactive problem-solver with strong technical expertise, capable of working with large datasets, modern data architectures, and cloud-based environments. You thrive in fast-paced settings, navigate ambiguity with confidence, and are passionate about turning data into actionable value.

Qualifications

Databricks & AI Architecture (Must-Have)

- Strong experience working with Databricks Lakehouse architecture.
- Expertise in Databricks Mosaic AI and Unity Catalog for governing AI assets is a nice-to-have.
- Hands-on experience building RAG (Retrieval-Augmented Generation) pipelines using Vector Search.

SQL & Data Modeling (Must-Have)

- Advanced SQL development.

AI Engineering & Data Workflows (Must-Have)

- Experience integrating LLM APIs (OpenAI, Anthropic, etc.) into data workflows.
- Hands-on experience using AI for:
  - Data enrichment
  - Anomaly detection
  - Automated classification
- Experience with LangChain, LlamaIndex, or similar frameworks.
- Exposure to Model Context Protocol (MCP) or similar approaches to connect AI models with external tools and data sources.
- Strong understanding of Tool Calling / Function Calling: enabling LLMs to interact with SQL databases and external APIs securely.
- Experience in Prompt Engineering and Guardrailing: designing system prompts that maintain context and hierarchy (e.g., understanding team associations).

Platform & Engineering Practices (Nice-to-Have / Medium)

- Experience with GitHub workflows.
- Familiarity with CI/CD pipelines (Jenkins or similar).
- Experience working with YAML/YML configuration files.

Requirements

- Architect AI Agents: Build and deploy agents that can perform NLP-based data generation, automated data enrichment, and complex data reasoning within Databricks.
- Natural Language Interfaces: Develop "Chat with your Data" features, allowing stakeholders to query the data warehouse using natural language.
- Integrate LLMs into data workflows for automation and intelligence.
- Develop scalable data models to support analytics and AI use cases.
- Implement AI-driven enhancements such as anomaly detection and data enrichment.
- Collaborate with data, analytics, and engineering teams to improve data reliability.
- Optimize performance and scalability of data and AI workflows.
- Support automation through CI/CD practices.
- Ensure data quality, traceability, and maintainability across pipelines.

Benefits

- Monetary compensation
- Year-end Bonus
- IMSS, AFORE, INFONAVIT
- Major Medical Expenses Insurance
- Life Insurance
- Funeral Expenses Coverage
- TDU Membership
- MediAccess
- Health Check-Up Subsidy
- Preferential rates for car insurance
- Vacations
- Official Mexican Holidays
- Life Happens Days
- Bereavement Leave
- Civil Marriage Leave
- English Classes
- Certifications
- Educational Agreements (Talisis, U-ERRE, UNID, TecMilenio, Tec de Monterrey, UDEM, SPIS)
- Corporate Agreements & Discounts (Sorteos Tec, Envia Flores, TopGolf)
- Taquitos Rewards
- Birthday Bonus
- Work-from-home Bonus
- Laptop Policy

AI Databricks Unity SQL LLM OpenAI API LangChain LlamaIndex GitHub CI/CD Jenkins AI Agents

Mexico
