Blend360

Optimizing business performance through people, data, tech & analytics

Data Quality Engineer

Data Engineer · Full Time · Remote · Mid Level · Team 501-1,000 · H1B Sponsor

Location

Mexico

Posted

82 days ago

Seniority

Mid Level

Azure, Databricks, Data Engineering, ETL, SQL, Observability/Monitoring, AWS, Snowflake, AI

Job Description

Role Description

We are looking for a Data Quality Engineer with strong experience in Azure and Databricks to ensure data quality, reliability, and consistency across modern data platforms. This role focuses on validating data pipelines, implementing automated quality checks, and collaborating closely with Data Engineering and business teams to guarantee accurate and production-ready data assets.

- Design and implement a data quality framework across Bronze, Silver, and Gold layers, defining validation rules, threshold tolerances, and alerting standards.
- Build and maintain automated data quality checks within Databricks pipelines: row counts, null checks, referential integrity, schema validation, and business rule assertions.
- Own reconciliation between source systems and Databricks layers, ensuring source data lands accurately and transformations produce expected outputs.
- Validate identity resolution outputs in the Silver layer: reviewing match rates, investigating false positives and false negatives, and ensuring enterprise identifiers are being assigned correctly across source populations.
- Perform end-to-end pipeline testing, validating that data flows correctly from ingestion through to the Gold layer and that downstream reporting outputs reflect accurate data.
- Partner with Data Engineers to define acceptance criteria for each sprint’s pipeline and data model deliverables before they are promoted to production.
- Support UAT with client business stakeholders, helping them validate that Gold layer outputs meet their reporting requirements.
- Document all QA processes, test results, and data quality findings in a format that can be handed off to the client team at engagement close.
- Monitor pipeline health post-deployment, investigating and triaging data quality incidents and working with engineers to resolve root causes quickly.

Qualifications

- Experience working with Azure-based data platforms, including Databricks.
- Strong understanding of data quality frameworks and testing methodologies for data pipelines.
- Experience validating ETL/ELT processes and working with layered architectures (Bronze, Silver, Gold).
- Strong SQL skills and experience analyzing large datasets.
- Experience implementing automated data validation and reconciliation processes.
- Familiarity with data pipeline monitoring, alerting, and troubleshooting.
- Ability to collaborate with Data Engineers and business stakeholders.
- Strong analytical thinking and attention to detail.
- Experience documenting QA processes and results in a structured manner.
- English: Advanced (required for effective communication with global teams).

Requirements

- 3+ years of experience in Data Engineering or Data Quality roles.

Benefits

- Learning Opportunities:
  - Certifications in AWS (we are AWS Partners), Databricks, and Snowflake.
  - Access to AI learning paths to stay up to date with the latest technologies.
  - Study plans, courses, and additional certifications tailored to your role.
  - Access to Udemy Business, offering thousands of courses to boost your technical and soft skills.
  - English lessons to support your professional communication.
  - Travel opportunities to attend industry conferences and meet clients.
- Mentoring and Development:
  - Career development plans and mentorship programs to help shape your path.
- Celebrations & Support:
  - Special day rewards to celebrate birthdays, work anniversaries, and other personal milestones.
  - Company-provided equipment.
  - Flexible working options to help you strike the right balance.

Other benefits may vary according to your location in LATAM. For detailed information regarding the benefits applicable to your specific location, please consult with one of our recruiters.

More Data Engineer Jobs

AI Data Platform Lead

Agiloft

The global standard in no-code contract lifecycle management (CLM) software.

Data Engineer · 82 days ago

Full Time · Remote · Team 201-500 · Since 1991 · H1B Sponsor

• Own the end-to-end data architecture for the Data Warehouse Foundation, designing for AI-first consumption across GPT assistants, AI agents, predictive models, and operational intelligence, in addition to BI and reporting.
• Lead data modeling across all 11 departments, designing canonical enterprise data models that serve cross-functional AI and analytics use cases without duplication or fragmentation.
• Design and implement the contextual intelligence layer (including RAG architecture, vector store strategy, knowledge base ingestion pipelines, and document and unstructured data processing) that powers Agiloft's enterprise knowledge system.
• Build and maintain the agentic data integration layer: real-time and near-real-time data access patterns, agent memory and state persistence design, orchestration data requirements, and agent output integration back into the warehouse.
• Own the AI/ML feature layer (feature engineering strategy and standards, training data pipeline design, feature store architecture, and model output integration), enabling predictive analytics across churn, pipeline health, and operational forecasting.
• Design and govern the operational data and GPT context layer, including structured context feed design for GPT assistants, data freshness and access SLAs for AI use cases, and cross-departmental data reuse standards.
• Lead the Data Warehouse Foundation build in partnership with the external consulting team: setting architecture standards, reviewing implementation against AI-first principles, and ensuring the five-wave build plan delivers a foundation that serves the full intelligence architecture.
• Design and manage data ingestion, ELT/ETL, and orchestration pipelines across all source systems, ensuring reliability, performance, and cost efficiency.
• Establish and enforce AI data engineering standards across the organization: prompt-adjacent data design, agent data access patterns, reusable pipeline components, and quality assurance processes.
• Own data access policy design and least-privilege access controls in partnership with Security, ensuring data made available to AI systems is governed, auditable, and compliant.
• Define data quality standards and monitoring processes for AI-consumed data, where quality failures have direct impact on model and agent performance.
• Partner with the Principal Data and Integrations Architect on infrastructure design, ensuring data modeling and AI consumption requirements are incorporated into pipeline and architecture decisions from the start, not retrofitted after build.
• Partner with the VP FP&A and Manager of BI & Data to ensure the semantic and metrics layers are technically sound and serve both AI use cases and reporting requirements.
• Manage the AI Ops data architecture roadmap, translating business and AI use case requirements from all 11 departments into sequenced, prioritized technical work.
• Maintain documentation and knowledge transfer standards for all data architecture, pipelines, and integration patterns, ensuring AI Ops-built infrastructure is reusable, auditable, and not dependent on any single individual.
• Collaborate with the AI Agent Engineer and GPT & AI Systems Lead to ensure data infrastructure supports agent orchestration, retrieval-augmented generation, and multi-step reasoning workflows.
• Define the roadmap for data science and AI data work in partnership with the VP of AI Operations. This role does not take direction from IT on resource allocation or prioritization. All roadmapping is managed within AI Operations.
• Evaluate and recommend data tooling, frameworks, and platform components in alignment with AI Ops' technology-agnostic, build-for-leverage approach.
• Other duties as assigned.

Airflow, AWS, Cloud, ETL, Python, SQL

United States

Job Closed

Senior Data Engineer

Intelligent Medical Objects (IMO)

Data Engineer · 82 days ago

Full Time · Remote · Team 201-500 · Since 1993 · H1B No Sponsor

• Build and operate production-grade data platforms that support IMO’s terminology-driven products, analytics, and machine learning use cases
• Design, develop, and maintain data pipelines for batch and incremental processing using modern lakehouse and cloud-native patterns
• Work extensively with cloud data platforms (AWS + Databricks) to ingest, transform, and serve structured and semi-structured data at scale
• Model data intentionally, developing well-documented, analytics- and product-ready data models that balance usability, performance, and correctness
• Apply strong software engineering practices to data work, including version control, testing, CI/CD, and infrastructure-as-code
• Collaborate directly with product, analytics, and AI teams to translate requirements into scalable technical solutions
• Improve reliability, performance, and cost-efficiency of data systems through monitoring, observability, and continuous optimization
• Design for data quality and trust, implementing automated checks, validation frameworks, and lineage-aware workflows
• Contribute to platform evolution, helping shape standards around orchestration, data modeling, environments, and deployment
• Operate in an Agile environment, taking ownership of deliverables and proactively identifying risks and opportunities
• Mentor and support other engineers, leading by example in code quality, problem decomposition, and technical decision-making
• Continuously learn and apply industry best practices in data engineering, analytics engineering, and AI data foundations

Airflow, AWS, Cloud, EC2, Python, Spark, SQL, Terraform

Illinois

$130K - $180K / year

Job Closed

Data Engineer

The Phia Group

The Phia Group is a service-oriented organization assisting employee health plans nationwide. We provide our clients with innovative cost-cutting solutions and constantly expanding service offerings. We continue to enjoy growth thanks to our most valuable resource – our talented and committed team. At The Phia Group, whose mission is to provide high quality yet affordable healthcare to American employees and their families, you can look forward to not only unparalleled benefits for yourself but also being immersed in a company that was named one of USA Today’s Top Workplaces for 2026. We have also been recognized by The Boston Globe and Louisville Business First for our commitment to inclusivity, enjoyment, and empathy for our valued employees.

Data Engineer · 82 days ago

Full Time · Remote

Data Engineer (Remote)

Department: IT
Location: Canton, MA

The Phia Group is a service-oriented organization assisting employee health plans nationwide. We provide our clients with innovative cost-cutting solutions and constantly expanding service offerings. We continue to enjoy growth thanks to our most valuable resource – our talented and committed team. At The Phia Group, whose mission is to provide high quality yet affordable healthcare to American employees and their families, you can look forward to not only unparalleled benefits for yourself but also being immersed in a company that was named one of USA Today’s Top Workplaces for 2026. Meanwhile, from a regional perspective, both The Boston Globe and Louisville Business First also recognized our unwavering commitment to upholding an internal culture of inclusivity, enjoyment, and empathy for our valued employees by listing The Phia Group in their respective lists for the Top Places to Work in 2026.

The Data Engineer is responsible for supporting the development, maintenance, and optimization of data pipelines and analytics-ready datasets. You will be collaborating across multiple teams and stakeholders to solve complex problems and support data-driven initiatives.

Essential Duties and Responsibilities

Essential duties and responsibilities include the following; other duties may be assigned:

- Build, maintain, and optimize data pipelines utilizing Azure Data Factory, ensuring data is ingested, transformed, and delivered to Snowflake reliably for analytics
- Implement monitoring, alerts, and testing of data pipeline performance, data quality metrics, and lineage to ensure trustworthy data delivery
- Troubleshoot data issues and perform root cause analysis to proactively resolve operational issues
- Document data structures, processes, architectural decisions, and best practices for knowledge sharing
- Develop, maintain, and optimize Snowflake objects (schemas, tables, views) and SQL transformations to produce curated, analytics-ready datasets
- Collaborate with analysts, stakeholders, and product owners to translate business needs into data requirements and stable technical implementations
- Enable data for AI/ML use cases by preparing feature-rich datasets, supporting feature engineering, and ensuring data consistency for model training and inference
- Support deployment and operationalization of machine learning models by integrating pipelines with ML workflows (e.g., batch/real-time scoring)
- Continually improve ongoing reporting and analytics, automating or simplifying self-service or manual processes
- Implement version control practices for all data engineering code and documentation

Experience and Qualifications

- Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or a related field; or equivalent experience
- 5+ years of experience in data engineering or business intelligence roles working with ETL, data modeling, data architecture, and developing pipelines and applications for analytics (e.g., BI, reporting, machine learning, deep learning)
- Solid programming skills in advanced SQL, Python, or other programming languages for data processing and automation
- Experience supporting or working with AI/ML workflows, including:
  - Data preparation and feature engineering for machine learning models
  - Integration of data pipelines with ML frameworks (e.g., scikit-learn, TensorFlow, PyTorch, or similar)
  - Understanding of model lifecycle concepts (training, validation, deployment, monitoring)
- Expertise working with Snowflake for data warehousing, including experience with schema design, performance tuning, and optimization
- Proficiency with Git, Azure DevOps, and collaborative development best practices
- Experience designing, developing, and deploying end-to-end pipelines using Azure Data Factory

Working Conditions / Physical Demands

Sitting at a workstation for prolonged periods of time. Extensive computer work. Workstation may be exposed to overhead fluorescent lighting and air conditioning. Fast-paced work environment. Operates office equipment including personal computer, copiers, and fax machines.

This job description is not intended to be and should not be construed as an all-inclusive list of all the responsibilities, skills or working conditions associated with the position. While it is intended to accurately reflect the position activities and requirements, the company reserves the right to modify, add or remove duties and assign other duties as necessary. External and internal applicants, as well as position incumbents who become disabled as defined under the Americans with Disabilities Act, must be able to perform the essential job functions (as listed here) either unaided or with the assistance of a reasonable accommodation to be determined by management on a case-by-case basis.

Azure, Snowflake, Observability/Monitoring, SQL, AI/ML, Data Engineering, ETL, Python, scikit-learn, TensorFlow, PyTorch, Git, Azure DevOps

Massachusetts

Full Stack Data Engineer

Anbre Interiors | Top Interior Designers in Chennai & Luxury Interiors

Interiors That Inspire – Chennai's Best Home Interior Design Experts.

Data Engineer · 82 days ago

Full Time · Remote · Team 51-200 · Since 2002 · H1B No Sponsor

• Design, build, and maintain data pipelines using GCP tools (BigQuery, Cloud Functions, Cloud Composer, Cloud Scheduler, Apache Beam, Airflow).
• Clean, transform, and organize data from multiple sources.
• Automate ETL/ELT workflows for reliability and scalability.
• Support ingestion from APIs, spreadsheets, and internal systems.
• Write Python and Bash scripts to process and automate data tasks.
• Develop lightweight backend services and utilities to streamline internal processes.
• Build and update dashboards in Looker Studio and D3.js.
• Deliver clean, intuitive KPI reports for operations and leadership.
• Support visualization needs across the Wellness Division.
• Contribute to simple predictive modeling and forecasting tasks.
• Prepare structured datasets for future machine learning initiatives.

Airflow, Apache, BigQuery, Cloud, D3.js, ETL, Google Cloud Platform, JavaScript, Python, SQL

South Africa

$1.5K - $1.8K / month

Job Closed
