L2 Data Engineer
Location
Worldwide
Posted
17 hours ago
Salary
0
Seniority
Mid Level
Job Description
L2 Data Engineer
DeepSource Technologies
Role Description
We are looking for a highly skilled and experienced L2 Data Engineer to join our growing Data & Analytics team. In this role, you will lead the design, development, optimization, and maintenance of scalable enterprise data platforms and cloud-native data solutions. You will work closely with architects, analysts, and business stakeholders to build high-performance data pipelines and modern lakehouse solutions that support advanced analytics, reporting, and data-driven decision-making.

This opportunity is ideal for a senior data professional with strong hands-on expertise in Databricks and the Microsoft Azure ecosystem, who is passionate about building reliable, scalable, and optimized data platforms in enterprise environments.

Qualifications
- 5+ years of professional experience in Data Engineering or related roles.
- Strong expertise in Python for enterprise data processing, transformation, and automation.
- Advanced hands-on experience with Pandas, PySpark, and Spark SQL for large-scale distributed processing.
- Strong experience with Databricks, including cluster management, notebook development, workflow orchestration, Delta Lake, and performance optimization.
- Extensive experience building and managing enterprise data pipelines using Azure Data Factory.
- Strong working knowledge of Azure Synapse Analytics, particularly Spark pool integration and enterprise data warehousing concepts.
- Advanced SQL skills including query optimization, performance tuning, indexing strategies, and troubleshooting.
- Strong understanding of data lake architecture, Delta Lake, incremental processing, partitioning, and lakehouse concepts.
- Experience implementing data governance, security, access controls, and monitoring within cloud data platforms.
- Experience handling production support, troubleshooting, and optimization of enterprise data platforms.
Responsibilities
- Design, develop, and optimize enterprise-scale data pipelines and ETL/ELT workflows using Azure and Databricks technologies.
- Architect and implement scalable data ingestion, transformation, and orchestration processes using Azure Data Factory, Databricks, and Azure Synapse Analytics.
- Develop high-performance data transformation frameworks using PySpark, Python, and Spark SQL for large-scale distributed data processing.
- Optimize SQL queries, Spark jobs, and data workflows to improve performance, scalability, and cost efficiency.
- Lead data migration initiatives, including SQL Server migrations and modernization of legacy data platforms.
- Implement and maintain Delta Lake architecture, incremental data loading strategies, and enterprise data lake best practices.
- Collaborate with architects and cross-functional teams to design robust and scalable data models aligned with business and governance standards.
- Monitor and troubleshoot production pipelines, perform root-cause analysis, and implement preventive measures for recurring issues.
- Support CI/CD implementation and infrastructure automation for data engineering workflows.
- Mentor junior engineers and contribute to engineering standards, reusable frameworks, and technical best practices.
- Create and maintain technical documentation including architecture diagrams, pipeline documentation, and operational runbooks.
- Evaluate and recommend modern data engineering tools, frameworks, and optimization strategies.

Preferred Qualifications
- Experience with Terraform for Azure infrastructure provisioning and Infrastructure-as-Code (IaC).
- Experience implementing CI/CD pipelines for data engineering deployments.
- Exposure to Lakehouse Federation, Delta Sharing, and modern data sharing architectures.
- Experience with streaming and near real-time data processing solutions.
- Knowledge of DevOps practices and cloud cost optimization strategies.

Certification Requirement
Candidates are expected to hold or be actively working toward the Databricks Certified Data Engineer Professional certification. This certification validates advanced expertise across the following domains:
- Advanced ETL and ELT development using Spark SQL and PySpark
- Enterprise-grade pipeline orchestration and optimization
- Data modeling and scalable lakehouse architecture
- Performance tuning and distributed data processing optimization
- Advanced data governance and security implementation
- Production-grade data engineering practices within the Databricks ecosystem
More Data Engineer Jobs
Generative AI Engineer - Document Processing of Sensitive Data
NEORIS
NEORIS is a Digital Accelerator that helps companies step into the future.
Role Description
At EPAM NEORIS, we believe that real change is born of talent. Today, as part of EPAM Systems, we are expanding our global reach and our capabilities while preserving what matters most: a culture where every person can grow, contribute, and play a leading role. We are not just looking to fill positions. We are looking for talent that wants to push itself, learn constantly, and leave a mark on high-impact projects. We are seeking profiles specialized in Generative AI and document processing to join a strategic, innovative project in the financial sector.

Main responsibilities:
- Design and implementation of solutions based on Generative AI and RAG architectures.
- Large-scale processing and structuring of complex documents.
- Definition, maintenance, and optimization of vector databases.
- Development and optimization of ETL pipelines for information processing.
- Implementation of chunking, prompting, and reranking strategies.
- Participation in the technical evolution and continuous improvement of AI solutions in production.
- Collaboration with multidisciplinary teams in agile environments.

Qualifications
Required:
- Hands-on experience handling and processing large volumes of documents.
- Experience in the definition, implementation, and lifecycle/maintenance of vector databases (RAG).
- Knowledge of handling sensitive information.
- Experience with ETL processes.
- Experience optimizing chunking and prompting.
- Knowledge of reranker techniques.
Desirable:
- Knowledge of caching strategies and resource optimization.
- Knowledge of the main AI frameworks.
- Experience fine-tuning LLMs or SLMs.
- Knowledge of graph-based databases (GraphRAG, KAG, etc.).
- Experience with Spark.

Benefits
- Permanent contract with a competitive salary.
- 100% remote work.
- Personalized career plan and continuous training.
- Participation in stable projects with a strong technical and innovative component.
- Flexible hours and a focus on work-life balance.
- Social benefits tailored to your needs.
Role Description
We are looking for a Middle Data Engineer specialized in Azure Databricks to join our data platform team. The candidate will design and develop modern data pipelines and Lakehouse architectures using Azure Databricks, Spark, and Azure Data Factory, while integrating with existing SQL Server-based data warehouse environments. The role also involves evolving our data platform towards scalable, cloud-based data architectures that enable advanced analytics and business intelligence.
- Design, develop, and maintain data pipelines using Azure Databricks
- Build and optimize data transformations using PySpark and SQL in Databricks
- Implement and maintain Lakehouse architectures using Delta Lake
- Develop ETL/ELT pipelines orchestrated through Azure Data Factory
- Integrate data from multiple sources into the data platform and analytical layers
- Design and maintain data models and data warehouse structures for analytics
- Ensure data quality, scalability, and performance of large-scale data processing pipelines
- Collaborate with BI teams to support Power BI and reporting platforms
- Support and evolve existing SQL Server data platforms and ETL solutions (SSIS) when required
- Contribute to the design of modern cloud-based data architectures

Qualifications
- 3+ years of experience in Data Engineering or Data Warehouse development
- Strong experience with Azure Databricks
- Experience developing data pipelines using PySpark and Spark SQL
- Solid understanding of distributed data processing and big data concepts
- Experience working with Delta Lake and Lakehouse architectures
- Strong SQL skills and experience with SQL Server relational databases
- Experience building data pipelines using Azure Data Factory
- Experience handling large datasets and performance optimization

Requirements
- Experience with Spark optimization techniques (partitioning, caching, cluster tuning)
- Experience with structured streaming in Databricks
- Knowledge of CI/CD pipelines for data platforms (Azure DevOps)
- Familiarity with Power BI
- Experience in migrating from traditional ETL processes to cloud architectures

Benefits
- Culture of Relentless Performance: join an unstoppable technology development team with a 99% project success rate and more than 30% year-over-year revenue growth.
- Competitive Pay and Benefits: enjoy a comprehensive compensation and benefits package, including health insurance, and a relocation program.
- Work From Anywhere Culture: make the most of the flexibility that comes with remote work.
- Growth Mindset: reap the benefits of a range of professional development opportunities, including certification programs, mentorship and talent investment programs, internal mobility and internship opportunities.
- Global Impact: collaborate on impactful projects for top global clients and shape the future of industries.
- Welcoming Multicultural Environment: be a part of a dynamic, global team and thrive in an inclusive and supportive work environment with open communication and regular team-building and company social events.
- Social Sustainability Values: join our sustainable business practices focused on five pillars, including IT education, community empowerment, fair operating practices, environmental sustainability, and gender equality.
Technical Data Migration Consultant
RLDatix
RLDatix is an integrated healthcare operations platform combining data-driven insights with advanced technology to improve patient safety, compliance, and workf
Technical Data Migration Consultant
AUS - Melbourne Hybrid
Job Description:
RLDatix (RLD) is on a mission to help raise the standard of care…everywhere. Trusted by over 10,000 healthcare organizations around the world, our solutions help improve health and care. Our applications ensure that patients receive the best and safest care while supporting the providers who deliver it. Joining TeamRLD means being part of a global effort of over 2,000 team members in making a difference in healthcare…every day.

We're searching for a Melbourne-based Technical Data Migration Consultant to join our Data Services Group team, so that we can successfully deliver high-quality healthcare data migration projects for our clients. The Technical Data Migration Consultant will extract, transform, and load healthcare data while collaborating with stakeholders to ensure seamless system transitions and accurate data outcomes.

How You'll Spend Your Time
- Develop, test, and execute data conversions using SQL and ETL techniques to ensure seamless migration between healthcare systems
- Analyse healthcare data in formats such as HL7, CDA, CSV, and SQL in order to support accurate extraction, transformation, and loading
- Create and maintain complex SQL scripts to convert data from multiple source systems to target EHR platforms
- Facilitate stakeholder meetings to gather requirements, provide updates, and ensure successful delivery of data migration projects
- Perform quality assurance checks on data to validate integrity, identify issues, and reconcile converted datasets for accuracy

What Kind of Things We're Most Interested in You Having
- 3+ years' experience working with healthcare data and EHR/EMR systems
- Proven success in delivering data migration or data conversion projects within healthcare environments
- In-depth knowledge of SQL Server, T-SQL scripting, and data transformation processes
- Ability to commute to our Melbourne office on a hybrid basis
- Sincere interest in improving healthcare outcomes through data accuracy and system transformation
- A knack for working collaboratively with stakeholders while managing tasks independently in a fast-paced environment

By enabling flexibility in how we work and prioritising employee wellness, we empower our team to do and be their best. Key benefits include private health and group accident insurance, an Employee Assistance Program (EAP) for confidential support, and Loyalty Awards for long-service employees.

RLDatix is an equal opportunity employer, and our employment decisions are made without regard to race, colour, religion, age, gender, national origin, disability, handicap, marital status or any other status or condition protected by law. We welcome applications from people of all backgrounds and strongly encourage Aboriginal and Torres Strait Islander peoples to apply.

As part of RLDatix's commitment to the inclusion of all qualified individuals, we ensure that persons with disabilities are provided reasonable accommodation in the job application and interview process. If reasonable accommodation is needed to participate in either step, please don't hesitate to send a note to accessibility@rldatix.com.

Salary offers are based on a wide range of factors including location, relevant skills, training, experience, education, and, where applicable, licensure or certifications obtained. Market and organisational factors are also taken into consideration.
Pipelines Operations - Data Engineer
Hanover Insurance Group
Hanover Insurance Group has consistently been named one of America’s Best Midsized Employers by Forbes magazine and one of Business Insurance magazine's Best
Title: PL Ops - Data Engineer
Location: Worcester, MA, USA United States Remote
Req #19719
Job Description:
Our Personal Lines Operations team is currently seeking a Data Engineer II in our Worcester office. This is a full-time, exempt role. Fully remote arrangement will be considered for candidates with strong qualifications.

POSITION OVERVIEW:
The Hanover Insurance Group is hiring a Personal Lines Operations Data Engineer to own the day-to-day reliability and enhancement of our Azure Data Factory (ADF) pipelines and on-prem SQL Server data environment. This role builds and improves ETL workflows that populate SQL tables used by analysts to create Power BI semantic models and reports. The position is split approximately 50/50 between production operations/support and new development/enhancement, while also advancing platform maturity through improved monitoring/alerting, data quality validation, and formalized release practices.

What You'll Own:
- Production support and daily health of scheduled ADF pipelines.
- Build new pipelines and enhance existing pipelines to improve resiliency, maintainability, and scalability.
- Implement data validation controls, improve monitoring/alerts, and help define SLAs for data freshness and availability.
- Establish foundational SDLC practices for data engineering (Git usage, Dev→Prod promotion practices, and a more formal release process).
- Coordinate cross-team dependencies where upstream internal ETL timelines affect downstream pipeline completion; design dependency-aware orchestration and readiness checks.
- Contribute to future-state Azure data strategy recommendations (e.g., Data Lake/Blob Storage, notebooks) and support long-term migration planning from on-prem SQL Server to cloud databases (timeline not yet defined).

KEY RESPONSIBILITIES:
Azure/Fabric Data Factory Engineering
- Design and maintain curated SQL tables/views used for analytics and reporting; optimize for refresh performance and downstream usability.
- Develop and maintain ADF pipelines (Copy activities and Data Flows) with consistent patterns for logging, error handling, retries, and notifications.
- Implement parameterization and reusable components to reduce duplication and speed enhancements.
- Implement incremental load and backfill strategies appropriate to volume.

Production Support & Reliability
- Monitor daily pipeline execution and triage incidents quickly to restore successful processing.
- Perform root-cause analysis for failures and recurring issues; implement preventative controls and standardized patterns.
- Create and maintain operational documentation/runbooks for critical pipelines and common support scenarios.

Data Quality, Validation & Trust
- Build automated validation routines and reconciliation checks (row counts, totals, null/duplicate thresholds, schema drift detection, anomaly flags).
- Partner with analysts and stakeholders to define key business rules and quality thresholds for trusted reporting datasets.
- Document data definitions, transformations, and lineage to improve transparency and troubleshooting.

Stakeholder Collaboration & Dependency Management
- Work directly with internal Operations Data Analysts, business stakeholders, and external partner analysts to gather requirements and deliver datasets that enable robust Power BI semantic models, reporting, and other analytical solutions.
- Coordinate with upstream internal teams to align dependency readiness signals and timelines; implement orchestration controls to prevent downstream failures.
- Provide technical coaching and mentoring to less experienced team members, including the use of best practices.

KEY MEASURES OF SUCCESS:
- Pipelines meeting SLA / on-time delivery
- Pipeline success rate and reduced manual intervention
- Time-to-detect/time-to-resolve ETL failures (MTTD/MTTR)
- Improvements in query/runtime performance and cost efficiency
- Reduction in recurring data quality defects; completeness/accuracy checks passing
- Documentation coverage (runbooks for critical pipelines, data dictionary completeness)

Required Qualifications
- Bachelor's Degree preferred in a related field.
- 6+ years of experience in data engineering, ETL/ELT, or related roles.
- Hands-on experience building and supporting production pipelines in Azure Data Factory.
- Strong SQL/T-SQL skills and experience with SQL Server environments supporting analytics/reporting workloads.
- Advanced data modeling
- Experience integrating data from multiple sources including databases, flat files/SFTP, and APIs/vendor feeds.
- Familiarity with Azure data platform components (Data Lake/Blob Storage, notebooks) and/or cloud migration planning, or similar platforms.
- Strong troubleshooting, root-cause analysis, and operational ownership mindset.
- Ability to work directly with stakeholders to gather requirements and translate them into technical solutions.
- Strong communication and documentation skills: Data engineers are often called to present their findings or translate the data into an understandable document. You will need to write and speak clearly, easily communicating complex ideas.

Preferred Qualifications
- Experience establishing SDLC practices for data teams (Git, CI/CD, release management).
- Experience implementing monitoring/alerting and operational dashboards for data pipelines.
- Python and/or PowerShell for automation, data validation, and operational tooling.
- Insurance or operations/customer service experience
- Familiarity with prompt-based programming and tools (GitHub Copilot)

This job posting provides cursory examples of some of the job duties associated with this position. The examples provided are not complete, and the position may entail other essential and job-related functions and responsibilities that employees will be required to perform.

