Lead Data Engineer / Data Architect

Location

India

Posted

2 days ago

Salary

Not specified

Seniority

Lead

Job Description

Lead Data Engineer / Data Architect

Enable Data Incorporated

Role Description

Enable Data is seeking an experienced Azure Data Architect with strong hands-on engineering expertise to design, develop, optimize, and modernize enterprise-scale data platforms. The role combines solution architecture, technical leadership, and individual contributor responsibilities, requiring deep expertise in Azure data services, data engineering best practices, and large-scale data processing using PySpark, Python, SQL, and Azure Databricks. The ideal candidate will be responsible for analyzing existing data products and pipelines, identifying improvement opportunities, leading refactoring and redesign initiatives, and implementing scalable, reliable, and high-performance data solutions on Azure.

Responsibilities

- Architecture & Solution Design
- Overall 15+ years of experience with 10+ years of experience in Data Engineering and Analytics solutions.
- Design end-to-end data architectures on Microsoft Azure.
- Define data ingestion, transformation, storage, governance, and consumption strategies.
- Create scalable, secure, and cost-effective data solutions aligned with business objectives.
- Establish architectural standards, design patterns, and best practices for data engineering teams.
- Collaborate with business stakeholders, product owners, and technical teams to translate requirements into technical solutions.
- Design, develop, and modernize enterprise data platforms on Azure.
- Analyze, refactor, and redesign existing data pipelines and products while building new scalable data solutions.
- Act as a hands-on individual contributor with technical leadership responsibilities, including mentoring junior developers and driving best practices.
- Strong hands-on experience in PySpark, Python, and SQL.
- Expertise in Azure Databricks and Azure data ecosystem.
- Experience designing and implementing scalable data architectures.
- Ability to analyze, optimize, refactor, and redesign existing data pipelines.
- Develop and maintain ETL/ELT solutions and data products.
- Performance tuning, troubleshooting, and data quality implementation.
- Mentor and guide junior developers and conduct code reviews.
- Collaborate with cross-functional teams to deliver end-to-end data solutions.

Requirements

- Azure Data Lake, Azure Data Factory, Azure Synapse.
- Data warehousing, lakehouse architecture, and cloud migration.
- CI/CD, Git, and Agile development practices.

Role Type

- Senior Individual Contributor with Technical Leadership responsibilities.

More Data Engineer Jobs

Lead Data Engineer – Data Architect

Enable Data

A leading provider of advanced data, application, and cloud engineering services.

Data Engineer | 2 days ago
Full Time | Remote | Team 51-200 | H1B Sponsor

India

Senior Product Manager, Data Platform

Arbital Health

We are a neutral third-party adjudication utility that is accelerating the $1 trillion shift to value-based care

Data Engineer | 2 days ago
Full Time | Remote | Team 1-10 | Since 2023 | H1B No Sponsor

• Own the product roadmap for Arbital’s data pipeline platform, including ingestion, transformation, calculation, validation, audit trail, and AI consumption layers
• Define and prioritize pipeline capabilities based on client needs, implementation learnings, engineering constraints, and long-term platform scalability goals
• Translate complex healthcare data requirements from claims processing to VBC contract logic into structured, buildable product specs
• Partner with leadership to align pipeline investments with Arbital’s broader product and go-to-market strategy
• Write detailed PRDs, user stories, and technical specifications for platform features, configurations, and automation tooling
• Work directly with engineering to scope, sequence, and ship pipeline improvements, balancing speed, quality, and flexibility
• Define acceptance criteria and lead QA processes for new pipeline & platform capabilities, ensuring outputs meet accuracy and performance standards
• Drive platform delivery end-to-end, owning prioritization, cross-team dependencies, and release coordination
• Develop deep fluency in Arbital’s data models, pipeline architecture, and healthcare data standards (claims, eligibility, attribution, CMS/ACO files), and actuarial concepts (IBNR, rebates, contract terms)
• Work hands-on with data (running SQL queries, reviewing pipeline outputs, and validating logic) to inform product decisions and support debugging
• Define standards for data quality, deduplication, business rule configuration, lineage, and pipeline observability across all client environments
• Evaluate and recommend tooling improvements across the platform stack (e.g., Airflow, Databricks, AWS) in partnership with engineering
• Serve as the primary product owner for data capabilities across implementation, engineering, actuarial, and data science teams
• Partner closely with the Implementation team to surface recurring client configuration needs and turn them into scalable platform features
• Collaborate with actuarial and data science teams to ensure pipeline logic correctly supports attribution, aggregation, and actuarial use cases
• Communicate roadmap priorities, tradeoffs, and delivery status clearly to both technical teams and non-technical stakeholders

United States
$170K - $200K / year

Senior Data Warehouse Developer

UR Ventures

Ensuring University of Rochester innovations enrich our community, improve our society, and make the world ever better

Data Engineer | 2 days ago
Full Time | Remote | Team 11-50 | H1B No Sponsor

• Architecting, designing, developing, implementing and supporting various components for the enterprise data warehouse and associated data sources.
• Works independently and provides support for reporting projects, as well as process and architectural recommendations.
• Mentoring staff, demonstrating best practices, planning for business continuity and sustainability and working on projects that support Business Intelligence and Data Warehousing functions as assigned.
• Works with the user community and reporting team to understand business data and data transformation requirements.
• Converts those requirements to functional/technical specifications for database and integration design.
• Manages requests and/or components of larger projects with appropriate documentation and status reports.
• Architects data integration and transformation processes across on premise and cloud data sources.
• Provides leadership and is a subject matter expert depending on area of focus.
• Designs and develops data integration and transformation processes across on-premises and cloud data sources using enterprise tools such as Informatica, Denodo, SQL Server, SSIS, Oracle Data Integrator, and Boomi AtomSphere, selected based on environment and requirements.
• Develops data models and database designs that leverage their knowledge of data warehouse modeling best practices.
• Collaborates with other team members using scrum processes or other project management disciplines depending on area of focus.
• Follows best practices for change and code management.
• Provides supporting technical documentation that is written, organized and maintained.
• Participates in ongoing stabilization, support and maintenance.
• Provides on call support as needed for business continuity and operations.
• Attends webinars, reads case studies and white papers, and participates in opportunities to further refine skills and stay apprised on functional area topics.

New York
$86.5K - $129.7K / year

Senior Data Engineer

IntegriChain

Data-Driven Commercialization

Data Engineer | 2 days ago
Full Time | Remote | Team 501-1,000 | H1B Sponsor

• Help define and mature data integration, data consolidation, MDM integration, and data platform design patterns across IntegriChain.
• Design, build, optimize, and operate Snowflake data models, pipelines, stored procedures, and high-volume data processing patterns.
• Partner with MDM and Product teams to support HCO Master data ingestion, outbound extracts, cross-reference data, golden record consumption, survivorship outputs, and downstream publishing patterns.
• Work with Product, Engineering, MDM, Data Science, DevOps, Security, and business stakeholders to align data solutions to enterprise priorities.
• Use dbt or similar ELT tooling to develop reliable, maintainable, testable, and observable data pipelines.
• Drive Snowflake performance tuning, warehouse sizing, workload management, cost tracking, and cost optimization practices.
• Partner with Data Science leadership to rationalize and consolidate the enterprise data landscape across products, platforms, and acquired capabilities.
• Define reusable data integration patterns for batch, micro-batch, near-real-time, and application-to-application data exchange.
• Collaborate with cross-functional teams to understand business data needs, source-system realities, and enterprise application integration requirements.
• Design scalable patterns for ingesting, transforming, mastering, and publishing data across operational and analytical use cases.
• Help establish standards for data contracts, schema evolution, data quality, lineage, and data ownership.
• Design and build data pipelines that load source data into Reltio MDM and extract mastered outputs from Reltio for downstream Snowflake, analytics, AI, and operational use cases.
• Partner with MDM configuration and Product Management teams to translate HCO mastering requirements into data pipeline, mapping, validation, reconciliation, and publishing patterns.
• Work with Reltio APIs, exports, crosswalks/XREFs, event-based integration patterns, and bulk load/extract mechanisms as needed to support inbound and outbound data flows.
• Engineer integration patterns for HCO Master data, including party/entity, address, identifier, hierarchy, relationship, match/merge, survivorship, and golden record outputs.
• Support source ingestion and reference data integration involving datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, channel outlet data, customer/account data, and other life sciences master/reference sources.
• Develop validation and reconciliation processes to compare source data, Reltio mastered data, Snowflake curated data, and downstream consumption layers.
• Help operationalize MDM outputs for business-facing data products, semantic models, reporting tables, APIs, and AI-ready datasets.
• Design Snowflake database, schema, table, view, and semantic-layer patterns that support performance, governance, and maintainability.
• Optimize Snowflake workloads using clustering, micro-partition awareness, warehouse sizing, query profiling, caching behavior, and workload isolation.
• Implement Snowflake cost tracking and optimization practices, including warehouse utilization monitoring, inefficient query identification, and cost allocation by workload, team, or use case.
• Build scalable SQL and Snowflake stored procedure logic for large-volume data processing and analytical workloads.
• Apply secure Snowflake design patterns including RBAC, masking, access isolation, auditing, and environment separation.
• Design, build, and maintain reliable ELT pipelines using dbt or comparable modern data transformation tooling.
• Develop Python-based automation for API integration, file processing, metadata management, validation, orchestration support, and operational tooling.
• Develop modular, tested, and reusable transformation models for raw, curated, mastered, and business-ready data layers.
• Implement automated data quality checks, source freshness checks, reconciliation, logging, and exception-handling patterns.
• Build orchestration-ready pipelines that support dependency management, restartability, incremental loads, and operational monitoring.
• Collaborate with DevOps/SRE teams on CI/CD, deployment automation, environment promotion, and operational runbooks for data pipelines.
• Spearhead logical and physical data modeling efforts for enterprise analytical, operational, MDM, and AI-ready datasets.
• Design models that balance normalization, dimensional modeling, medallion/lakehouse concepts, and application-specific consumption needs.
• Create denormalized reporting and semantic-model-ready structures that simplify business consumption and reduce ambiguity for AI/LLM use cases.
• Process and optimize large data volumes in Snowflake using efficient SQL, PL/SQL-style procedural logic, Snowflake Scripting, and performance-aware design.
• Create reusable patterns for historical tracking, snapshots, audit columns, data versioning, and lifecycle management.
• Ensure data models support downstream BI, AI/ML, semantic models, data apps, MDM Explorer/Entity 360 use cases, and enterprise reporting.

Pennsylvania