A leading provider of advanced data, application, and cloud engineering services.
Lead Data Engineer – Data Architect
Location
India
Posted
2 days ago
Salary
0
Seniority
Senior
Job Description
Enable Data
Architecture & Solution Design
• 15+ years of overall experience, including 10+ years in Data Engineering and Analytics solutions.
• Design end-to-end data architectures on Microsoft Azure.
• Define data ingestion, transformation, storage, governance, and consumption strategies.
• Create scalable, secure, and cost-effective data solutions aligned with business objectives.
• Establish architectural standards, design patterns, and best practices for data engineering teams.
• Collaborate with business stakeholders, product owners, and technical teams to translate requirements into technical solutions.
• Design, develop, and modernize enterprise data platforms on Azure. Analyze, refactor, and redesign existing data pipelines and products while building new scalable data solutions.
• Act as a hands-on individual contributor with technical leadership responsibilities, including mentoring junior developers and driving best practices.
• Strong hands-on experience in PySpark, Python, and SQL
• Expertise in Azure Databricks and Azure data ecosystem
• Experience designing and implementing scalable data architectures
• Ability to analyze, optimize, refactor, and redesign existing data pipelines
• Develop and maintain ETL/ELT solutions and data products
• Performance tuning, troubleshooting, and data quality implementation
• Mentor and guide junior developers and conduct code reviews
• Collaborate with cross-functional teams to deliver end-to-end data solutions
Job Requirements
- Azure Data Lake, Azure Data Factory, Azure Synapse
- Data warehousing, lakehouse architecture, and cloud migration
- CI/CD, Git, and Agile development practices
- Role Type: Senior Individual Contributor with Technical Leadership responsibilities.
More Data Engineer Jobs
Senior Product Manager, Data Platform
Arbital Health
We are a neutral third-party adjudication utility that is accelerating the $1 trillion shift to value-based care
• Own the product roadmap for Arbital’s data pipeline platform, including ingestion, transformation, calculation, validation, audit trail, and AI consumption layers
• Define and prioritize pipeline capabilities based on client needs, implementation learnings, engineering constraints, and long-term platform scalability goals
• Translate complex healthcare data requirements from claims processing to VBC contract logic into structured, buildable product specs
• Partner with leadership to align pipeline investments with Arbital’s broader product and go-to-market strategy
• Write detailed PRDs, user stories, and technical specifications for platform features, configurations, and automation tooling
• Work directly with engineering to scope, sequence, and ship pipeline improvements, balancing speed, quality, and flexibility
• Define acceptance criteria and lead QA processes for new pipeline & platform capabilities, ensuring outputs meet accuracy and performance standards
• Drive platform delivery end-to-end, owning prioritization, cross-team dependencies, and release coordination
• Develop deep fluency in Arbital’s data models, pipeline architecture, and healthcare data standards (claims, eligibility, attribution, CMS/ACO files), and actuarial concepts (IBNR, rebates, contract terms)
• Work hands-on with data (running SQL queries, reviewing pipeline outputs, and validating logic) to inform product decisions and support debugging
• Define standards for data quality, deduplication, business rule configuration, lineage, and pipeline observability across all client environments
• Evaluate and recommend tooling improvements across the platform stack (e.g., Airflow, Databricks, AWS) in partnership with engineering
• Serve as the primary product owner for data capabilities across implementation, engineering, actuarial, and data science teams
• Partner closely with the Implementation team to surface recurring client configuration needs and turn them into scalable platform features
• Collaborate with actuarial and data science teams to ensure pipeline logic correctly supports attribution, aggregation, and actuarial use cases
• Communicate roadmap priorities, tradeoffs, and delivery status clearly to both technical teams and non-technical stakeholders
Senior Data Warehouse Developer
UR Ventures
Ensuring University of Rochester innovations enrich our community, improve our society, and make the world ever better
• Architecting, designing, developing, implementing and supporting various components for the enterprise data warehouse and associated data sources.
• Works independently and provides support for reporting projects, as well as process and architectural recommendations.
• Mentoring staff, demonstrating best practices, planning for business continuity and sustainability and working on projects that support Business Intelligence and Data Warehousing functions as assigned.
• Works with the user community and reporting team to understand business data and data transformation requirements.
• Converts those requirements to functional/technical specifications for database and integration design.
• Manages requests and/or components of larger projects with appropriate documentation and status reports.
• Architects data integration and transformation processes across on-premises and cloud data sources.
• Provides leadership and is a subject matter expert depending on area of focus.
• Designs and develops data integration and transformation processes across on-premises and cloud data sources using enterprise tools such as Informatica, Denodo, SQL Server, SSIS, Oracle Data Integrator, and Boomi AtomSphere, selected based on environment and requirements.
• Develops data models and database designs that leverage their knowledge of data warehouse modeling best practices.
• Collaborates with other team members using scrum processes or other project management disciplines depending on area of focus.
• Follows best practices for change and code management.
• Provides supporting technical documentation that is written, organized and maintained.
• Participates in ongoing stabilization, support and maintenance.
• Provides on-call support as needed for business continuity and operations.
• Attends webinars, reads case studies and white papers, and participates in opportunities to further refine skills and stay apprised of functional area topics.
• Help define and mature data integration, data consolidation, MDM integration, and data platform design patterns across Integrichain.
• Design, build, optimize, and operate Snowflake data models, pipelines, stored procedures, and high-volume data processing patterns.
• Partner with MDM and Product teams to support HCO Master data ingestion, outbound extracts, cross-reference data, golden record consumption, survivorship outputs, and downstream publishing patterns.
• Work with Product, Engineering, MDM, Data Science, DevOps, Security, and business stakeholders to align data solutions to enterprise priorities.
• Use dbt or similar ELT tooling to develop reliable, maintainable, testable, and observable data pipelines.
• Drive Snowflake performance tuning, warehouse sizing, workload management, cost tracking, and cost optimization practices.
• Partner with Data Science leadership to rationalize and consolidate the enterprise data landscape across products, platforms, and acquired capabilities.
• Define reusable data integration patterns for batch, micro-batch, near-real-time, and application-to-application data exchange.
• Collaborate with cross-functional teams to understand business data needs, source-system realities, and enterprise application integration requirements.
• Design scalable patterns for ingesting, transforming, mastering, and publishing data across operational and analytical use cases.
• Help establish standards for data contracts, schema evolution, data quality, lineage, and data ownership.
• Design and build data pipelines that load source data into Reltio MDM and extract mastered outputs from Reltio for downstream Snowflake, analytics, AI, and operational use cases.
• Partner with MDM configuration and Product Management teams to translate HCO mastering requirements into data pipeline, mapping, validation, reconciliation, and publishing patterns.
• Work with Reltio APIs, exports, crosswalks/XREFs, event-based integration patterns, and bulk load/extract mechanisms as needed to support inbound and outbound data flows.
• Engineer integration patterns for HCO Master data, including party/entity, address, identifier, hierarchy, relationship, match/merge, survivorship, and golden record outputs.
• Support source ingestion and reference data integration involving datasets such as HIN, DEA, NPI, NCPDP, 340B/PHS, channel outlet data, customer/account data, and other life sciences master/reference sources.
• Develop validation and reconciliation processes to compare source data, Reltio mastered data, Snowflake curated data, and downstream consumption layers.
• Help operationalize MDM outputs for business-facing data products, semantic models, reporting tables, APIs, and AI-ready datasets.
• Design Snowflake database, schema, table, view, and semantic-layer patterns that support performance, governance, and maintainability.
• Optimize Snowflake workloads using clustering, micro-partition awareness, warehouse sizing, query profiling, caching behavior, and workload isolation.
• Implement Snowflake cost tracking and optimization practices, including warehouse utilization monitoring, inefficient query identification, and cost allocation by workload, team, or use case.
• Build scalable SQL and Snowflake stored procedure logic for large-volume data processing and analytical workloads.
• Apply secure Snowflake design patterns including RBAC, masking, access isolation, auditing, and environment separation.
• Design, build, and maintain reliable ELT pipelines using dbt or comparable modern data transformation tooling.
• Develop Python-based automation for API integration, file processing, metadata management, validation, orchestration support, and operational tooling.
• Develop modular, tested, and reusable transformation models for raw, curated, mastered, and business-ready data layers.
• Implement automated data quality checks, source freshness checks, reconciliation, logging, and exception-handling patterns.
• Build orchestration-ready pipelines that support dependency management, restartability, incremental loads, and operational monitoring.
• Collaborate with DevOps/SRE teams on CI/CD, deployment automation, environment promotion, and operational runbooks for data pipelines.
• Spearhead logical and physical data modeling efforts for enterprise analytical, operational, MDM, and AI-ready datasets.
• Design models that balance normalization, dimensional modeling, medallion/lakehouse concepts, and application-specific consumption needs.
• Create denormalized reporting and semantic-model-ready structures that simplify business consumption and reduce ambiguity for AI/LLM use cases.
• Process and optimize large data volumes in Snowflake using efficient SQL, PL/SQL-style procedural logic, Snowflake Scripting, and performance-aware design.
• Create reusable patterns for historical tracking, snapshots, audit columns, data versioning, and lifecycle management.
• Ensure data models support downstream BI, AI/ML, semantic models, data apps, MDM Explorer/Entity 360 use cases, and enterprise reporting.
Product Manager/Technical Product Owner – Data Platform, AI Readiness
Software Mind
Software House focused on results since 1999
• Co-define and evolve the data platform roadmap in collaboration with architecture and engineering teams
• Translate technical and business requirements into epics, user stories, and technical backlog items
• Work closely with Data Engineers and Architects on:
  - data models and architectures (batch/streaming)
  - data pipelines and ingestion frameworks
  - storage (e.g. data lake / data warehouse) and processing layers
• Support design and implementation of platform components for:
  - machine learning workflows (MLOps, feature stores, model lifecycle)
  - data observability, lineage, and quality monitoring
• Ensure datasets are reliable, well-structured, and ready for analytics and ML use cases
• Participate in technical discussions and architecture decisions (not just coordination)
• Drive delivery in an agile setup (planning, backlog refinement, prioritisation)
• Communicate technical concepts in a clear way to non-technical stakeholders




