Job Closed
This listing is no longer active.
Empowering the people that make global commerce happen.
Data Architect, Data Platform – Azure
Location
United States
Posted
161 days ago
Seniority
Lead
Job Description
CargoSprint
- Own the design and delivery of our data warehouse and data marts: models, schemas, semantic consistency, and documentation.
- Build and evolve reporting foundations: curated datasets, metric definitions, and repeatable reporting patterns.
- Partner with stakeholders to define source-of-truth metrics and ensure consistent definitions across teams.
- Establish standards for data quality, testing, reconciliation, lineage, access control, and lifecycle management.
- Drive modernization: reduce legacy data debt, simplify flows, and improve reliability and performance across the platform.
- Define data contracts with application teams (schemas, events/CDC patterns) so downstream reporting is stable.
- Enable self-serve analytics by making data discoverable, documented, and safe to use.
Job Requirements
- 8+ years in data engineering, analytics engineering, or platform engineering with architecture ownership
- Proven experience building data warehouses and marts that support business reporting at scale
- Strong SQL and data modeling skills (dimensional modeling, star schemas, and/or semantic models)
- Experience with Azure data platforms (Synapse/Fabric, ADLS, Databricks on Azure, ADF) or comparable equivalents (Snowflake, BigQuery, Redshift)
- Experience with orchestration and transformation tooling (ADF, Airflow, dbt, Dagster, or equivalent)
- Track record of modernization with incremental migration and clear deprecation plans
- Ability to align engineering and business stakeholders around shared definitions and priorities
Nice to have:
- Experience with BI layers and semantic modeling (Power BI preferred; Tableau/Looker also fine)
- Streaming/event-driven data patterns (Kafka/Kinesis/PubSub) or CDC experience
- Payments, billing, invoicing, or other high-volume transactional domains
- Logistics, cargo, or supply chain experience
- Spanish language proficiency
Benefits
- Medical, dental, and vision plans for you and your family
- 401(k) with company match
- Generous flexible PTO program and paid holidays
- Professional development opportunities
More Data Engineer Jobs
- Build and maintain data pipelines that support ingest, logging, validation, transformation, and secure data access.
- Contribute to the development and improvement of scalable ETL/ELT processes under guidance from senior engineers.
- Implement and evolve data models and storage patterns that support analytics and operational use cases.
- Apply data quality checks, monitoring, and documentation to ensure data is accurate, reliable, and well understood.
- Collaborate with internal teams and clients to understand data requirements and support analytics use cases.
- Contribute to the ongoing development of Expression’s VOR Data Platform, including tooling, standards, and automation.
- Participate in research, prototypes, and proof-of-concept work to evaluate new data and analytics technologies, including emerging AI-driven tools.
- Follow established engineering practices and contribute ideas to improve reliability, performance, and developer experience.
- Design, maintain, and validate data schemas supporting federation and integration between C2SET and external systems
- Build and support SQL- and Python-based ETL pipelines for operational, simulation, and analytics data
- Ensure data integrity, correctness, and performance across distributed and multi-tier data sources
- Troubleshoot data mismatches, malformed messages, schema drift, and integration issues
- Partner with modeling and simulation engineers to analyze simulation outputs and support tuning of behaviors and decision logic
- Design and execute data-driven experiments to evaluate model changes and operational impacts
- Develop datasets, scripts, and tooling to support repeatable validation and performance analysis
- Support Government analysts and integrators by delivering timely, reliable data refreshes and analysis
- Perform database performance tuning and optimization for operational workloads
- Maintain data documentation, metadata repositories, and data governance artifacts
- Act as a functional lead for data engineering and analytics activities within the integrations team
Data Migration Specialist
Prospyr Medical
A HIPAA-compliant solution that makes it easy for aesthetics providers to manage and grow their practices.
- Own end-to-end data migrations for new and expanding customers
- Import and validate patient, appointment, invoice, payment, provider, service, and membership data
- Map data from a wide range of legacy systems (EMRs, POS tools, spreadsheets, exports)
- Identify, clean, normalize, and reconcile inconsistent or incomplete data
- Perform QA checks to ensure data accuracy, completeness, and integrity post-migration
- Work directly with customers during onboarding to define migration scope and timelines
- Explain data requirements, limitations, and tradeoffs in a clear, customer-friendly way
- Support go-live readiness by ensuring migrated data aligns with customer workflows
- Troubleshoot and resolve migration issues quickly and accurately
- Maintain and improve migration playbooks, templates, and checklists
- Document repeatable patterns for common legacy systems
- Partner with Engineering and Product to improve migration tooling and automation
- Surface recurring data issues and upstream product improvements
- Partner closely with Customer Experience and Implementation teams on go-live execution
- Coordinate with Engineering on complex migrations or edge cases
- Provide internal visibility into migration status, risks, and blockers
- Act as the transition point between Prompt Engineering and Data Labeling, translating model and product requirements into concrete data and annotation workflows.
- Design, implement, and maintain scalable data workflows for dataset generation, curation, and ongoing maintenance.
- Ensure data quality and consistency across labeling projects, with a focus on operational reliability for production AI systems.
- Create, review, and maintain high-quality annotations across multiple modalities, including text, audio, conversational transcripts, and structured datasets.
- Identify labeling inconsistencies, data errors, and edge cases; propose and enforce corrective actions and improvements to annotation standards.
- Utilize platforms such as Labelbox, Label Studio, or Langfuse to manage large-scale labeling workflows and enforce consistent task execution.
- Use Python and SQL for data extraction, validation, transformation, and workflow automation across labeling pipelines.
- Leverage LLMs (e.g., GPT-4, Claude, Gemini) for prompt-based quality checks, automated review, and data validation of annotation outputs.
- Implement automated QA checks and anomaly-detection mechanisms to scale quality assurance for large datasets.
- Analyze annotation performance metrics and quality trends to surface actionable insights that improve labeling workflows and overall data accuracy.
- Apply statistical analysis to detect data anomalies, annotation bias, and quality issues, and partner with stakeholders to mitigate them.
- Collaborate with ML and Operations teams to refine labeling guidelines and enhance instructions based on observed patterns and error modes.
- Work closely with Prompt Engineering, Data Labeling, and ML teams to ensure that data operations align with model requirements and product goals.
- Document data standards, annotation guidelines, and workflow best practices for use by internal teams and external labeling partners.




