Data Science, Digital Transformation and eCommerce Strategy from experienced eCommerce and AI/ML experts
Senior Data Engineer
Location
Latin America
Posted
3 days ago
Salary
0
Seniority
Senior
Job Description
Senior Data Engineer
Nimble Gravity
• Build, scale, and maintain robust data solutions.
• Implement and optimize high-performance data pipelines (extraction, loading, transformation, and orchestration) designed for scalability, reliability, maintainability, and speed.
• Champion modern software engineering practices such as CI/CD, infrastructure-as-code, containerization, and cloud-native deployments.
• Collaborate closely with business stakeholders to transform use cases into production-ready services and solutions, owning the system from concept to production.
• Implement rigorous testing and monitoring practices to maintain superior data quality and integrity.
Job Requirements
- A bachelor's degree or higher in a STEM field, required
- Concentration in Computer Science, Math, Physics or other engineering related field, preferred
- 5+ years of experience in data engineering or a related discipline, with a proven track record of success.
- Expertise in Python and SQL, with a strong foundation in data manipulation and analysis.
- Proficient with Databricks/PySpark and dbt for data warehousing and data transformation tasks.
- Experience with workflow orchestration tools (e.g., Airflow, Dagster).
- Experience working with large language models (LLMs), especially prompt engineering, retrieval-augmented generation (RAG), and/or vector databases, is a plus.
- Knowledge of the fundamental principles of machine learning, feature engineering, and knowledge graphs is a plus.
- Demonstrated experience in designing and implementing complex data systems from the ground up.
- Proficient in handling large-scale data projects, including data cleaning, ETL, and information retrieval.
- Excellent communication skills required, both verbal and written.
More Data Engineer Jobs
Data Engineer
Oscilar
AI Risk Decisioning™ platform that helps organizations manage onboarding, fraud, credit, and compliance risks
• Architect and implement scalable ETL and data pipelines spanning ClickHouse, Postgres, Athena, and diverse cloud-native sources to support real-time risk management and advanced analytics for AI-driven decisioning.
• Design, develop, and optimize distributed data storage solutions to ensure both high performance (low latency, high throughput) and reliability at scale, serving mission-critical models for fraud detection and compliance.
• Drive schema evolution, data modeling, and advanced optimizations for analytical and operational databases, including sharding, partitioning, and pipeline orchestration (batch, streaming, CDC frameworks).
• Own the end-to-end data flow: integrate multiple internal and external data sources, enforce data validation and lineage, automate and monitor workflow reliability (CI/CD for data, anomaly detection, etc.).
• Collaborate cross-functionally with engineers, product managers, and data scientists to deliver secure, scalable solutions that enable fast experimentation and robust operationalization of new ML/AI models.
• Champion radical ownership: identify opportunities, propose improvements, and implement innovative technical and process solutions within a fast-moving, remote-first culture.
• Mentor and upskill team members, cultivate a learning environment, and contribute to a collaborative, mission-oriented culture.
Mid/Senior Data Engineer
Modus Create
Modus Create is a consulting firm founded in 2011 to help clients transform their businesses to succeed in the digital future. Modus Create employs a fully dist
Role Description
We are looking for a Mid/Senior Data Engineer to join our Data Engineering practice and help clients build modern data foundations on Databricks and AWS.
- Design and build data pipelines that extract from enterprise ERP systems, transform through medallion architectures, and deliver governed, AI-ready data products.
- Work directly with client subject-matter experts to understand business domains, validate data models, and ensure the platform is production-grade from day one.
- Current engagements involve regulated manufacturing environments where data governance, quality management, and traceability are essential.
- This is a fully remote role with collaboration across distributed teams and daily overlap with the US Eastern Time Zone.
Qualifications
- 4–7+ years of experience as a Data Engineer or in a closely related role.
- Strong programming skills in Python, including PySpark.
- Solid SQL skills including complex analytical queries against large enterprise databases.
- Hands-on experience with Databricks: Delta Lake, Unity Catalog, Databricks Workflows, and SQL Warehouse.
- Working knowledge of AWS core services: S3, IAM, VPC, and networking fundamentals.
- Experience building ETL/ELT pipelines that extract from enterprise ERP or transactional systems (Oracle, SAP, Microsoft Dynamics, or similar).
- Strong understanding of data modeling, medallion architectures, and dimensional design.
- Experience with data quality frameworks: validation rules, anomaly detection, and exception handling.
- Experience using AI and LLM tools to accelerate engineering workflows, including deriving data contracts, mapping specifications, and schema documentation from database metadata and limited business context.
- Comfortable collaborating directly with business stakeholders and subject-matter experts, not just engineering teams.
- Ability to participate in technical discussions, code reviews, and architectural decisions with confidence.
- Reliable high-speed internet and ability to work effectively in a remote-first environment.
- Daily overlap with US Eastern Time Zone.
Requirements
- Familiarity with Oracle E-Business Suite table structures and data patterns (INV, PO, BOM, WIP modules).
- Exposure to manufacturing domain concepts: bills of material, work orders, production routing, inventory management.
- Experience with dbt for data transformation and data product development.
- Hands-on experience with data governance and catalog tooling (Unity Catalog, AWS Glue/Datazone, Apache Atlas, or similar).
- Multi-system data integration or ERP consolidation experience, reconciling different source schemas into a unified canonical model.
- Spec-driven or contract-driven development methodology, YAML specifications, schema validation, data contracts.
- Experience in medical device, pharmaceutical, or other regulated manufacturing environments.
- Databricks Asset Bundles and CI/CD automation for data platform deployments.
- Familiarity with Apache Iceberg or Delta Lake UniForm for open table format interoperability.
- Experience supporting AI/ML workflows in production: feature engineering, model serving integration, or AI-ready data product design.
Benefits
- Building data foundations that power AI, analytics, and operational decision-making for manufacturing enterprises.
- Working directly with domain experts to understand how real businesses operate, not just pushing data through pipes.
- Solving multi-system integration challenges where no two ERPs store data the same way.
- Designing platforms with governance, observability, and data quality built in from the outset.
- Contributing to a reusable platform accelerator that will be deployed across multiple client engagements.
- Raising the bar for how data engineering is done: spec-driven, tested, version-controlled, and production-grade.
• Build and operate production-grade ingestion pipelines from core clinical, operational, and third-party systems into our Databricks lakehouse
• Develop and maintain dbt models that turn raw data into clean, well-documented, analytics-ready datasets
• Establish data quality, testing, and monitoring practices that make pipelines reliable and trustworthy
• Help shape ingestion patterns and architecture standards alongside the Principal Data Engineer
• Enable company-wide metrics for care outcomes and operations
• Collaborate with cross-functional leads to develop and iterate on a suite of core operational dashboards, ensuring teams have the self-service tools they need to track company metrics and outcomes.
• Design, build, and operate production data pipelines across clinical, operational, and third-party systems using API-based ingestion, Change Data Capture (CDC), and event- or webhook-driven patterns
• Build and maintain transformation layers in dbt, including tests, documentation, and reusable models
• Develop and refine core analytical and longitudinal data models used across the company
• Implement testing, monitoring, and observability to ensure data quality, pipeline reliability, and system performance
• Apply strong engineering fundamentals to improve the scalability, performance, and cost-efficiency of data systems on AWS and Databricks
• Partner with Product to support metric definitions, outcome measurement, and reporting needs
• Contribute to engineering standards, code review, and a culture of knowledge sharing and continuous improvement
• Partner with business, product, and engineering stakeholders to design and build intuitive data visualizations and dashboards that drive actionable insights and program visibility.
• Design and build the CDP from scratch: data ingestion, identity resolution, unified profile, activation.
• Establish a multi-tenant architecture with data isolation by client and by state.
• Build the AI enrichment layer: segmentation, predictive models (CLV, churn, propensity, next-best-product), and LLM-based preference enrichment.
• Ensure privacy-by-design and compliance in a regulated vertical.
• Work cross-functionally with product, engineering, marketing, and clients.




