The Future is Now; Beyond Boundaries, Beyond Imagination
Staff Data Engineer
Location
Colombia
Posted
10 days ago
Salary
0
Seniority
Lead
Job Description
Staff Data Engineer
Gugu Robotics
• Define data architecture and platform strategy, leading design across pipelines, warehouses, and data lakes
• Build and optimize scalable data pipelines supporting batch and real-time processing
• Define and enforce data governance, quality standards, and compliance frameworks across the platform
• Build monitoring, logging, and alerting for data pipelines and services, and contribute to CI/CD workflows for data deployment and automation
• Drive data platform modernization, optimizing for performance, cost, and scalability
• Bring an AI-forward mindset to your daily work, using tools like Claude, Cursor, and other modern AI assistants to ship higher-quality work at pace
• Design and implement data contracts and event flows in collaboration with backend, platform, and engineering teams
• Lead the design and implementation of data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows
• Integrate data services with APIs, middleware, and third-party systems to support end-to-end data consumption
• Partner with leadership on data strategy, translating technical depth into decisions others can act on
• Collaborate closely with engineering, analytics, AI, and product teams to align data platforms with broader goals
• Advocate for data quality, governance, and platform best practices across teams and engagements
• Establish data engineering standards that lift the quality and consistency of work across the team
• Mentor junior and mid-level engineers, helping them grow their craft, confidence, and impact
• Make high-stakes architectural decisions with clear ownership and consideration of long-term tradeoffs
Job Requirements
- 7+ years of professional data engineering experience, with experience leading complex data platform initiatives
- Strong system architecture background with expertise in distributed data systems
- Expert proficiency in Python, Scala, and SQL
- Deep expertise with cloud-native data platforms and enterprise data warehousing
- Strong expertise in data pipeline orchestration and processing
- Strong experience with streaming platforms and real-time data processing (e.g., Kafka, Kinesis, Pub/Sub)
- Strong data modeling expertise and experience with data transformation
- Strong experience with data quality, governance, and compliance frameworks
- Strong experience with container orchestration and CI/CD for data systems
- Strong experience building data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows
- Demonstrated leadership and technical mentoring experience across a team or organization
- Strong stakeholder communication skills, with the ability to translate technical depth across audiences
- Demonstrable, day-to-day usage and expert knowledge of AI-forward coding tools such as Claude and Cursor
- Excellent problem-solving skills and the ability to navigate highly ambiguous technical and business challenges with sound judgment
- Experience with data mesh or data fabric concepts, lakehouse architectures, or governance framework implementation is a plus
Benefits
- Competitive salary
- Flexible working hours
- Professional development opportunities
More Data Engineer Jobs
• Leverage your data and healthcare experience to create and communicate the vision and roadmap of Wellth’s data ingestion, warehousing, transformation, analytics, and reporting systems.
• Contribute to the design and implementation of Wellth’s data product capabilities (e.g., outcomes reporting and personalization).
• Bring to life our next generation of attributed outcomes reporting that drives customer expansion. These improved member health outcomes are what we sell!
• Manage and grow a high-caliber team of data and analytics engineers.
• Deliver the vision by defining and driving the execution of data projects that deliver better health outcomes for members and clients.
• Collaborate with product team members, technical leaders, and senior executives to understand and serve internal and external stakeholders.
• Continuously maintain and improve the data quality bar throughout the data life cycle. You have experience working with data quality challenges that emerge from partner and third-party data.
• Protect the privacy of Wellth’s data, its customers, and their patients by following secure SDLC guidelines to maintain our HITRUST certification.
Role Description
We’re partnering with one of the world’s leading construction equipment manufacturers to reinvent how the industrial world builds, operates, and makes decisions. This is a rare chance to work shoulder-to-shoulder with a global market leader, tackling high-impact problems with technology and taking bold ideas from 0 to 1. As a Data Science lead on this venture, you’ll have the mandate and autonomy to turn raw concepts into real businesses: scoping, building, and launching software that can transform an industry. You'll act as the principal expert while leading data science for the venture. You will collaborate with your venture team to design and implement data pipelines, guide data science best practices, oversee data monitoring, and build AI/ML solutions.
Your duties will include:
- Assisting in validating the initial business concepts
- Ideating and validating AI/ML use cases
- Developing prototypes and proofs of concept
- Working closely with the product and engineering team members to drive practical outcomes
You’ll have the support you need to drive new innovation in impactful domains ripe for change and growth, and to be at the forefront of data science and technology advancements in the field.
In this role, you will:
- Act as the primary owner of Data Science, Analytics, and Data Engineering as a subject matter expert
- Build out a team and incubate expertise amongst several venture start-ups
- Create rapid proofs of concept, then scale into functional MVPs to turn concepts into tangible reality
- Mentor and support early-stage venture teams to achieve bigger outcomes at a greater scale in data density and system complexity
- Cross-pollinate learnings, best practices, and insights between multiple ventures to drive continuous improvement of Engineering and Data Science at UP.Labs
- Enjoy working in a diverse, dynamic, collaborative, transparent, and inclusive environment where all ideas and opinions are equally valued
Qualifications
- 8 years of experience within Data Science, Data Engineering, and Machine Learning domains and their practical applications
- Experience working in a startup environment; experience as a member of a founding team is ideal
- Hands-on experience with machine learning algorithms like random forest, linear and logistic regressions, gradient boosting, classification
- Familiarity with and preference for working in ambiguous, fast-paced environments such as startups and growth-phase tech companies
- Hands-on and end-to-end product build, development, and delivery experience
- Experience working with or managing and leading remote, distributed teams including full-time data scientists, engineers and vendors/contractors
- Awareness of the latest in Data Science and Data Engineering trends, as well as new use cases within the ML space
- Experience working with major cloud environments (Azure, GCP, AWS) and cloud-native software architectures
- Experience with database and warehouse solutions, e.g. Databricks, Snowflake, Fivetran, dbt, and the respective public cloud data infrastructure service offerings from AWS, GCP, and Azure
- Strong experience in collaborating with Product teams to find effective solutions
- Strong communication skills put to use by explaining technical vision and deeply technical concepts to a variety of multidisciplinary team members
- An open, curious, and humble mindset that builds on our open, inclusive, and collaborative environment
Additional Desired Competencies
- Experience with systems planning in the domains of transportation, aviation, or digital simulation would be valuable
- Containerized deployment tools like Kubernetes and Docker
- Infrastructure as Code tools like Terraform and OpenTofu
- Familiarity with A/B testing setup and analysis
- Familiarity with reinforcement learning for practical use cases
Location
Santa Monica, CA or Remote
Travel
6+ weeks per year
Senior Data Architect
Nimble Gravity
Data Science, Digital Transformation and eCommerce Strategy from experienced eCommerce and AI/ML experts
• Collaborate with cross-functional teams to understand business requirements and translate them into effective cloud-based data architecture solutions.
• Design data models, data flow diagrams, and schema structures that optimize data storage, retrieval, and processing on cloud platforms.
• Design and develop data integration strategies to consolidate data from diverse sources into a unified and coherent format.
• Evaluate and select appropriate cloud services and tools for various data-related tasks.
• Work closely with data engineers, data scientists, and other stakeholders to understand their needs and provide architectural guidance.
• Responsible for overseeing the successful execution of data projects and ensuring the outcomes meet business objectives and technical requirements.
• Lead and mentor junior team members, promoting knowledge sharing and continuous learning.
Senior Data Engineer
Ceresti Health
Everyone else treats the patient. We activate the caregiver, because that’s where dementia care really begins.
• Design and own Ceresti’s end-to-end data architecture: a landing zone with secure cloud object storage for raw partner files and API payloads, validated ingestion pipelines into our transactional Postgres, and a curated analytics layer that decouples reporting and AI workloads from production
• Build ingestion pipelines for the data we receive today, including partner data files (CSV/JSON/XML/HL7/X12 as applicable) and REST/SFTP API integrations, with schema validation, quarantine of bad records, and full lineage from raw bytes to curated row
• Stand up and operate the curated layer (data warehouse / lakehouse-lite) so analytics and ML models can consume data without slowing down the transactional system
• Choose, integrate, and operate the smallest set of tools needed, including object storage, an orchestrator (Dagster, Prefect, Airflow, etc.), dbt or similar for transformations, and a single validation library (Great Expectations / Pandera / Soda)
• Design and enforce data governance for a HIPAA-regulated environment: PHI/PII classification, encryption in transit and at rest, role-based access, audit logging, retention and minimum-necessary policies, and de-identification where appropriate
• Partner with backend, ML, product, and clinical stakeholders to define data contracts with our health plan and ACO partners and hold the line on data quality
• Build and maintain reliable feature data for ML models, including embeddings (e.g., pgvector) and curated feature tables for risk stratification, engagement, and outcomes work
• Instrument the data platform for observability, including pipeline SLAs, data freshness, schema drift, and quality metrics, and act on what the data tells you
• Participate fully in our Agile process: backlog grooming, sprint planning, demos, and retrospectives
• Mentor engineers across the team on SQL, schema design, and the craft of building data systems that are boring in the best possible way




