Job Closed
This listing is no longer active.
Co-creating solutions for a better future
Data Engineer
Location
Colombia
Posted
105 days ago
Salary
0
Seniority
Senior
Job Description
Data Engineer
Stefanini LATAM
- Design, develop, and maintain data pipelines and ETL processes to ensure an efficient, reliable flow of data from diverse sources into our data warehouse.
- Implement data validation and quality processes to ensure data accuracy and integrity.
- Perform data modeling and database design to optimize data storage and retrieval.
- Optimize database performance and scalability to handle massive data volumes.
- Monitor and troubleshoot data systems and pipelines to identify and resolve issues or bottlenecks.
- Develop and maintain documentation for data engineering processes, systems, and best practices.
- Collaborate with cross-functional teams to ensure data integrity and security.
Job Requirements
- Proven experience as a data engineer or in a similar role.
- Strong command of SQL and experience with relational databases (e.g., PostgreSQL, MariaDB, MySQL).
- Experience with data modeling and database design principles.
- Familiarity with ETL tools and processes.
- Knowledge of cloud-based data platforms (e.g., AWS, Azure, Google Cloud).
- Understanding of data warehousing concepts and technologies.
- Proficiency in programming languages (e.g., Python, Java, Scala).
- Experience using Big Data tools such as Hadoop and Spark.
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration skills.
- Ability to work independently and in a team-oriented environment.
- Attention to detail and a commitment to high-quality work.
Benefits
- Flexible work arrangements
- Professional development opportunities
More Data Engineer Jobs
Geospatial Data Platform + Label Ops Engineer
SkyFi
SkyFi is an equal-opportunity employer that values and encourages workplace diversity.
Role Description
As a Geospatial Data Platform / Label Ops Engineer on the AI/Advanced Engineering team, you’ll own the imagery and labeling data plane behind SkyFi’s near-real-time satellite analytics, making diverse partner imagery fast to ingest, consistent to use, and reproducible end-to-end. You’ll build and operate scalable pipelines to normalize and catalog imagery across many sensors/providers, deliver high-performance tiling/chipping and retrieval services for training and inference, and implement dataset + label versioning and lineage so every model output and evaluation result can be traced back to the exact data used. You’ll define and maintain our labeling pipeline with QA/adjudication and auditability. Working closely with CV and runtime owners, you’ll ship self-serve data products that speed up iteration and improve accuracy. This is a high-ownership position where you’ll be a cornerstone member of a team that is empowering the future of Geospatial AI.
Qualifications
- Demonstrated experience building geospatial imagery pipelines at scale (raster workflows, tiling/chipping, handling heterogeneous sensors/metadata).
- Strong data engineering fundamentals: idempotency, backfills, observability, SLAs, schema evolution, and production reliability.
- Experience building internal data APIs/SDKs and treating data as a product.
- Hands-on experience with labeling workflows or data QA at scale (vendor coordination, task design, QA/adjudication mechanics).
- Ability to collaborate tightly with CV/eval owners to translate failure modes into actionable data/labeling pipelines.
Requirements
- Own the imagery data plane: ingest, normalize, catalog, and serve imagery + metadata across diverse sources for near-real-time and batch workloads.
- Build and operate tiling/chipping + retrieval services optimized for training and NRT inference (spatial/temporal indexing, caching, precompute, and latency SLAs).
- Implement dataset and label versioning + lineage so every model run / evaluation can be reproduced.
- Build and run label ops workflows: task generation, QA, adjudication, gold-check insertion, auditability, throughput tracking.
- Create data products for internal consumers (APIs/services) that let CV engineers self-serve imagery chips, labels, and eval sets.
- Build robust backfill/reprocessing pipelines (idempotent, observable, safe incremental recompute) to support new analytics and changing requirements.
- Establish data health monitoring (freshness, completeness, corruption, sensor distribution drift, metadata validation) with alerts and dashboards.
- Partner with evaluation and runtime owners to close the loop of failure buckets -> labeling requests -> dataset versions -> retraining/eval.
- Partner with computer vision researchers to define image and label strategies for new projects.
- Responsible for making sure everyone has the images/data/labels they need.
Benefits
- Be well compensated. Possibility for equity.
- Receive best-in-class benefits, including premium medical, dental, and vision coverage and 20 days paid time off.
- Play a critical role in building a market-changing product in the exciting realm of Space.
- Thrive in a fast-paced, dynamic environment that rewards initiative, innovation, and getting things done.
Salary Band
$180,000–$220,000 USD base salary

- Design, build, and maintain scalable data pipelines and ingestion frameworks
- Deliver high-value POCs to stabilize and build a strong foundation for an enterprise data platform
- Develop custom ingestion, optimize data workflows
- Ensure reliable data delivery into Snowflake or other cloud-based platforms
- Collaborate with analytics, product, and engineering teams to enable data-driven decision-making

- Design, implement, and optimize data engineering architectures and processes, ensuring scalability, security, and efficiency.
- Develop and implement data engineering solutions in Azure environments (Data Factory, Synapse Analytics, Databricks, Data Lake, SQL Database).
- Build high-performance, reliable data pipelines (ETL/ELT).
- Work on data modeling, system integration, and information governance.
- Ensure best practices for performance, security, and scalability.
- Collaborate with multidisciplinary teams and international stakeholders.
- Participate in meetings and presentations in advanced English.

- Build, test, and document Snowflake data models and business logic in dbt
- Apply and improve data quality, testing, observability, and lineage standards
- Collaborate with cross-functional partners to define data contracts and interfaces
- Contribute to Capital Rx’s modular data platform and client-specific data configurations
- Participate in design reviews; propose scalable, maintainable patterns
- Monitor pipeline health, troubleshoot incidents, and drive root-cause fixes
- Optimize cost and performance of jobs, storage, and queries with guidance
- Write clear documentation and support knowledge sharing
- Adhere to the Capital Rx Code of Conduct, including reporting of noncompliance



