Job Closed
This listing is no longer active.
Junior Data Engineer
Location
Texas
Posted
95 days ago
Salary
$40 - $44 / hour
Seniority
Junior
Job Description
Junior Data Engineer
Del Oro Consulting, Inc.
- Assist with connecting systems to data sources
- Help manage and maintain data connections in Azure
- Build and maintain ETL pipelines (Extract, Transform, Load)
- Help automate manual data processes
- Work with large, enterprise-level datasets
- Document data sources, pipelines, and processes
Job Requirements
- Bachelor’s degree in computer science or data engineering
- 2+ years of Data Engineering experience
- Intermediate Python skills (will be tested)
- Understanding of ETL and data pipelines
Benefits
- Medical
- Dental
- Vision
- 401(k) (with a match)
More Data Engineer Jobs
- Supervise junior members of the data engineering team by guiding, planning, and reviewing their work
- Create and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional and non-functional business requirements
- Extend our machine learning platform by designing tools that interface with cloud services and our current code base and that provide new flexibility in model building
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Python, and AWS
- Build analytics tools that provide actionable insights into key business performance metrics and support the needs of the analytics team
- Create data-handling tools that help analytics and data science team members build and optimize our decision-making process
Geospatial Data Engineer
Orcrist Technologies GmbH
Pioneering Future Technologies with Advanced AI and Data Analytics
- Build and operate data pipelines that supply GEOINT services with accurate, compliant, and performant spatial data
- Own ingestion, transformation, versioning, and distribution across cloud and air-gapped environments
- Collaborate with the Data Analytics Team in creating value-adding data products
- Develop ingestion pipelines using Python, GDAL, Rasterio, tippecanoe, and PostGIS for vector/raster/3D datasets
- Automate tiling, generalization, and 3D tile generation (Cesium 3D Tiles, quantized mesh, terrain) with incremental update workflows
- Implement data quality checks (topology validation, completeness, coordinate reference integrity) and provenance tracking (lineage metadata, checksums)
- Manage storage lifecycle across cloud (S3/GCS) and on-prem object stores; optimize performance and cost
- Package data for offline distribution (MBTiles, GeoPackages, zipped 3D tiles), including delta updates and secure transfer
- Collaborate with Data Acquisition and Licensing to enforce usage rights, export control, and compliance
- Monitor pipelines (Prometheus, Grafana), maintain runbooks, and participate in on-call/incident response
- Own end-to-end sourcing of new geospatial datasets (commercial and freely available)
- Design and maintain a unified data architecture: database schemas, data models, and micro-architecture solutions to ensure scalability and reliability
- Optimize database performance at all levels: indexing, partitioning, clustering, and tuning configuration parameters
- Ensure full compliance with GDPR, the UK Data Protection Act, and other relevant regulations: data masking, consent management, retention policies, and privacy impact assessments
- Optimize queries, schemas, and indexes where needed
- Set up basic data quality checks
- Support GDPR and UK data protection requirements, including data masking, access control, and retention policies
- Take data notebooks and calculation logic and turn them into reliable, production-ready pipelines
- Ensure scalability, reliability, and reproducibility
- Implement real-time data pipelines with MQTT and Redpanda for stream processing.
- Implement offline data pipelines using Dagster for batch processing.
- Parse and process binary message formats from various data sources.
- Build data warehouses using Postgres, Apache Iceberg, Parquet, and S3.
- Design data models that allow for high-performance queries.
- Validate and normalize data sources.
- Improve local development and CI/CD using modern tooling and GitHub Actions.




