Job Closed

This listing is no longer active.

SkyFi

SkyFi is an equal-opportunity employer that values and encourages workplace diversity.

Geospatial Data Platform + Label Ops Engineer

Data Engineer | Other | Remote | Mid Level | Team 11-50

Location

United States

Posted

92 days ago

Salary

$180K - $225K / year

Seniority

Mid Level

Job Description

This description is a summary of our understanding of the job description.

Role Description

As a Geospatial Data Platform / Label Ops Engineer on the AI/Advanced Engineering team, you’ll own the imagery and labeling data plane behind SkyFi’s near-real-time (NRT) satellite analytics, making diverse partner imagery fast to ingest, consistent to use, and reproducible end-to-end. You’ll build and operate scalable pipelines to normalize and catalog imagery across many sensors and providers, deliver high-performance tiling/chipping and retrieval services for training and inference, and implement dataset and label versioning and lineage so every model output and evaluation result can be traced back to the exact data used. You’ll define and maintain our labeling pipeline with QA/adjudication and auditability. Working closely with CV and runtime owners, you’ll ship self-serve data products that speed up iteration and improve accuracy. This is a high-ownership position where you’ll be a cornerstone member of a team that is empowering the future of Geospatial AI.

Qualifications

  • Demonstrated experience building geospatial imagery pipelines at scale (raster workflows, tiling/chipping, handling heterogeneous sensors/metadata).
  • Strong data engineering fundamentals: idempotency, backfills, observability, SLAs, schema evolution, and production reliability.
  • Experience building internal data APIs/SDKs and treating data as a product.
  • Hands-on experience with labeling workflows or data QA at scale (vendor coordination, task design, QA/adjudication mechanics).
  • Ability to collaborate tightly with CV/eval owners to translate failure modes into actionable data/labeling pipelines.

Requirements

  • Own the imagery data plane: ingest, normalize, catalog, and serve imagery + metadata across diverse sources for near-real-time and batch workloads.
  • Build and operate tiling/chipping + retrieval services optimized for training and NRT inference (spatial/temporal indexing, caching, precompute, and latency SLAs).
  • Implement dataset and label versioning + lineage so every model run / evaluation can be reproduced.
  • Build and run label ops workflows: task generation, QA, adjudication, gold-check insertion, auditability, throughput tracking.
  • Create data products for internal consumers (APIs/services) that let CV engineers self-serve imagery chips, labels, and eval sets.
  • Build robust backfill/reprocessing pipelines (idempotent, observable, safe incremental recompute) to support new analytics and changing requirements.
  • Establish data health monitoring (freshness, completeness, corruption, sensor distribution drift, metadata validation) with alerts and dashboards.
  • Partner with evaluation and runtime owners to close the loop of failure buckets -> labeling requests -> dataset versions -> retraining/eval.
  • Partner with computer vision researchers to define image and label strategies for new projects.
  • Ensure everyone has the images, data, and labels they need.

Benefits

  • Be well compensated. Possibility for equity.
  • Receive best-in-class benefits, including premium medical, dental, and vision coverage and 20 days paid time off.
  • Play a critical role in building a market-changing product in the exciting realm of Space.
  • Thrive in a fast-paced, dynamic environment that rewards initiative, innovation, and getting things done.

Salary Band

$180,000–$220,000 USD base salary
