Job Closed

This listing is no longer active.

LeoLabs

Persistent Orbital Intelligence 📡 🛰️

Data Engineer

Data Engineer | Full Time | Remote | Junior | Team 51-200 | Since 2016 | H1B Sponsor

Location

United States

Posted

51 days ago

Salary

Seniority

Junior

Bachelor Degree | 0-2 yrs exp | English

Airflow Apache AWS Cloud ETL Kafka Python Spark SQL

Job Description

• Play a key role in building and operating data pipelines and analytics infrastructure
• Work closely with software engineers, radar and catalog teams, and data scientists
• Ensure reliable extraction, transformation, and loading (ETL) of mission-critical datasets
• Develop scalable batch and streaming data workflows
• Enable advanced analytics and support machine learning initiatives
• Help transform large volumes of sensor and orbital data into actionable intelligence
• Engage in hands-on development with opportunities to grow into increased ownership of data platform design and optimization

Job Requirements

B.S. or M.S. in Computer Science, Data Science, Engineering, Mathematics, Physics, or equivalent experience
0-2 years of experience in data engineering, software engineering, analytics engineering, or related technical roles
Experience designing and building data pipelines or ETL/ELT workflows
Hands-on experience with Databricks, Apache Spark, or distributed data processing frameworks
Proficiency in Python and SQL for data transformation and analysis
Familiarity with data modeling concepts and modern data lake or warehouse architectures
Experience working in cloud-native environments (AWS preferred)
Understanding of software development best practices including version control, testing, and CI/CD
Strong analytical mindset and ability to troubleshoot complex data issues
Effective communication skills and ability to collaborate across distributed engineering teams
Ability to participate in operational support rotations during critical incidents
Experience supporting data science or machine learning workflows, including feature engineering pipelines
Familiarity with Delta Lake, Lakehouse architectures, or large-scale telemetry data processing
Exposure to streaming data systems such as Kafka or Spark Structured Streaming
Experience with workflow orchestration tools such as Airflow or Databricks Workflows
Background in orbital mechanics, aerospace, physics, or applied mathematics
Experience building analytics datasets or semantic models for BI tools
Active U.S. security clearance or ability to obtain one

Benefits

Global workforce: flexible remote/hybrid opportunities
Work on complex, meaningful missions with real-world impact
Unlimited paid time off for most roles
Competitive salary and equity packages
Comprehensive health, dental, and vision coverage
Access to the forefront of commercial space operations and defense innovation

More Data Engineer Jobs

Lead Metadata Specialist

McGraw Hill LLC.

The work you do at McGraw Hill will be work that matters. We are collectively designing content that will build the future of education. Play your part and experience a sense of fulfilment that will inspire you to even greater heights.

Data Engineer | 51 days ago

Full Time | Remote | Team 1,001-5,000

Role Description

We are seeking a Lead Metadata Specialist to guide metadata strategy and implementation across McGraw Hill’s Higher Education business unit. This role leads projects that advance the design, management, and use of educational metadata (especially around competencies, objectives, and assessment and other types of content) to improve discoverability, personalization, and analytics across products and platforms. The Lead Metadata Specialist will collaborate with curriculum, product, and technical teams to ensure alignment across metadata workflows and business goals. This role also provides mentorship and leadership for metadata specialists and coordinates with the Enterprise Metadata team to support enterprise-level initiatives. This is a remote position open to applicants authorized to work for any employer within the United States.

What You'll Do

- Lead the design and execution of metadata projects that enhance the creation, delivery, and management of higher education content while enabling robust personalized learning services.
- Partner with learning and data scientists to explore opportunities for metadata inference and enrichment to support adaptive learning and personalization.
- Develop and maintain learning ontologies and controlled vocabularies for higher education academic disciplines, content, and learning services.
- Collaborate with curriculum, product, and technical teams to define metadata strategies that align with business unit goals and emerging learning technology trends.
- Provide technical leadership in implementing scalable metadata workflows and quality assurance processes using standard tools and platforms.
- Lead and collaborate with Metadata Specialists, fostering consistency, capacity, and growth across metadata initiatives.
- Collaborate cross-functionally to develop documentation, training materials, and communications that promote understanding and effective use of metadata across design and product teams.
- Monitor external developments in metadata standards and best practices for higher education and integrate relevant frameworks to ensure alignment and innovation.

Qualifications

- 6+ years in education or educational technology, with at least 3 years direct experience managing educational metadata, learning objective frameworks, and/or competency structures in Higher Education.
- Master’s degree in education, learning sciences, information science, or a related field (required or equivalent experience).
- Advanced understanding of metadata standards and interoperability frameworks used in Higher Education.
- Experience with e-book, assessment, and other educational content design for Higher Education.
- Highly organized, self-motivated, able to manage multiple complex projects simultaneously.
- Strong leadership and mentoring capabilities.
- Growth mindset and openness to change, with a positive attitude and interest in improving over time.
- Excellent communication skills, able to bridge technical and non-technical audiences.

Preferred

- Experience working with AI tools and awareness of their potential for educational content and interactions.
- Experience designing and implementing educational recommendation systems.
- Proficiency with collaboration and project tools such as JIRA, Confluence, Teams, and Slack.

Benefits

- The pay range for this position is between $125,000 - $155,000 annually.
- Base pay offered may vary depending on job-related knowledge, skills, experience, and location.
- An annual bonus plan may be provided as part of the compensation package.
- A full range of medical and/or other benefits, depending on the position offered.

AI JIRA Confluence Slack

United States

$125K - $155K / year

data entry

Marion Counseling Services

Data Engineer | 51 days ago

Full Time | Remote

Role Description

Join Marion Counseling Services as a vital member of our team. This position offers an exciting opportunity to support our operations by accurately inputting and managing data essential for our counselling services.

- Enter and maintain data in various systems with a high degree of accuracy.
- Assist in the preparation of reports and documentation as required.
- Ensure confidentiality and security of sensitive information.
- Collaborate with team members to improve data management processes.
- Respond to inquiries regarding data entries and assist in troubleshooting issues.

Qualifications

- Proven experience in data entry or a related field.
- Strong attention to detail and accuracy.
- Proficiency in using data entry software and Microsoft Office Suite.
- Excellent organisational skills and ability to manage multiple tasks.
- Effective communication skills, both written and verbal.

Requirements

- Experience in the healthcare or counselling sector.
- Familiarity with data management systems.
- Ability to work independently and as part of a team.

Microsoft Office

United States

£40K - £50K / year

Data Engineer

Auerbach Grayson

Data Engineer | 51 days ago

Full Time | Remote

Role Description

We're looking for a Data Engineer with a strong foundation in data pipelines and a meaningful edge in AI-native data infrastructure, specifically RAG pipelines, vector search, embedding workflows, and semantic retrieval systems. You'll work on two interconnected problem sets:

- Consolidating eight legacy systems into a unified, reliable data platform: ETL pipelines, a data warehouse, and cross-system client identity resolution.
- Transforming three decades of institutional research into an intelligent, searchable, interactable knowledge layer that clients can query in ways that weren't possible two years ago.

This is a small, senior team. You'll work directly with the CTO, have real architectural ownership, and build systems that are in production.

Qualifications

- Strong foundation in data pipelines.
- Experience with AI-native data infrastructure.
- Familiarity with RAG pipelines, vector search, embedding workflows, and semantic retrieval systems.

Requirements

- Lead the data engineering work for our research portal migration: extracting, transforming, and loading data from legacy systems into modern cloud infrastructure.
- Build and maintain ETL/ELT pipelines across multiple integration points: CRM, research distribution platforms, trading systems, and third-party APIs.
- Design and implement our “Golden Record” initiative: cross-system client identity resolution across eight legacy databases with no unified identifiers.
- Implement event-driven data flows using AWS EventBridge as the central routing layer, treating each source system as a swappable adapter.
- Design and build production-grade RAG (Retrieval-Augmented Generation) pipelines over AGCO's research archive: ingestion, chunking strategy, embedding generation, vector storage, and retrieval.
- Implement hybrid search approaches that combine semantic (vector) search with keyword and metadata filtering, appropriate for structured financial research queries.
- Build and maintain embedding pipelines that keep the vector store current as new research is published, with full observability and freshness guarantees.
- Evaluate and implement emerging retrieval strategies as the space evolves:
  - Re-ranking with cross-encoders
  - Hypothetical Document Embeddings (HyDE)
  - Query expansion and decomposition
  - Graph-based retrieval (e.g., GraphRAG) for analyst relationship mapping
  - Structured metadata retrieval for faceted financial queries
- Wire retrieval layers into LLM interfaces for research summarization, analyst Q&A, and recommendation-change tracking across the archive.
- Apply DataOps practices across all pipelines: version control, CI/CD, environment parity across dev/staging/production, and infrastructure as code.
- Monitor pipeline health, embedding freshness, retrieval quality, and LLM call latency, and build alerting that catches problems before users do.
- Work within our AWS environment (App Runner, EventBridge, CDK) and contribute to IaC best practices.
- Partner with the CTO, product team, and application developers to translate business requirements into sound data and retrieval architecture decisions.
- Document data flows, schema designs, chunking strategies, and retrieval logic so systems are maintainable and not a black box.
- Contribute to evaluation frameworks for retrieval quality (precision, recall, answer faithfulness) so we know when the system is actually working.

AI ETL Data Engineering CRM AWS Observability/Monitoring LLM CI/CD Infrastructure as Code

United States

Job Closed

Lead Data Engineer

Capgemini

Get the Future You Want

Data Engineer | 51 days ago

Full Time | Remote | Team 10,001+ | Since 1967 | H1B Sponsor

At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities, where you can make a difference and where no two days are the same.

Job Description

Your Role:

- Plan, manage, and execute releases across all environments, from development to production.
- Identify technical and functional interdependencies between various tracks.
- Coordinate with all development pods and stakeholders across programs.
- Review application code and documentation, and perform pre/post deployment tasks.
- Define release process for new data engineering applications.
- Communicate change management updates to all required stakeholders.
- Execute release packages, step-by-step deployment guides, and command line instructions on production environment.
- Support project timelines, clearly setting expectations, and realigning expectations internally as priorities change.
- Develop a sound understanding of client’s needs: data conversion and migration requirements, environment management and build processes, deployment planning and execution, solution designs, systems integrations, technical architectures, and infrastructure architectures.

Your Profile:

- Execute deployment scripts and step-by-step instructions in test and production environments.
- Comfortable with Command Line Interfaces.
- Understanding of CI/CD, Git branching, packaging and DevOps pipelines.
- Working knowledge of data warehousing and reporting technologies: Azure, Databricks, Unity Catalog, Python, Spark, PySpark, Airflow, MicroStrategy, Tableau.

Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.

R Data Engineering CI/CD Git Azure Databricks Unity Python Apache Spark PySpark Airflow Tableau AI

Colombia
