Job Closed
This listing is no longer active.
Persistent Orbital Intelligence 📡 🛰️
Data Engineer
Location
United States
Posted
51 days ago
Salary
0
Seniority
Junior
Job Description
Data Engineer
LeoLabs
- Play a key role in building and operating data pipelines and analytics infrastructure
- Work closely with software engineers, radar and catalog teams, and data scientists
- Ensure reliable extraction, transformation, and loading (ETL) of mission-critical datasets
- Develop scalable batch and streaming data workflows
- Enable advanced analytics and support machine learning initiatives
- Help transform large volumes of sensor and orbital data into actionable intelligence
- Engage in hands-on development with opportunities to grow into increased ownership of data platform design and optimization
Job Requirements
- B.S. or M.S. in Computer Science, Data Science, Engineering, Mathematics, Physics, or equivalent experience
- 0-2 years of experience in data engineering, software engineering, analytics engineering, or related technical roles
- Experience designing and building data pipelines or ETL/ELT workflows
- Hands-on experience with Databricks, Apache Spark, or distributed data processing frameworks
- Proficiency in Python and SQL for data transformation and analysis
- Familiarity with data modeling concepts and modern data lake or warehouse architectures
- Experience working in cloud-native environments (AWS preferred)
- Understanding of software development best practices including version control, testing, and CI/CD
- Strong analytical mindset and ability to troubleshoot complex data issues
- Effective communication skills and ability to collaborate across distributed engineering teams
- Ability to participate in operational support rotations during critical incidents
- Experience supporting data science or machine learning workflows, including feature engineering pipelines
- Familiarity with Delta Lake, Lakehouse architectures, or large-scale telemetry data processing
- Exposure to streaming data systems such as Kafka or Spark Structured Streaming
- Experience with workflow orchestration tools such as Airflow or Databricks Workflows
- Background in orbital mechanics, aerospace, physics, or applied mathematics
- Experience building analytics datasets or semantic models for BI tools
- Active U.S. security clearance or ability to obtain one
Benefits
- Global workforce: flexible remote/hybrid opportunities
- Work on complex, meaningful missions with real-world impact
- Unlimited paid time off for most roles
- Competitive salary and equity packages
- Comprehensive health, dental, and vision coverage
- Access to the forefront of commercial space operations and defense innovation
More Data Engineer Jobs
Lead Metadata Specialist
McGraw Hill LLC.
The work you do at McGraw Hill will be work that matters. We are collectively designing content that will build the future of education. Play your part and experience a sense of fulfilment that will inspire you to even greater heights.
Role Description
We are seeking a Lead Metadata Specialist to guide metadata strategy and implementation across McGraw Hill’s Higher Education business unit. This role leads projects that advance the design, management, and use of educational metadata—especially around competencies, objectives, and assessment and other types of content—to improve discoverability, personalization, and analytics across products and platforms. The Lead Metadata Specialist will collaborate with curriculum, product, and technical teams to ensure alignment across metadata workflows and business goals. This role also provides mentorship and leadership for metadata specialists and coordinates with the Enterprise Metadata team to support enterprise-level initiatives. This is a remote position open to applicants authorized to work for any employer within the United States.
What You'll Do
- Lead the design and execution of metadata projects that enhance the creation, delivery, and management of higher education content while enabling robust personalized learning services.
- Partner with learning and data scientists to explore opportunities for metadata inference and enrichment to support adaptive learning and personalization.
- Develop and maintain learning ontologies and controlled vocabularies for higher education academic disciplines, content, and learning services.
- Collaborate with curriculum, product, and technical teams to define metadata strategies that align with business unit goals and emerging learning technology trends.
- Provide technical leadership in implementing scalable metadata workflows and quality assurance processes using standard tools and platforms.
- Lead and collaborate with Metadata Specialists, fostering consistency, capacity, and growth across metadata initiatives.
- Collaborate cross-functionally to develop documentation, training materials, and communications that promote understanding and effective use of metadata across design and product teams.
- Monitor external developments in metadata standards and best practices for higher education and integrate relevant frameworks to ensure alignment and innovation.
Qualifications
- 6+ years in education or educational technology, with at least 3 years direct experience managing educational metadata, learning objective frameworks, and/or competency structures in Higher Education.
- Master’s degree in education, learning sciences, information science, or a related field (required or equivalent experience).
- Advanced understanding of metadata standards and interoperability frameworks used in Higher Education.
- Experience with e-book, assessment, and other educational content design for Higher Education.
- Highly organized, self-motivated, able to manage multiple complex projects simultaneously.
- Strong leadership and mentoring capabilities.
- Growth mindset and openness to change, with a positive attitude and interest in improving over time.
- Excellent communication skills, able to bridge technical and non-technical audiences.
Preferred
- Experience working with AI tools and awareness of their potential for educational content and interactions.
- Experience designing and implementing educational recommendation systems.
- Proficiency with collaboration and project tools such as JIRA, Confluence, Teams, and Slack.
Benefits
- The pay range for this position is between $125,000 - $155,000 annually.
- Base pay offered may vary depending on job-related knowledge, skills, experience, and location.
- An annual bonus plan may be provided as part of the compensation package.
- A full range of medical and/or other benefits, depending on the position offered.
Role Description
Join Marion Counseling Services as a vital member of our team. This position offers an exciting opportunity to support our operations by accurately inputting and managing data essential for our counselling services.
- Enter and maintain data in various systems with a high degree of accuracy.
- Assist in the preparation of reports and documentation as required.
- Ensure confidentiality and security of sensitive information.
- Collaborate with team members to improve data management processes.
- Respond to inquiries regarding data entries and assist in troubleshooting issues.
Qualifications
- Proven experience in data entry or a related field.
- Strong attention to detail and accuracy.
- Proficiency in using data entry software and Microsoft Office Suite.
- Excellent organisational skills and ability to manage multiple tasks.
- Effective communication skills, both written and verbal.
Requirements
- Experience in the healthcare or counselling sector.
- Familiarity with data management systems.
- Ability to work independently and as part of a team.
Role Description
We're looking for a Data Engineer with a strong foundation in data pipelines and a meaningful edge in AI-native data infrastructure, specifically RAG pipelines, vector search, embedding workflows, and semantic retrieval systems. You'll work on two interconnected problem sets:
- Consolidating eight legacy systems into a unified, reliable data platform: ETL pipelines, a data warehouse, and cross-system client identity resolution.
- Transforming three decades of institutional research into an intelligent, searchable, interactable knowledge layer that clients can query in ways that weren't possible two years ago.
This is a small, senior team. You'll work directly with the CTO, have real architectural ownership, and build systems that are in production.
Qualifications
- Strong foundation in data pipelines.
- Experience with AI-native data infrastructure.
- Familiarity with RAG pipelines, vector search, embedding workflows, and semantic retrieval systems.
Requirements
- Lead the data engineering work for our research portal migration — extracting, transforming, and loading data from legacy systems into modern cloud infrastructure.
- Build and maintain ETL/ELT pipelines across multiple integration points: CRM, research distribution platforms, trading systems, and third-party APIs.
- Design and implement our “Golden Record” initiative — cross-system client identity resolution across eight legacy databases with no unified identifiers.
- Implement event-driven data flows using AWS EventBridge as the central routing layer, treating each source system as a swappable adapter.
- Design and build production-grade RAG (Retrieval-Augmented Generation) pipelines over AGCO's research archive — ingestion, chunking strategy, embedding generation, vector storage, and retrieval.
- Implement hybrid search approaches that combine semantic (vector) search with keyword and metadata filtering, appropriate for structured financial research queries.
- Build and maintain embedding pipelines that keep the vector store current as new research is published, with full observability and freshness guarantees.
- Evaluate and implement emerging retrieval strategies as the space evolves: re-ranking with cross-encoders; Hypothetical Document Embeddings (HyDE); query expansion and decomposition; graph-based retrieval (e.g., GraphRAG) for analyst relationship mapping; structured metadata retrieval for faceted financial queries.
- Wire retrieval layers into LLM interfaces for research summarization, analyst Q&A, and recommendation-change tracking across the archive.
- Apply DataOps practices across all pipelines: version control, CI/CD, environment parity across dev/staging/production, and infrastructure as code.
- Monitor pipeline health, embedding freshness, retrieval quality, and LLM call latency — build alerting that catches problems before users do.
- Work within our AWS environment (App Runner, EventBridge, CDK) and contribute to IaC best practices.
- Partner with the CTO, product team, and application developers to translate business requirements into sound data and retrieval architecture decisions.
- Document data flows, schema designs, chunking strategies, and retrieval logic so systems are maintainable and not a black box.
- Contribute to evaluation frameworks for retrieval quality — precision, recall, answer faithfulness — so we know when the system is actually working.
Company Description
At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.
Job Description
Your Role:
- Plan, manage, and execute releases across all environments - from development to production.
- Identify technical and functional interdependencies between various tracks.
- Coordinate with all development pods and stakeholders across programs.
- Review application code, documentation, and perform pre/post deployment tasks.
- Define release process for new data engineering applications.
- Change management communication to all required stakeholders.
- Execute release packages, step-by-step deployment guides, and command line instructions on production environment.
- Support project timelines, clearly setting expectations, and realigning expectations internally as priorities change.
- Develop a sound understanding of client’s needs: data conversion and migration requirements, environment management and build processes, deployment planning and execution, solution designs, systems integrations, technical architectures, and infrastructure architectures.
Your Profile:
- Execute deployment scripts and step-by-step instructions in test and production environments.
- Comfortable with Command Line Interfaces.
- Understanding of CI/CD, Git branching, packaging and DevOps pipelines.
- Working knowledge of data warehousing and reporting technologies: Azure, Databricks, Unity Catalog, Python, Spark, PySpark, Airflow, MicroStrategy, Tableau.
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.

