Job Closed
This listing is no longer active.
Senior Data Engineer
Location
United States
Posted
96 days ago
Salary
0
Seniority
Senior
Job Description
Senior Data Engineer
Spear AI
- Implement real-time data pipelines with MQTT and Redpanda for stream processing.
- Implement offline data pipelines using Dagster for batch processing.
- Parse and process binary message formats from various data sources.
- Build data warehouses using Postgres, Apache Iceberg, Parquet, and S3.
- Design data models that allow for high-performance queries.
- Validate and normalize data sources.
- Improve local development and CI/CD using modern tooling and GitHub Actions.
Job Requirements
- Expertise in time-series data processing and analysis (windowing, resampling, interpolation, etc.)
- Experience with binary message parsing.
- Proficiency in Python for data engineering workflows.
- Experience with row-based and columnar data formats.
- Experience with OLTP and OLAP databases.
- Knowledge of distributed systems, streaming architectures, and batch processing patterns.
- Hands-on experience with a batch orchestrator such as Dagster/Airflow.
- Hands-on experience with a streaming platform such as Redpanda/Kafka.
- Hands-on experience with binary message formats such as Protobuf.
- Nice To Have: Experience with IoT devices and sensors.
- Nice To Have: Digital signal processing experience.
- Nice To Have: Geospatial analysis and GIS experience.
- Nice To Have: Familiarity with working in monorepos.
Benefits
- Unlimited PTO — Take the time you need to recharge and maintain work-life balance.
- Dedicated Sick Time — Your health and well-being come first.
- Comprehensive Health & Benefits – Medical, dental, and vision coverage to keep you and your family protected.
- 11 Paid Holidays — Enjoy time off throughout the year to celebrate and spend with loved ones.
- Professional Development — Educational opportunities and resources to help you grow your skills and advance your career.
- Collaborative Environment — Work directly with leadership in our flat organizational structure, where your ideas and contributions matter.
- Mission-Driven Work — Contribute to projects that directly support national security and make a real-world impact.
- Growth Opportunities — Join us during an exciting expansion phase where you can help shape our future.
- 401(k) with company match.
- Onsite, remote, hybrid, or flexible work arrangements (position dependent).
- Relocation assistance (position dependent).
- Referral bonuses.
- Performance bonuses.
- Life insurance and disability coverage.
- Technology stipend for home office setup.
- Professional certification reimbursement (position dependent).



