Job Closed
This listing is no longer active.
Market Research for the Disruptive Economy.
Data Engineering Lead
Location
New York
Posted
57 days ago
Salary
$190K - $230K / year
Seniority
Senior
Job Description
YipitData
- Own the design, build, and optimization of end-to-end data pipelines that power our vendor universe.
- Establish and enforce best practices in data modeling, orchestration, and system reliability.
- Collaborate with product, engineering, and business stakeholders to translate requirements into robust, scalable data solutions.
- Work extensively with Databricks and Airflow for large-scale data processing and orchestration.
- Troubleshoot and resolve complex pipeline issues to ensure reliability and performance.
- Contribute to the team’s technical strategy, helping drive improvements in scalability, performance, and efficiency.
- Lead, mentor, and support engineers through challenges, code reviews, and project execution.
Job Requirements
- 6+ years of professional experience in Data Engineering or equivalent technical roles (e.g., data architecture, big data development, or ETL engineering).
- 2+ years of managerial experience, including mentoring, team leadership, and supporting delivery.
- Strong expertise in SQL and distributed data systems.
- Proficiency with PySpark and Databricks for processing and scaling large datasets.
- Hands-on experience with Airflow for pipeline orchestration (Dagster/dbt a plus).
- Proven track record of delivering in fast-paced, deadline-driven environments with minimal oversight.
- Strong problem-solving skills and ability to translate business needs into scalable technical solutions.
- Excellent communication and collaboration skills with both technical and non-technical stakeholders.
Benefits
- We care about your personal life, and we mean it. We offer flexible work hours, flexible vacation, a generous 401K match, parental leave, team events, wellness budget, learning reimbursement, and more!
- Your growth at YipitData is determined by the impact that you are making, not by tenure, unnecessary facetime, or office politics. Everyone at YipitData is empowered to learn, self-improve, and master their skills in an environment focused on ownership, respect, and trust. See more on our high-impact, high-opportunity work environment above!
More Data Engineer Jobs
- Empower businesses to make better decisions using the financial planning & decision-making platform.
- Scale and achieve targets predictably.
- Collaborate with teams to provide an environment that explores new ideas and learns from customers.
- Build and monitor large-scale data pipelines that ingest data from a variety of sources.
- Develop and scale our DBT setup for transforming data.
- Work with our data platform team to solve customer problems.
- Use your advanced SQL & big data skills to craft cutting edge data solutions.
Senior Data Engineer, Auction Intelligence
Spotify
Passionate music fans. Innovative tech pros. Perfect harmony. Join our band.
- Design, build, and maintain scalable batch and streaming data pipelines using tools like Scio and GCP (BigQuery, Dataflow, GCS)
- Develop high-throughput, low-latency pipelines that support use cases such as budget pacing, campaign performance prediction, and auction bidding
- Partner closely with machine learning engineers and senior engineers to build data foundations for ML-driven products
- Improve data quality, reliability, and observability across pipelines to support critical business decisions
- Contribute to architecture and design decisions, balancing long-term scalability with practical delivery
- Write clean, maintainable, and well-tested code aligned with Spotify’s engineering practices
- Collaborate across functions with product managers, engineers, and stakeholders to deliver impactful solutions
Company Overview
Pantheon Data (a Kenific Holding company) is a private, small business based in the Washington, DC, area. Pantheon Data was founded in 2011, initially providing acquisition and supply chain management services to the US Coast Guard. Our service offerings have grown in the past ten years, including infrastructure resiliency, contact center operations, information technology, software engineering, program management, strategic communications, engineering, and cybersecurity. We have also grown our customer base to include commercial clients. The company has used this experience to expand our service offerings to other agencies within the Department of Homeland Security (DHS), the Department of Defense (DoD), and other Federal Civilian Agencies.
Position Overview
We are seeking a Senior Data Engineer to design, build, and optimize the data foundations for our next-generation Generative AI applications. This role is focused on architecting the Data Enrichment and Vectorization pipelines that power Large Language Models (LLMs). You will be responsible for the end-to-end lifecycle of data, from ingestion in AWS to serving high-context, enriched datasets to AWS Bedrock.
Responsibilities
- LLM Data Pipelines: Design and implement scalable data ingestion and transformation pipelines specifically for RAG (Retrieval-Augmented Generation) architectures.
- AWS Bedrock Integration: Operationalize LLM workflows using AWS Bedrock, managing model invocations and embedding generation.
- Data Enrichment & Quality: Develop advanced Python-based processing jobs to clean and enrich unstructured data with metadata to improve LLM retrieval accuracy.
- Vector Database Management: Architect and maintain vector stores (e.g., OpenSearch Serverless or PostgreSQL pgvector) to support efficient semantic search.
- Cloud Architecture: Leverage core AWS services (S3, Glue, Lambda, Step Functions) to build resilient, automated data workflows.
- DevSecOps Collaboration: Work with the security team to ensure all data handling meets stringent compliance standards (e.g., FedRAMP/DISA STIGs) through Infrastructure as Code.
Required Skills and Experience
- Python Mastery: Expert-level Python programming with experience in libraries such as Pandas and LLM orchestration frameworks like LangChain or LlamaIndex.
- AWS AI/ML Ecosystem: Hands-on experience with AWS Bedrock and Amazon SageMaker.
- Data Engineering Foundations: Proven track record with AWS Glue (ETL), Athena, and Redshift.
- Certifications: Must hold a recognized Data Science Certification (e.g., AWS Certified Data Engineer, Databricks Certified Data Scientist).
- Database Expertise: Proficiency in both SQL and NoSQL, with specific experience in Vector Databases.
- Ability to work effectively remotely in cross-functional teams.
- Ability to meet deadlines and produce quality work.
- Proficient in Microsoft Suite software including Outlook, Word, Excel, SharePoint, and PowerPoint.
Preferred Skills and Experience
- Bachelor's Degree
Clearance Requirements
U.S. Citizenship with the ability to obtain and maintain a DoD Secret clearance.
Work Location: United States - Remote
- Our company prioritizes the benefits of flexibility and collaboration, whether that happens in person or remotely.
- If the position is remote or hybrid, you may periodically work from a Pantheon Data office location or client site.
- If this position is assigned to a Pantheon Data office location or client site, you'll work with colleagues and clients in person, as needed for specific client requirements.
Compensation
The salary range for this position is $140,000 - $175,000. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location, and possibly contractual requirements, and could fall outside of this range.
Benefits Overview
We are always looking for good people! Pantheon Data is committed to providing its employees with competitive salaries and benefits in order to increase employee satisfaction and productivity. In addition to our benefits, we also offer SmartBenefits through the Washington Metropolitan Area Transit Authority, where you specify an amount of your pre-tax wages to be paid directly to your SmarTrip account. In some cases, tuition assistance may be available to employees for continuing education expenses and certifications related to their position. Additional details may be found at https://pantheon-data.com/careers/
Pantheon Data Important Information
All qualified applicants will be considered for employment without regard to disability, status as a protected veteran, or any other status protected by applicable federal, state, local, or international law.
As part of the application process, you are expected to be on camera during interviews and assessments. We reserve the right to take your picture to verify your identity and prevent fraud.
If you require reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please direct your inquiries to our Talent Team at Recruiting@pantheon-data.com or by phone at (571) 363-4020.
This company uses E-Verify to confirm each employee's work authorization. For more information, see the E-Verify Participation Poster.


