Job Closed
This listing is no longer active.
Leidos is an innovation company rapidly addressing the world’s most vexing challenges in national security and health.
Unstructured Data Engineer
Location
United States
Posted
100 days ago
Salary
$107.9K - $195.1K / year
Seniority
Lead
Job Description
- Design, build, and manage end-to-end RAG pipelines for enterprise AI applications.
- Lead preprocessing of unstructured data, including discovery, classification, cleansing, redaction, and metadata enrichment.
- Develop and optimize document chunking, embedding, and vectorization strategies for structured and unstructured datasets.
- Coordinate ingestion of curated datasets into vector databases and AI platforms.
- Package curated unstructured datasets as governed, reusable data products for enterprise consumption.
- Define and implement metadata tagging strategies to align with Collibra governance standards.
- Partner with Data Governance and Data Quality teams to ensure AI-ready data meets enterprise standards for lineage, classification, and compliance.
- Evaluate and optimize embedding models, retrieval strategies, and indexing performance.
- Monitor and tune RAG pipeline performance, including latency, retrieval accuracy, and cost efficiency.
- Implement automation for document ingestion, transformation, and publishing workflows.
- Support integration with enterprise AI platforms (e.g., ChatGPT Enterprise, AskSage, Moveworks).
- Conduct cost analysis and capacity planning for vector storage and processing workloads.
- Provide technical guidance on AI data readiness and unstructured data lifecycle management.
- Design, implement, and optimize enterprise-grade RAG and prompt engineering frameworks, including context engineering strategies (chunking, metadata enrichment, semantic filtering, dynamic context management) to improve retrieval accuracy, grounding, and response quality.
- Develop and maintain scalable multi-modal data pipelines that ingest, preprocess, embed, and integrate text, documents, images, audio, and structured data into governed vectorized data products consumable by enterprise AI platforms.
Job Requirements
- Bachelor’s degree in Computer Science, Data Engineering, AI/ML, or related field and 8+ years of relevant experience.
- Hands-on experience designing and implementing RAG architectures in production environments.
- Experience working with unstructured data (PDFs, documents, email, transcripts, images with OCR, etc.).
- Strong proficiency in Python and experience with NLP/LLM frameworks (e.g., LangChain, LlamaIndex, Hugging Face, OpenAI APIs).
- Experience with vector databases (e.g., Pinecone, Weaviate, FAISS, OpenSearch, Azure AI Search).
- Experience implementing document chunking, embedding generation, and similarity search.
- Understanding of metadata modeling and governance principles.
- Experience building scalable data pipelines in cloud environments (AWS, Azure, or GCP).
- Hands-on experience with prompt engineering, evaluation metrics, and context window optimization.
- Strong understanding of multi-modal data processing and pipeline engineering.
- Strong knowledge of API integration and microservices architecture.
- US Citizenship is required.
Benefits
- Competitive compensation
- Health and Wellness programs
- Income Protection
- Paid Leave
- Retirement
More Data Engineer Jobs
- Supporting clients in big data engineering and analytics in cloud environments, building solutions from concept through deployment and the other phases of the SDLC/DDLC process.
- Defining Data Intelligence solutions and managing their implementation from a technical and methodological standpoint.
- Supporting our clients' business goals by developing and implementing analytical solutions.
- Building Proofs of Concept (PoC) and overseeing Microsoft architecture for our clients.
- Sharing knowledge and acting as a trainer for Azure Data Intelligence, and offering support to employees on BI and data analytics topics.
- Collaborating with other business departments to deliver useful analytical solutions.
- Proposing new ways to use data, implementing modern analytical solutions, and ensuring the highest quality of analyses and reports that help shape strategic business decisions.
- Design, develop, and maintain optimal data pipelines and workflows.
- Provide technical guidance on architecture design and collaborate with Data Engineers for implementation.
- Work with stakeholders to assist with data-related technical issues and support their data infrastructure needs.
- Perform data integration, transformation, and cleaning to create insightful and actionable data for the business.
- Translate business requirements into technical specifications and finished products.
- Participate in code reviews and contribute to team knowledge sharing.
- Lead the development and maintenance of documentation for data engineering processes and systems.
- Create ad-hoc reports requested by internal and external partners.
- Work as a team to support and troubleshoot errors for 100+ data pipelines.
- Work closely with the application development team to update data and data structures.
- Provide analysis, recommendations, and feedback to business process owners, leadership team, and the Information Technology department.
- Propose automated solutions to repeated development tasks.
Senior Data Engineer, 1
People Inc. Publishing
People Inc. is a major American digital and print media company founded in 1997 as The Mining Company, later renamed About Inc., Dotdash, Dotdash Meredith, and now rebranded as Peo
Job Title
Senior Data Engineer, 1

Job Description
About The Position | Major goals and objectives and location requirements

About The Position's Contributions (Weight %: Accountabilities, Actions and Expected Measurable Results)
- 60%: You will enhance our systems by building new data integration pipelines and adding new data to our data lakes and warehouses while continuously optimizing them. You will work with internal team members as well as stakeholders to scope out business requirements and see data deliverables through to the end where they will be used via our Looker platform. You will continuously look for ways to improve our data transformations and data consumption processes so that our systems are running efficiently, and our customers are able to use and analyze our data quickly and effectively.
- 40%: You will champion coding standards and best practices by actively participating in code reviews, and working to improve our internal tools and build process. You will work to ensure the security and stability of our infrastructure in a multi-cloud environment. You will collaborate with our Analytics engineers to ensure data integrity and the quality of our data deliverables.

The Role’s Minimum Qualifications and Job Requirements
Education: Degree in a quantitative field, such as computer science, statistics, mathematics, engineering, data science, or equivalent experience.
Experience:
- A minimum of 5 years of experience in building and optimizing data pipelines with Python.
- You have experience writing complex SQL queries to analyze data. Window functions and nested subqueries are second nature to you.
- You have commendable experience with at least one cloud service platform (GCP and AWS preferred).
- You've worked with data at scale using Apache Spark, Beam or a similar framework.
- You're familiar with data streaming architectures using technologies like Pub/Sub and Apache Kafka.
- You are eager to learn about new tech stacks, big data technologies, data pipelining architectures, etc. and propose your findings to the team to try and optimize our systems.
Specific Knowledge, Skills, Certifications and Abilities: Strong Python and SQL skills. Experience with Google Cloud Platform is a plus.

It is the policy of People Inc. to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, the Company will provide reasonable accommodations for qualified individuals with disabilities. Accommodation requests can be made by emailing hr@people.inc.

The Company participates in the federal E-Verify program to confirm the identity and employment authorization of all newly hired employees. For further information about the E-Verify program, please click here: https://www.e-verify.gov/employees

Pay Range
Salary: $140,000 - $170,000
The pay range above represents the anticipated low and high end of the pay range for this position and may change in the future. Actual pay may vary and may be above or below the range based on various factors including but not limited to work location, experience, and performance. The range listed is just one component of People Inc's total compensation package for employees. Other compensation may include annual bonuses, and short- and long-term incentives.

In addition, People Inc. provides to employees (and their eligible family members) a variety of benefits, including medical, dental, vision, prescription drug coverage, unlimited paid time off (PTO), adoption or surrogate assistance, donation matching, tuition reimbursement, basic life insurance, basic accidental death & dismemberment, supplemental life insurance, supplemental accident insurance, commuter benefits, short term and long term disability, health savings and flexible spending accounts, family care benefits, a generous 401K savings plan with a company match program, 10-12 paid holidays annually, and generous paid parental leave (birthing and non-birthing parents), all of which may vary depending on the specific nature of your employment with People Inc. and your work location. We also offer voluntary benefits such as pet insurance, accident, critical and hospital indemnity health insurance coverage, life and disability insurance.
Staff Data Engineer, Compliance Engineering & Technology
Block
Block builds simple, powerful tools that make progress towards an economy that’s truly open to all.
Block is one company built from many blocks, all united by the same purpose of economic empowerment. The blocks that form our foundational teams - People, Finance, Counsel, Hardware, Information Security, Platform Infrastructure Engineering, and more - provide support and guidance at the corporate level. They work across business groups and around the globe, spanning time zones and disciplines to develop inclusive People policies, forecast finances, give legal counsel, safeguard systems, nurture new initiatives, and more. Every challenge creates possibilities, and we need different perspectives to see them all. Bring yours to Block.

The Role
The Compliance Engineering & Technology (CET) team at Block supports the detection and reporting of suspicious financial crimes activity across Cash App, Square, and Afterpay. We work globally with partners in business, engineering, counsel, and product to provide a safe user experience for our customers while minimizing and potentially eliminating bad activity on our platform. You will report to the CET - Screening Engineering Manager, but work predominately alongside the CET - Data Engineering team. As a Data Engineer you will handle everything from data architecture and modeling to data pipeline tooling and dashboarding. You will enable other compliance teams to make impactful business decisions by laying the foundation of our large and unique datasets that span across multiple products. As a staff engineer, you will be helping bring the organization into a new level of consistency, helping create and evangelize best practices and standards for the wider organization.

You Will
- Stay up to date on the latest data engineering best practices, decide which are most applicable for our use cases, and guide and teach the data engineering team the relevant tools.
- Create scalable patterns and solutions that help our team design, develop, and manage scalable ETL pipelines to unblock new product launches.
- Lead the creation and optimization of existing data models and schemas on top of Block data including but not limited to eventing, customer level, and process level data.
- Build monitoring to assess the health of the team's infrastructure as well as data quality and lineage.
- Participate in the data engineering team's on-call rotation: monitor daily execution, diagnose and log issues, and fix business critical pipelines to ensure SLAs are met with internal stakeholders.
- Work with non-technical partners and product teams to understand their needs, translate business requirements into applicable data requirements, and come up with automated end-to-end solutions.



