Guidehouse

Solving big problems, building trust in society, and empowering our clients to shape the future.

Data Infrastructure Engineer

Data Engineer | Full Time | Remote | Entry Level | Team 10,001+ | Since 2018 | H1B Sponsor

Location

Worldwide

Posted

7 days ago

Salary

$65K - $108K / year

Seniority

Entry Level

Job Description

Title: Data Infrastructure Engineer
Location: US - Remote (Any location)

Job Description:
Job Family: Data Science & Analysis
Travel Required: None
Clearance Required: Ability to Obtain Public Trust

We are seeking a Data Infrastructure Engineer to build and operate the data platform that powers AI/ML analytics modules. You will design and implement scalable data ingestion pipelines, robust ETL/ELT, and a modern data lake / delta lake (lakehouse) on AWS. You’ll also establish a managed metadata repository and governance layers (catalog, lineage, quality, access controls) and deliver automated cloud provisioning plus CI/CD for data pipelines to enable reliable, repeatable deployments across environments. This role is ideal for an engineer who enjoys platform building, automation, and enabling advanced analytics through trusted, well-governed data.

What You Will Do:

Build & Operate Data Pipelines (Batch + Streaming)

- Design and implement batch and streaming ingestion from APIs, relational databases, file drops, event streams, and external partners.
- Build and optimize ETL/ELT pipelines to produce curated, analytics-ready datasets for reporting and ML consumption.
- Implement incremental processing patterns, change data capture (CDC) approaches where appropriate, and data contract standards.

Deliver a Modern Lakehouse (Data Lake / Delta Lake)

- Build and manage a scalable lakehouse on AWS object storage (e.g., S3) using open table/file formats and delta/lakehouse concepts (e.g., ACID tables, schema evolution, time travel patterns).
- Optimize performance and cost through partitioning, compaction, lifecycle policies, and efficient compute/storage usage.
- Establish environment standards for dev/test/prod and consistent promotion across stages.

Metadata, Governance, Lineage & Quality (Trust Layer)

- Implement a managed metadata repository for dataset cataloging, ownership, glossary/definitions, tagging, and discoverability.
- Enable end-to-end lineage (source → transformations → consumption) to support auditability and impact analysis.
- Implement governance controls including policy-based access, data classification, retention, and secure data handling.
- Build operational data quality checks (freshness, completeness, validity, anomaly detection) and publish SLAs/SLOs.

AWS Automation + CI/CD for Data Pipelines

- Implement automated cloud provisioning in AWS using Infrastructure as Code (IaC) for consistent environments and secure-by-default baselines.
- Build and enhance CI/CD for data pipelines, including automated tests, validation gates, promotion workflows, and rollback strategies.
- Improve observability with metrics/logs/alerts, dashboards, runbooks, and incident response readiness.

Cross-Team Collaboration & Documentation

- Work closely with engineering, security, networking, and application teams to support mission needs and delivery timelines.
- Maintain high-quality engineering documentation including SOPs, system diagrams, and secure configuration baselines.
- Summarize and present findings and recommendations, both written and verbal, to technical and non-technical stakeholders.

What You Will Need:

- Bachelor’s degree in Engineering, IT, Computer Science, or related field (or equivalent experience).
- Zero (0) to two (2) years of experience.
- Experience building production data pipelines and/or data platforms.
- Knowledge of implementing data ingestion and ETL/ELT workflows, including data modeling and transformation best practices.
- Knowledge of building a data lake / delta lake (lakehouse) on AWS (or equivalent cloud) using object storage and modern table formats/patterns.
- Proficiency in SQL and one programming language commonly used for data engineering (Python preferred; Scala/Java acceptable).
- Knowledge of metadata management and governance: cataloging, lineage, ownership, access controls, classification, and policy enforcement.
- Knowledge of implementing automated AWS provisioning using IaC and operating across multiple environments.
- Proven experience developing RAG applications.
- Solid security fundamentals: IAM/least privilege, encryption, secrets management, secure SDLC practices.
- Must be able to OBTAIN and MAINTAIN a Federal or DoD "PUBLIC TRUST"; candidates must obtain approved adjudication of their PUBLIC TRUST prior to onboarding with Guidehouse. Candidates with an ACTIVE PUBLIC TRUST or SUITABILITY are preferred.

What Would Be Nice To Have:

- Hands-on experience with Databricks.
- Experience operating CI/CD pipelines for data workflows (testing, packaging, deployment automation, environment promotion).
- Hands-on experience with modern DevOps practices, including tools like Git, Terraform, Jenkins, AWS CodePipeline, and Docker.
- Experience using AI-assisted coding tools (e.g., GitHub Copilot, ChatGPT, Cursor, Kiro) to safely accelerate implementation while maintaining strict code quality through testing, code reviews, and security practices.
- Knowledge graph and Graph RAG experience, including:
  - Graph modeling and ontology/taxonomy alignment
  - Entity resolution and relationship extraction
  - Hybrid retrieval approaches combining graph traversal with semantic/vector search to improve grounding and explainability

The annual salary range for this position is $65,000.00-$108,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.

What We Offer:

Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.
Benefits include:

- Medical, Rx, Dental & Vision Insurance
- Personal and Family Sick Time & Company Paid Holidays
- Parental Leave
- 401(k) Retirement Plan
- Group Term Life and Travel Assistance
- Voluntary Life and AD&D Insurance
- Health Savings Account, Health Care & Dependent Care Flexible Spending Accounts
- Transit and Parking Commuter Benefits
- Short-Term & Long-Term Disability
- Tuition Reimbursement, Personal Development, Certifications & Learning Opportunities
- Employee Referral Program
- Corporate Sponsored Events & Community Outreach
- Care.com annual membership
- Employee Assistance Program
- Supplemental Benefits via Corestream (Critical Care, Hospital Indemnity, Accident Insurance, Legal Assistance and ID theft protection, etc.)
- Position may be eligible for a discretionary variable incentive bonus

About Guidehouse

Guidehouse is an Equal Opportunity Employer–Protected Veterans, Individuals with Disabilities or any other basis protected by law, ordinance, or regulation.

Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance including the Fair Chance Ordinance of Los Angeles and San Francisco.

If you have visited our website for information about employment opportunities, or to apply for a position, and you require an accommodation, please contact Guidehouse Recruiting at 1-571-633-1711 or via email at RecruitingAccommodation@guidehouse.com. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodation.

All communication regarding recruitment for a Guidehouse position will be sent from Guidehouse email domains including @guidehouse.com or guidehouse@myworkday.com. Correspondence received by an applicant from any other domain should be considered unauthorized and will not be honored by Guidehouse.
Note that Guidehouse will never charge a fee or require a money transfer at any stage of the recruitment process and does not collect fees from educational institutions for participation in a recruitment event. Never provide your banking information to a third party purporting to need that information to proceed in the hiring process.

If any person or organization demands money related to a job opportunity with Guidehouse, please report the matter to Guidehouse’s Ethics Hotline. If you want to check the validity of correspondence you have received, please contact recruiting@guidehouse.com. Guidehouse is not responsible for losses incurred (monetary or otherwise) from an applicant’s dealings with unauthorized third parties.

Guidehouse does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Guidehouse and Guidehouse will not be obligated to pay a placement fee.

More Data Engineer Jobs

Senior Data Engineer – Data Governance, English

Capco

Capco, a Wipro company, is a management & technology consultancy dedicated to the financial services & energy industries

Data Engineer | 7 days ago
Full Time | Remote | Team 1,001-5,000 | Since 1998 | H1B Sponsor

• Prepare, organize and make data available for analytical, operational and governance use.
• Design, develop and maintain scalable data pipelines for extraction, integration, consolidation, transformation and processing of data from multiple sources.
• Create interfaces, flows and mechanisms that ensure secure, efficient and reliable access to information.
• Implement and administer solutions using Informatica Cloud Data Governance and Catalog (CDGC).
• Work together with business and technology teams to define and implement data governance policies, standards and best practices.
• Integrate governance and data catalog solutions with enterprise platforms and cloud environments.
• Support initiatives related to Data Quality, Master Data Management (MDM) and Reference Data Management (RDM).
• Ensure that data assets are reliable, traceable, accessible and compliant with organizational standards.

Brazil

Data Engineer Contract
Auckland
6 - 20 Years of Experience

Data Engineer – 4 Month Contract (Remote)

We’re looking for a high-caliber Data Engineer to join our client on a short-term contract, delivering impactful data solutions in a modern cloud environment. This is a great opportunity for someone who thrives in autonomy, can hit the ground running, and enjoys working closely with both technical teams and business stakeholders.

About the Role

You’ll be working under the direction of the Head of Data Engineering to design and deliver data products that drive real business value. This role requires a proactive, self-sufficient individual who is comfortable operating with minimal supervision while maintaining strong communication across the business.

Key Responsibilities

- Develop and deliver high-quality data products aligned to business needs
- Work with the client’s standard tooling and patterns, including SQL, DBT, and GCP
- Collaborate with business stakeholders to understand requirements and translate them into technical solutions
- Produce clear and useful technical documentation
- Ensure best practices in data engineering, performance, and maintainability

Key Requirements

- Proven experience as a Data Engineer in modern data environments
- Strong SQL development skills
- Hands-on experience with DBT (Data Build Tool)
- Experience working within Google Cloud Platform (GCP) – essential
- Ability to work independently with minimal supervision
- Strong communication skills and stakeholder engagement experience
- Professional, personable, and delivery-focused

What’s on Offer

- Fully remote working
- Short-term, high-impact contract
- Opportunity to work with modern data tooling and architecture
- Collaborative and forward-thinking environment

All locations: AUK | New Zealand

Staff Data Engineer

Gugu Robotics

The Future is Now; Beyond Boundaries, Beyond Imagination

Data Engineer | 7 days ago
Full Time | Remote | Team 51-200 | Since 2016 | H1B No Sponsor

• Define data architecture and platform strategy, leading design across pipelines, warehouses, and data lakes
• Build and optimize scalable data pipelines supporting batch and real-time processing
• Define and enforce data governance, quality standards, and compliance frameworks across the platform
• Build monitoring, logging, and alerting for data pipelines and services, and contribute to CI/CD workflows for data deployment and automation
• Drive data platform modernization, optimizing for performance, cost, and scalability
• Bring an AI-forward mindset to your daily work, using tools like Claude, Cursor, and other modern AI assistants to ship higher-quality work at pace
• Design and implement data contracts and event flows in collaboration with backend, platform, and engineering teams
• Lead the design and implementation of data pipelines for production AI/ML systems, including embeddings, vector stores, RAG data preparation, feature stores, and training/inference data flows
• Integrate data services with APIs, middleware, and third-party systems to support end-to-end data consumption
• Partner with leadership on data strategy, translating technical depth into decisions others can act on
• Collaborate closely with engineering, analytics, AI, and product teams to align data platforms with broader goals
• Advocate for data quality, governance, and platform best practices across teams and engagements
• Establish data engineering standards that lift the quality and consistency of work across the team
• Mentor junior and mid-level engineers, helping them grow their craft, confidence, and impact
• Make high-stakes architectural decisions with clear ownership and consideration of long-term tradeoffs

Colombia

Director of Data Engineering, Healthcare

Wellth

Better Outcomes using Behavioral Economics

Data Engineer | 7 days ago
Full Time | Remote | Team 51-200 | Since 2014 | H1B No Sponsor

• Leverage your data and healthcare experience to create and communicate the vision and roadmap of Wellth’s data ingestion, warehousing, transformation, analytics, and reporting systems.
• Contribute to the design and implementation of Wellth’s data product capabilities (e.g., outcomes reporting, personalization, etc.)
• Bring to life our next generation of attributed outcomes reporting that drives customer expansion – these improved member health outcomes are what we sell!
• Manage and grow a high-caliber team of data and analytics engineers.
• Deliver the vision by defining and driving the execution of data projects that deliver better health outcomes for members and clients.
• Collaborate with product team members, technical leaders, and senior executives to understand and serve internal and external stakeholders.
• Continuously maintain and improve the data quality bar throughout the data life cycle. You have experience working with data quality challenges that emerge from partner and third-party data.
• Protect the privacy of Wellth’s data, its customers, and their patients by following secure SDLC guidelines to maintain our HITRUST certification.

United States
$190K - $230K / year