Job Closed

This listing is no longer active.

Nuvitek

Speed Up True Modernization

Data Engineer

Data Engineer | Full Time | Remote | Senior | Team 51-200 | Since 2012 | H1B No Sponsor

Location

United States

Posted

8 days ago

Salary

$115K - $125K / year

Seniority

Senior

Bachelor Degree | 4 yrs exp | English | Cloud | ETL | Python

Job Description

  • Design, develop, and maintain scalable RAG/CAG pipelines for AI-powered applications
  • Build and optimize document ingestion workflows for structured and unstructured data sources
  • Manage and maintain vector stores to support semantic search and retrieval capabilities
  • Develop OCR processing pipelines for historical and modern document collections spanning 1781–2025
  • Optimize retrieval performance, relevance tuning, and ranking strategies for LLM-based systems
  • Build reliable data pipelines that support integrations with large language models and AI services
  • Collaborate with engineers, UX teams, product owners, and stakeholders to deliver scalable AI solutions
  • Ensure data quality, integrity, security, and performance across ingestion and retrieval systems
  • Implement monitoring, logging, and troubleshooting for AI and data processing workflows
  • Contribute to architecture decisions, technical documentation, and engineering best practices
  • Participate in agile pod-based development teams and continuous improvement initiatives

Job Requirements

  • 4+ years of experience in data engineering, data platform development, or AI/ML infrastructure
  • Strong experience building RAG and/or CAG pipelines
  • Hands-on experience with vector databases and semantic retrieval systems
  • Experience developing document ingestion and OCR processing workflows
  • Strong understanding of LLM integrations and AI data pipeline architectures
  • Experience working with structured, semi-structured, and unstructured datasets
  • Proficiency with Python and modern data engineering frameworks
  • Familiarity with APIs, ETL/ELT pipelines, and distributed processing systems
  • Experience building and operating data pipelines in secure federal cloud environments, including FedRAMP Moderate and Zero Trust architectures, with appropriate handling of sensitive data and Controlled Unclassified Information (CUI)
  • Ability to obtain and maintain a federal Public Trust (or higher) clearance
  • Strong analytical, troubleshooting, and performance optimization skills
  • Ability to work effectively in agile or pod-based delivery environments
  • Excellent communication and collaboration skills

Benefits

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • Disability and Life Insurance
  • Parental Leave
  • 401K
  • Paid Time Off

More Data Engineer Jobs

Enterprise Data Warehouse ETL/Data Engineer

CSpring

Unlocking the power and potential of data.

Data Engineer | 8 days ago
Full Time | Remote | Team 51-200 | H1B Sponsor

  • Design and develop reusable, parameter-driven ingestion and transformation pipelines
  • Build and maintain medallion architecture solutions
  • Develop performant ELT workflows
  • Create and optimize PySpark notebooks and distributed processing jobs
  • Design dimensional data models
  • Implement data vault patterns
  • Optimize distributed SQL workloads
  • Implement CI/CD processes
  • Build monitoring, logging, and auditing solutions
  • Lead or contribute to cloud modernization initiatives

Illinois

Data Engineer

UnitedHealth Group

UnitedHealth Group is a healthcare and well-being company that’s dedicated to improving the health outcomes of millions around the world. We are comprised of

Data Engineer | 8 days ago

Role Description

We are seeking a highly skilled Senior Data Engineer to design, build, and optimize scalable data and AI platforms on Azure. This role will focus on enabling enterprise data pipelines, real-time processing, and AI/ML model integration using Databricks and modern cloud technologies. You will enjoy the flexibility to telecommute* from anywhere within the U.S. as you take on some tough challenges.

  • Design and develop scalable data pipelines using Databricks, Apache Spark, and Python on Azure
  • Build cloud-native solutions leveraging Azure Data Lake, Azure Data Factory, and Delta Lake
  • Collaborate with Data Science and AI teams to operationalize ML models and embed them into production workflows
  • Develop and maintain feature stores, model input pipelines, and real-time/streaming frameworks
  • Ensure data quality, governance, and security across the full data lifecycle
  • Build reusable frameworks, accelerators, and automation scripts to improve engineering efficiency
  • Optimize performance, scalability, and reliability of data workflows and batch/streaming pipelines
  • Participate in Agile development processes, including sprint planning, code reviews, and CI/CD pipelines
  • Provide production support and on-call coverage, ensuring system stability and rapid issue resolution
  • Design, develop, and deploy AI-powered solutions to address complex business challenges with emphasis on responsible use of AI

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or IT related field
  • 6+ years of experience in Data Engineering with Python/PySpark
  • 6+ years of experience in building ETL/ELT pipelines using Databricks
  • 6+ years of experience working in Agile environments
  • 5+ years of strong experience in SQL / PL-SQL
  • 4+ years of experience with Azure Databricks and Delta Lake architecture
  • 4+ years of hands-on experience with CI/CD (GitHub Actions, Azure DevOps)
  • 3+ years of hands-on experience with Azure cloud services (ADF, ADLS, Databricks)
  • 2+ years of experience with Databricks Delta Live Tables (DLT)
  • 2+ years of experience with unit testing, validation, and pipeline testing frameworks

Requirements

  • Familiarity with medallion architecture and SCD2 implementations
  • AI builder: Design, develop, and deploy AI-powered solutions to address complex business challenges with emphasis on responsible use of AI
  • Experience building enterprise-scale data platforms
  • Strong skills in performance tuning and debugging large-scale pipelines
  • Experience with real-time/streaming frameworks (Structured Streaming)
  • Ability to work in distributed, cross-functional global teams
  • Exposure to GenAI tools (e.g., GitHub Copilot) for engineering productivity
  • Strong understanding of secure coding practices and vulnerability remediation
  • Proven ability to analyze logs, troubleshoot production issues, and optimize performance
  • Demonstrated capability to design and deploy AI-powered solutions responsibly

Benefits

  • Comprehensive benefits package
  • Incentive and recognition programs
  • Equity stock purchase
  • 401k contribution

(All benefits are subject to eligibility requirements.)

United States
$72.8K - $130K / year
Job Closed
Full Time | Remote | Team 11-50 | H1B No Sponsor

  • 8–10 years in Data Engineering and Data Analysis.
  • Strong hands-on experience in Informatica PowerCenter/IDQ for ETL design, development, and optimization.
  • Advanced skills in PySpark for large-scale data processing, transformation, and analytics.
  • Solid working knowledge of Hadoop technologies (HDFS, Hive, Sqoop, MapReduce).
  • Proficiency in Python and Kafka for streaming and batch data pipelines.
  • Strong understanding of database concepts, data design, data modeling, and ETL workflows.
  • Experience in analyzing, designing, and coding ETL programs including data extraction, ingestion, quality checks, normalization, and loading.
  • Hands-on experience with Agile methodology and Jira for project delivery.
  • Proven ability in client-facing roles with strong communication and leadership skills to coordinate across SDLC.
  • Exposure to AWS data components and analytics.
  • Familiarity with machine learning models and AI concepts.
  • Experience with data modeling tools such as Erwin.

All locations: Ohio | Pennsylvania

FBS Data Engineer, SQL, Power BI

Capgemini Engineering

Capgemini Engineering, the leader in engineering and R&D services, helps clients unleash their R&D potential.

Data Engineer | 8 days ago
Full Time | Remote | Team 10,001+ | H1B No Sponsor

  • Responsible for acquiring, curating, and publishing data both on prem and in the cloud for analytical or operational uses.
  • Ensures the data is in a ready-to-use form.
  • Utilizes skills to translate business analytic requests/requirements.
  • Works with various technologies from big data, relational and non-relational databases.
  • Consults on data projects of intermediate complexity.
  • Practices code management and integration with current architectural and data governance guidelines.
  • Produces data building blocks and data flows for varying client requests.
  • Creates business user access methods to data.
  • Utilizes techniques such as mapping data and transforming data to satisfy business rules.
  • Translates business data stories into a technical story breakdown structure.
  • Develops and maintains moderately complex scalable data pipelines.
  • Executes on mid size projects and identifies opportunities to optimize data workflows.

Mexico