We are not a typical consulting firm and our people are not typical consultants.
Data Engineer, Databricks
Location
Virginia
Posted
4 days ago
Salary
$98.6K - $167.6K / year
Seniority
Senior
Job Description
ICF
• Enable secure, scalable, and efficient data exchange between the federal client and external data sharing partners using Databricks Delta Sharing.
• Support the design and development of data pipelines and ETL routines in an Azure Cloud environment for many source system types, including RDBMS, API, and unstructured data, using CDC, incremental, and batch loading techniques.
• Conduct data profiling, transformation, and quality assurance on structured, semi-structured, and unstructured data.
• Identify underlying issues and translate them into technical requirements.
• Assist in building and optimizing data lakes, feature stores, and data warehouse structures to support analytics and machine learning.
• Prepare, structure, and validate data for data science and MLOps workflows, ensuring it meets the quality and format requirements for modeling.
• Help monitor and maintain the flow of data across BI dashboards, analytics environments, and machine learning pipelines.
• Engage directly with clients and stakeholders to understand data needs and translate them into scalable solutions.
• Collaborate with UX designers, business analysts, developers, and end users to define data and reporting requirements.
• Work with external data partners to determine their data product needs, and work within the Databricks platform to enable rapid prototyping and extensible use cases.
• Meet with government employees at executive levels, platform stakeholders, and vendor partners.
• Work within Agile teams to support iterative development, backlog grooming, and sprint-based delivery.
• Provide mentorship to junior resources.
Job Requirements
- Bachelor’s degree
- 5+ years in data engineering, data security practices, data platforms, and analytics
- U.S. Citizenship required due to federal contract requirements.
- Ability to obtain and maintain a federal public trust clearance or equivalent client-required background investigation.
- Candidate must reside in the U.S., be authorized to work in the U.S., and all work must be performed in the U.S.
- Candidate must have lived in the U.S. for three (3) full years out of the last five (5) years
- 3+ years Databricks Platform Expertise – SME Level Proficiency including: Databricks, Delta Lake, and Delta Sharing
- Deep experience with distributed computing using Apache Spark
- Knowledge of Spark runtime internals and optimization
- Ability to design and deploy performant end-to-end data architectures
- 4+ years of ETL Pipeline Development building robust, scalable data pipelines
- Databricks certifications - Professional or specialty certifications
- Hands-on experience with Azure services such as Synapse, Data Factory, or Databricks.
- Familiarity with data visualization tools such as Tableau, Power BI, or similar.
- Machine Learning and Analytical Skills, including:
  - MLOps: Working knowledge of ML deployment and operations
  - Data Science Methodologies: Statistical analysis, modeling, and interpretation
  - Big Data Technologies: Experience beyond Spark with distributed systems
- Experience with deployment pipelines, including Git-based version control, CI/CD pipelines, and DevOps practices using Terraform for IaC.
Benefits
- Reasonable Accommodations are available, including, but not limited to, for disabled veterans, individuals with disabilities, and individuals with sincerely held religious beliefs, in all phases of the application and employment process.
More Data Engineer Jobs
• Design, develop, and maintain scalable data pipelines using modern distributed data processing platforms and cloud environments.
• Build and optimize ETL/ELT processes following industry best practices and cloud-native architectures.
• Implement data models aligned with modern Data Lakehouse principles and data architecture frameworks.
• Ensure data quality, consistency, and performance across ingestion, staging, and curated data layers.
• Collaborate with data architects, analysts, and business stakeholders to understand complex healthcare data requirements.
• Develop reusable data transformation logic and modular processing components for efficient, maintainable systems.
• Support deployment processes following CI/CD and DevOps best practices.
• Monitor and optimize data workflows for performance, scalability, and reliability in production environments.
• Contribute to data governance, security, and compliance practices relevant to regulated healthcare environments.
Data Engineer
VetsEZ
Veterans EZ Info, commonly known as VetsEZ, is a top provider of information technology (IT) services for both commercial and government markets. The company is
• Design, develop, and maintain scalable data pipelines and ETL processes.
• Build and optimize data solutions within Azure cloud environments, including Azure Synapse and Azure Data Factory.
• Integrate data from multiple sources through APIs and other system integration methods.
• Develop and support data analytics and reporting solutions using Power BI.
• Ensure data quality, integrity, security, and performance across data platforms.
• Collaborate with stakeholders to gather requirements and deliver data-driven solutions supporting healthcare and VA initiatives.
• Manage and track development activities using Agile methodologies and tools such as Jira.
• Take on additional tasks and responsibilities as needed to support team objectives and ensure the success of the project.

• Help build, maintain, and scale our data pipelines that bring together data from various internal and external systems into our data warehouse.
• Partner with internal stakeholders to understand analysis needs and consumption patterns.
• Partner with upstream engineering teams to enhance data logging patterns and best practices.
• Participate in architectural decisions and help us plan for the company’s data needs as we scale.
• Adopt and evangelize data engineering best practices for data processing, modeling, and lake/warehouse development.
• Advise engineers and other cross-functional partners on how to most efficiently use our data tools.
Senior Software Engineer, AI Data Engineering
EvolutionIQ
Leading the artificial intelligence transformation for insurance carriers.
• Orchestrate High-Velocity Workflows: Leverage advanced agentic coding tools (e.g., Cursor, multi-agent environments) to dramatically accelerate feature prototyping, code generation, and test coverage.
• Own the Guardrails & Quality: Act as the ultimate reviewer and architect; define the specifications, establish repo-context guardrails, and review AI-accelerated output for hidden security risks, scale bottlenecks, and architectural alignment.
• Build Scalable Application and Data Layers: Design, build, and maintain our data pipelines and application to service our hundreds of users.
• Bridging the Data: Partner closely with product, client services, and data science teams to transform raw pipeline outputs into meaning.
• Drive Greenfield Engineering: Take complete ownership of designing, building, and launching enterprise-grade applications, moving fluidly between rapid prototyping and bulletproof production systems.




