FEI Systems, an IT services and analysis company based in Columbia, Maryland, was founded in 1999 to connect "every dimension of health and human services when
Data Engineer - ML/AI Data Platform
Location
United States
Posted
2 days ago
Seniority
Mid Level
Job Description
FEI Systems
Role Description
We are seeking a Data Engineer to support Machine Learning and AI initiatives. Working closely with the Solution Architect, Data Architect, DevOps, and Application Engineering teams, this role is responsible for ensuring that data within our cloud-based platform is high quality, well-governed, feature-ready, and production-grade to support model training, deployment, and ongoing operations.

The ideal candidate has 5+ years of cloud data engineering experience with strong proficiency in Snowflake, Python, and SQL, and solid familiarity with AWS-native data services. Immediate focus is Snowflake-based data engineering, pipeline development, and data quality. Feature engineering, model training support, and MLOps contributions are growth areas that will ramp over time as you become embedded with the team.

Key Responsibilities
- Data Pipeline Engineering
  - Design, build, and maintain scalable data pipelines supporting ML/AI workloads.
  - Engineer pipeline patterns including full loads, incremental loads, change-based loads, and slowly changing dimensions.
  - Ensure pipelines are reliable, performant, secure, and maintainable; troubleshoot and monitor pipelines within an AWS ecosystem.
- Snowflake & Cloud Data Engineering
  - Perform data transformations in Snowflake using SQL and native Snowflake features.
  - Design and optimize schemas, tables, views, and materialized views for ML/AI consumption.
  - Support AWS-native data lake patterns using S3, Glue, Athena, Apache Iceberg, and S3 Tables.
- Feature Engineering & Data Preparation
  - Perform data cleansing, normalization, and enrichment to support ML model development.
  - Design and implement feature engineering pipelines including aggregation and transformation.
  - Ensure consistency, reuse, and versioning of features across models and use cases.
  - Support feature store patterns to enable feature discoverability and reuse.
  - Collaborate with ML engineers and data scientists to operationalize features into training pipelines.
- Model Training & MLOps Support
  - Support model training workflows, including dataset preparation and scheduled refreshes.
  - Ensure training datasets and features are reproducible, traceable, and auditable.
  - Integrate data pipelines into CI/CD workflows; support version control, testing, and deployment of data assets.
  - Monitor pipeline health, data freshness, and downstream impact on ML/AI systems.

Qualifications
- 5+ years of hands-on data engineering experience in a cloud environment.

Required Skills & Experience
- Core Technologies
  - Python: strong proficiency for data processing and pipeline development.
  - SQL: advanced skills with hands-on Snowflake transformation experience.
  - Snowflake: ELT pipeline design, schema optimization, performance tuning, cost management.
  - PostgreSQL: experience with querying, data modeling, and analytics; familiarity with SQL Server to PostgreSQL migration a plus.
  - AWS: S3, Glue, Athena, Snowflake integration, and managed relational databases (e.g., Aurora, RDS).
  - Apache Iceberg / S3 Tables: familiarity with open table format ecosystems.
  - Streaming ingestion tools (e.g., Kinesis, Kafka, or equivalent).
  - Workflow orchestration tools (e.g., Airflow, Step Functions, or equivalent).
- Pipeline & Data Engineering
  - Experience with full loads, incremental loads, append-only pipelines, change-based processing, and SCDs.
  - Data validation, reconciliation, error handling, and restart/recovery patterns.
  - Data modeling for analytics, ML/AI, and downstream application use cases.
  - Ability to evaluate pipeline design trade-offs across performance, cost, reliability, and maintainability.
- DevOps & Engineering Practices
  - Structured SDLC experience with CI/CD pipelines for data and ML workflows.
  - API-based and event-driven data integration patterns.
  - Distributed data processing environments.
- ML/AI Data Foundations
  - Understanding of data requirements for ML/AI workloads.
  - Experience preparing training datasets and features from enterprise data lakes.
  - Familiarity with reproducibility, dataset versioning, and data lineage concepts.
  - Familiarity with GenAI concepts relevant to data engineering, such as embedding pipelines, vector databases, retrieval-augmented generation (RAG) data flows, or prompt-driven data processing, including awareness of data security and privacy considerations when working with LLMs.

Education
- Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related technical field. Equivalent professional experience will be considered.

Location
- Remote

Status
- Full time position with full company benefits.
More Platform Engineer Jobs
Senior Software Engineer, White Label Platform
Upstart
Our mission is to enable effortless credit based on true risk.
• Collaborate with product managers, engineers, and business stakeholders to deliver projects that align with business goals
• Assist in the design, development, and maintenance of self-service tools that enhance the investor experience
• Work with business stakeholders to identify opportunities for process optimization and build solutions that improve business workflows
• Develop scalable, reliable systems that meet the needs of both internal users and external investors
• Ensure security, performance, and availability of our critical platforms
• Participate in code reviews, testing, and the deployment of high-quality code
Senior Software Engineer – Lending Platform
Upstart
Our mission is to enable effortless credit based on true risk.
• Design, build, and operate distributed, event-driven systems that power accounting close automation and capital risk workflows.
• Deliver scalable data processing solutions that transform high-volume financial events into accurate, reliable business outcomes.
• Partner closely with Product, Data Analytics, Finance, and Engineering teams to define technical solutions for complex financial processes.
• Improve the reliability, scalability, and maintainability of critical financial systems while thoughtfully managing technical debt.
• Lead multi-quarter engineering initiatives from architecture through production, balancing long-term platform investments with business priorities.
• Leverage AI-assisted development tools to improve engineering productivity while maintaining high standards for quality, security, and correctness.
Senior Platform Engineer
Apollo GraphQL
Apollo is the GraphQL company. Our mission is to empower every developer with a graph.
• You’ll become a key contributor to the team, taking responsibility for the success of some of our subsystems.
• You’ll participate in medium- to large-impact team initiatives and, within a year, be able to execute such projects.
• You’ll help with interviewing potential teammates.
• You’ll create technical designs that proactively address cost efficiency, security, and observability.
• You’ll deliver technical plans, one-pagers, DRs, and other artifacts.
• You’ll work with Kubernetes, GCP, Helm, Terraform, DataDog, ArgoCD, CircleCI, Atlantis, Docker (the list goes on) to deliver your work.
• You’ll be responsible for improving developer velocity across the company (leveraging frameworks like DORA) and hardening our reliability and observability.
• You’ll participate in on-call rotations and help keep all of Apollo afloat.
• You’ll be fully empowered to fix the root cause of issues and ruthlessly quell any noisy monitors.
Senior Software Engineer, Sensor Simulation Platform
Waymo
Waymo is an autonomous driving technology company creating a new way forward in mobility.

Role Description
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. The Sensor Simulator Platform Team maintains the core of the simulation pipeline that allows Waymo to drive millions of simulated miles each year. The team is responsible for both the rendering/simulation engine as well as the pipeline that lets a variety of internal customers run SensorSim at scale.
- Help build and maintain efficient simulators capable of running millions of simulated miles to evaluate the safety of Waymo's onboard models.
- Work with multi-modal data processing and simulation, including cameras, lidar, and radar.
- Interface with AI and machine learning teams that develop parts of the simulation pipeline.

Qualifications
- Developed or worked with 3D game engines.
- Experience in approximating the real world with data simulation, which might include physics simulation, animation capture, light transport, photogrammetry, NeRFs, or Gaussian splats.
- Experience in working on efficient asset management pipelines for data storage and processing.

Requirements
- Prior experience in working on or integrating with ML pipelines and a general understanding of how such pipelines are trained and improved.
- Experience in scaling complex data processing pipelines and an understanding of the trade-offs in structuring these pipelines to operate synchronously or asynchronously.

Benefits
- Waymo employees are eligible to participate in Waymo’s discretionary annual bonus program.
- Equity incentive plan.
- Generous Company benefits program, subject to eligibility requirements.

Salary Range
The expected base salary range for this full-time position across US locations is: $213,000–$263,000 USD


