AI Data Infrastructure Engineer

Location

United States

Posted

4 days ago

Salary

$100K - $150K / year

Seniority

Mid Level

Job Description

Bright Vision Technologies

Role Description

We are seeking an AI Data Infrastructure Engineer to build and operate the large-scale data systems that power modern AI training and evaluation pipelines. The role combines deep data engineering expertise with a strong understanding of AI workloads, focusing on ingestion, transformation, quality assurance, lineage, and high-throughput delivery of data to training jobs across diverse modalities. The ideal candidate has experience operating petabyte-scale data systems, strong software engineering fundamentals, and a clear understanding of how data infrastructure choices propagate into model quality and training efficiency.

Key Responsibilities

- Design and operate large-scale data pipelines supporting AI training, evaluation, and continual improvement workflows.
- Build ingestion systems for diverse modalities including text, image, audio, video, and structured signals.
- Implement data cleaning, deduplication, filtering, and quality assurance at petabyte scale.
- Develop dataset versioning, lineage, and provenance tracking systems suitable for reproducible training.
- Build high-throughput data loading systems that maximize GPU utilization during training.
- Implement labeling workflows, active learning pipelines, and human-in-the-loop data improvement systems.
- Design storage architectures balancing cost, throughput, and latency across data tiers.
- Build evaluation dataset construction pipelines with strict integrity and contamination controls.
- Implement data privacy, redaction, and consent enforcement throughout the pipeline.
- Collaborate with ML researchers and engineers to align data systems with model development needs.
- Drive observability of data quality, drift, and pipeline health across the AI data estate.
- Optimize cost and performance through compression, format selection, and caching strategies.
- Document data systems, schemas, and operational procedures for broad internal use.
- Stay current with AI data infrastructure research and emerging open-source tools.

Qualifications

- Bachelor’s or Master’s degree in Computer Science or a related field.
- Six or more years of data engineering experience, with significant work supporting ML or AI workloads.
- Strong proficiency in Python and at least one JVM or systems language.
- Deep experience with modern data processing frameworks such as Spark, Ray, or Beam.
- Hands-on experience operating petabyte-scale storage and pipeline systems.
- Strong understanding of distributed systems, data modeling, and storage formats.
- Experience with dataset versioning, lineage, and reproducibility for ML workflows.
- Familiarity with high-throughput data loading for accelerator-based training.
- Strong software engineering practices including testing, CI/CD, and code review.
- Excellent communication and cross-functional collaboration skills.

Preferred Qualifications

- Experience with multimodal datasets at large scale.
- Familiarity with data quality tooling and dataset evaluation methodology.
- Exposure to privacy-preserving data systems and regulated data handling.
- Open-source contributions to data infrastructure projects.
- Experience supporting frontier model training pipelines.

Requirements

- 100% Remote (Continental United States)
- 6+ years of experience
- Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
- No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
- Technical coding assessment is mandatory.

Benefits

- Competitive base salary commensurate with experience, plus benefits.

How to Apply

For immediate consideration, please send your resume to [email protected] or contact us at (908) 505-3545.

More Infrastructure Engineer Jobs

Windows and Database Infrastructure Engineer

Quantrics Enterprises Inc.

At Quantrics, we believe in the power of the individual to create a better future - for our customers

Full Time | Remote | Team 5,001-10,000 | Since 2016 | H1B No Sponsor

• Installing, configuring, and troubleshooting Windows Server environments
• Writing SQL queries and MS SQL administration
• Managing virtual machines using VMWare and OpenStack
• Working with Red Hat OpenShift environments
• Familiarity with load balancing concepts and multi-data center environments
• Utilizing IaC tools (Terraform or Ansible) for deployment automation
• Automating Windows administrative tasks using PowerShell

Philippines

Linux Infrastructure Engineer

Quantrics Enterprises Inc.

At Quantrics, we believe in the power of the individual to create a better future - for our customers

Full Time | Remote | Team 5,001-10,000 | Since 2016 | H1B No Sponsor

• Design and automate RF design workflows to optimize cost and time
• Assess and develop cutting-edge wireless propagation models
• Translate complex regulatory guidelines into efficient RF engineering workflows
• Continuously research and improve RF propagation modeling and predictions
• Provide expert guidance to E2E teams on RF engineering best practices
• Identify and resolve challenges creatively, proactively future-proofing infrastructure
• Constantly optimize and simplify propagation modeling and geodata assets

Philippines

Senior Infrastructure Architect

Quantrics Enterprises Inc.

At Quantrics, we believe in the power of the individual to create a better future - for our customers

Full Time | Remote | Team 5,001-10,000 | Since 2016 | H1B No Sponsor

• Apply expert-level knowledge of Linux environments, kernel tuning, and advanced administration.
• Oversee advanced network routing, switching, load balancing, and firewall configurations.
• Lead the strategy and implementation of Infrastructure as Code (IaC) to automate server and network device management.
• Act as the primary technical authority, providing consulting during project initiation.
• Continuously evaluate and optimize network throughput, Linux OS performance, and overall system availability.

Philippines
Full Time | Remote | Team 51-200 | Since 2017 | H1B Sponsor

SentiLink provides innovative identity and risk solutions, empowering institutions and individuals to transact with confidence. We’re building the future of identity verification in the United States, replacing a clunky, ineffective, and expensive status quo with solutions that are 10x faster, smarter, and more accurate. We’ve seen tremendous traction and are growing extremely quickly. Our real-time APIs have helped verify hundreds of millions of identities, starting with financial services and rapidly expanding into new markets.

SentiLink is backed by world-class investors including Craft Ventures, Andreessen Horowitz, NYCA, and Max Levchin. We’ve earned recognition from TechCrunch, CNBC, Bloomberg, Forbes, Business Insider, PYMNTS, American Banker, and LendIt, and have been named to the Forbes Fintech 50. We have also been named a 2026 FICO Industry Vanguard Decision Award Winner. Last but not least, we’ve even made history: we were the first company to go live with the eCBSV and testified before the United States House of Representatives on the future of identity.

SentiLink supports a variety of ways to work, ranging from fully remote to in-office. We operate as a digital-first company with strong collaboration across the U.S. and India. We maintain physical offices in Austin, San Francisco, New York City, Seattle, Los Angeles, and Chicago in the U.S., and in Gurugram (Delhi) and Bengaluru in India. If you’re located near one of these offices, we would love for you to spend time in the office regularly. Some roles are hybrid or in-office by design. For example, our engineering team in India works primarily from our Gurugram office.

About the Opportunity:

As a Senior Infrastructure Engineer, you’ll be responsible for the development of standards, processes, tooling, and systems that serve as the foundation of the SentiLink platform. We’re looking for someone driven towards improving the reliability and efficiency of our systems, services, and engineering teams. You will work closely with the Engineering, Data Science, and Analytics teams to understand their needs and pain points, and to build systems and tools to improve their velocity, reliability, efficiency, and visibility. The optimal candidate will have a bias towards secure solutions that follow engineering best practice.

Technologies:

Python, Aurora PostgreSQL, AWS infrastructure (EC2, S3, RDS, Redshift, etc.), Kubernetes, Docker, Terraform, CI/CD, observability tooling (e.g., Datadog, Prometheus, SumoLogic), OpenSearch, and Linux

This is a remote, US-based role.

Responsibilities:

- Construct infrastructure as code. Develop and enforce best practice across configurations while preventing drift between Terraform configurations and infrastructure deployments
- Design infrastructure that enables Engineering, Data Science, and Analytics to rapidly perform software development and data processing
- Design, implement, and maintain scalable DevOps and CI/CD pipelines to automate application deployment, infrastructure provisioning, and system monitoring, ensuring high availability and efficient development workflows
- Implement monitoring tools, dashboards, and functionalities for a variety of services and operations across SentiLink’s infrastructure and software platform
- Formulate strategies and execute solutions for cloud identity and access management
- Collaborate with the SRE and security teams to maintain secure, up-to-date infrastructure in our cloud environment
- Supervise and monitor platform costs, working cross-functionally to keep costs in line with corporate financial expectations
- Oversee, develop, and operate Kubernetes and service mesh infrastructure, ensuring smooth performance and reliability
- Investigate operational alerts, pinpoint root causes, and compile comprehensive root cause analysis reports. Pursue action items relentlessly until they are thoroughly completed
- Conduct in-depth examinations of database operational issues, actively developing and improving database architecture, schema, and configuration for enhanced performance and reliability

Requirements:

- 4+ years of relevant work experience
- Familiarity with AWS cloud infrastructure, managing infrastructure as code, and cloud identity and access management
- Experience developing cloud networking infrastructure, including DNS, CDNs, load balancers, VPCs, subnets, and security groups
- Experience with scaling and migrating production systems with little to no downtime
- Experience managing observability platforms, building monitoring dashboards, and configuring high quality, actionable alerting
- Experience working on software delivery pipelines (CI/CD) and DevOps tooling a plus
- Background in building secure container orchestration using Docker and Kubernetes is a plus
- Experience operating enterprise-size databases. Postgres, Aurora, Redshift, and OpenSearch experience is a plus
- Experience with Python or Golang is a plus
- Hands-on experience with development and testing of distributed systems at scale is a big plus
- Candidates must be legally authorized to work in the United States and must live in the United States

Salary Range:

- $145,000/year - $250,000/year + equity + benefits

Note: This salary range may be inclusive of several career levels, and the actual base salary within that range will be determined by several components including but not limited to the individual's experience, skills, and qualifications.

Perks:

- Employer paid group health insurance for you and your dependents
- 401(k) plan with employer match (or equivalent for non US-based roles)
- Flexible paid time off
- Regular company-wide in-person events
- Home office stipend, and more!

Corporate Values:

- Follow Through
- Deep Understanding
- Whatever It Takes
- Do Something Smart

United States
$145K - $200K / year