Astreya

IT services that put people at the center of your business

AI Infrastructure TPM II

LLM Engineer | Machine Learning Engineer | Full Time | Remote | Lead | Team 1,001-5,000 | Since 2001 | H1B Sponsor

Location

California

Posted

58 days ago

Salary

$73.0K - $100.8K / year

Seniority

Lead

Bachelor Degree | 8 yrs exp | English | PMP

Job Description

• Lead large-scale AI infrastructure deployment programs across multiple sites, regions, or business units.
• Drive end-to-end project execution for GPU clusters, AI compute environments, storage platforms, high-speed networks, and data center infrastructure.
• Develop integrated project plans, implementation strategies, and operational readiness frameworks.
• Manage cross-functional coordination between engineering, operations, supply chain, vendors, and executive stakeholders.
• Identify and mitigate program risks, schedule impacts, technical dependencies, and operational constraints.
• Lead infrastructure migration, expansion, upgrade, and modernization initiatives.
• Drive governance reviews, project reporting, KPI tracking, and executive-level communications.
• Coordinate infrastructure acceptance testing, deployment validation, and production readiness activities.
• Mentor junior project managers and contribute to PMO process standardization and operational maturity.
• Support vendor negotiations, technical evaluations, and infrastructure planning initiatives.

Job Requirements

Advanced understanding of AI infrastructure technologies including GPU platforms, storage systems, networking, and data center operations.
8+ years of technical project or program management experience within infrastructure environments.
Proven experience leading large-scale infrastructure deployment or transformation programs.
Strong risk management, executive communication, and stakeholder alignment skills.
Experience coordinating multi-vendor and cross-functional technical teams.
Ability to manage complex schedules, budgets, and operational dependencies.
Relevant certifications preferred (PMP, PgMP, ITIL, Agile, CCNA, etc.).

Benefits

Medical provided through UHC (PPO, HSA, Surest options)
Medical provided through Kaiser (HMO option only) for California employees only
Dental provided through UHC Nationwide
Vision provided by UHC
Flexible Spending Account for Health & Dependent Care
Pre-Tax Account for Commuter Benefit/Parking & Transit (location-specific)
Continuing Education and Professional Development via various integrated platforms, e.g. Udemy and Coursera
Corporate Wellness Program provided by Goomi Group
Employee Assistance Program
Wellness Days
401k Plan
Basic and Supplemental Life Insurance
Short Term & Long Term Disability
Critical Illness, Critical Hospital, and Voluntary Accident Insurance
Tuition Reimbursement (available 6 months after start date, capped)
Paid Time Off (accrued and prorated, maximum of 120 hours annually)
Paid Holidays
Any other statutory leaves, paid time, or other ancillary benefits required under state and federal law

More LLM Engineer Jobs

AI Infrastructure DC Design Intern

Astreya

IT services that put people at the center of your business

LLM Engineer | 59 days ago

Internship | Remote | Team 1,001-5,000 | Since 2001 | H1B Sponsor

• Assist in creating and updating AutoCAD drawings, rack elevations, cabinet layouts, and structured cabling documentation.
• Support data hall design documentation, asset tracking, and revision management activities.
• Assist engineering teams with infrastructure inventory validation and basic capacity tracking.
• Help maintain design standards, templates, and project documentation repositories.
• Participate in engineering reviews, design walkthroughs, and quality assurance activities.
• Coordinate with cross-functional teams to gather project inputs and update design records.
• Support preparation of project reports, spreadsheets, diagrams, and technical documentation.
• Learn data center infrastructure concepts including power, cooling, cabling, rack configurations, and AI cluster environments.
• Follow established operational procedures, engineering standards, and safety requirements.
• Assist with administrative and project coordination tasks related to infrastructure deployment activities.

Cloud

View details: AI Infrastructure DC Design Intern

California

$15 - $24 / hour

Principal AI Researcher – Agentic Systems, AI Infrastructure

Trase

AI, Uncomplicated.

LLM Engineer | 59 days ago

Full Time | Remote | Team 11-50 | Since 2023 | H1B No Sponsor

• Define and evolve the long-term AI/ML research strategy and technical roadmap for Trase OS in alignment with product and platform direction.
• Lead large-scale experimentation and prototyping efforts requiring significant compute infrastructure, translating frontier AI research into scalable, production-grade systems with measurable impact.
• Drive original research and technical breakthroughs in agentic systems, autonomous execution, multi-agent orchestration, post-training and fine-tuning systems, SLM/LLM-based architectures, and applied AI infrastructure.
• Design how models operate within long-lived execution environments, including agent workflows, tool use, planning, memory systems, reasoning, and human-in-the-loop controls.
• Establish evaluation methodologies and reliability frameworks for autonomous systems, including benchmarking, regression testing, safety, controllability, and production behavior analysis.
• Drive architecture decisions across orchestration, model serving, routing, inference, and infrastructure governance, including latency, reliability, and cost optimization.
• Partner closely with engineering and product teams to operationalize research outcomes into deployable systems and enterprise workflows.
• Build AI systems that operate reliably in regulated and constrained environments, including secure cloud, on-premise, and air-gapped deployments.
• Contribute to the broader AI research community through technical papers, publications, conference participation, architecture proposals, and thought leadership.
• Serve as a senior technical authority and mentor across the organization, influencing technical direction, research rigor, experimentation practices, and best practices across research, engineering, and product teams.

Cloud Java Python

View details: Principal AI Researcher – Agentic Systems, AI Infrastructure

Virginia + 1 more

$250K - $300K / year

Job Closed

Principal AI Researcher – Agentic Systems, AI Infrastructure

Red Cell Partners

Impact Through Innovation

LLM Engineer | 59 days ago

Full Time | Remote | Team 11-50 | H1B Sponsor

Cloud Java Python

View details: Principal AI Researcher – Agentic Systems, AI Infrastructure

Virginia + 1 more

$250K - $300K / year

Job Closed

AI Infrastructure Datacenter Technical Project Manager II

Astreya

IT services that put people at the center of your business

LLM Engineer | 60 days ago

Full Time | Remote | Team 1,001-5,000 | Since 2001 | H1B Sponsor

Role Description

The AI Infrastructure Datacenter Technical Project Manager Level 2 serves as a senior project leader responsible for managing large-scale AI infrastructure programs, complex technical deployments, and cross-functional strategic initiatives. This role drives execution excellence across compute, GPU, storage, networking, and data center infrastructure domains while ensuring alignment with business and operational objectives.

Key Responsibilities

- Lead large-scale AI infrastructure deployment programs across multiple sites, regions, or business units.
- Drive end-to-end project execution for GPU clusters, AI compute environments, storage platforms, high-speed networks, and data center infrastructure.
- Develop integrated project plans, implementation strategies, and operational readiness frameworks.
- Manage cross-functional coordination between engineering, operations, supply chain, vendors, and executive stakeholders.
- Identify and mitigate program risks, schedule impacts, technical dependencies, and operational constraints.
- Lead infrastructure migration, expansion, upgrade, and modernization initiatives.
- Drive governance reviews, project reporting, KPI tracking, and executive-level communications.
- Coordinate infrastructure acceptance testing, deployment validation, and production readiness activities.
- Mentor junior project managers and contribute to PMO process standardization and operational maturity.
- Support vendor negotiations, technical evaluations, and infrastructure planning initiatives.

Scope & Complexity

- Leads highly complex infrastructure programs with multiple concurrent workstreams.
- Manages enterprise-scale AI infrastructure deployments and operational initiatives.
- Influences program execution standards, governance models, and delivery methodologies.

Qualifications

- Advanced understanding of AI infrastructure technologies including GPU platforms, storage systems, networking, and data center operations.
- 8+ years of technical project or program management experience within infrastructure environments.
- Proven experience leading large-scale infrastructure deployment or transformation programs.
- Strong risk management, executive communication, and stakeholder alignment skills.
- Experience coordinating multi-vendor and cross-functional technical teams.
- Ability to manage complex schedules, budgets, and operational dependencies.
- Relevant certifications preferred (PMP, PgMP, ITIL, Agile, CCNA, etc.).

Requirements

- Salary Range: $72,960.00 - $100,800.00 USD (Salary)
- Please note that the salary information provided herein is base pay only (gross); it does not include other forms of compensation which may or may not apply to this specific position, namely, performance-based bonuses, benefits-related payments, or other general incentives - none of which are guaranteed, may be subject to specific eligibility requirements, and are wholly within the discretion of Astreya to remit.
- Further, the salary information noted above is a range that consists of a minimum and maximum rate of pay for this specific position. Where an applicant or employee is placed on this range will depend and be contingent on objective, documented work-related considerations like education, experience, certifications, licenses, preferred qualifications, among other factors.

Benefits

- Medical provided through UHC (PPO, HSA, Surest options)
- Medical provided through Kaiser (HMO option only) for California employees only
- Dental provided through UHC - Nationwide
- Vision provided by UHC
- Flexible Spending Account for Health & Dependent Care
- Pre-Tax Account for Commuter Benefit/Parking & Transit (location-specific)
- Continuing Education and Professional Development via various integrated platforms, e.g. Udemy and Coursera
- Corporate Wellness Program provided by Goomi Group
- Employee Assistance Program
- Wellness Days
- 401k Plan
- Basic and Supplemental Life Insurance
- Short Term & Long Term Disability
- Critical Illness, Critical Hospital, and Voluntary Accident Insurance
- Tuition Reimbursement (available 6 months after start date, capped)
- Paid Time Off (accrued and prorated, maximum of 120 hours annually)
- Paid Holidays
- Any other statutory leaves, paid time, or other ancillary benefits required under state and federal law

View details: AI Infrastructure Datacenter Technical Project Manager II

United States

$73.0K - $100.8K / year
