Astreya

IT services that put people at the center of your business

AI Infrastructure DC Design Intern

LLM Engineer | Machine Learning Engineer | Internship | Remote | Entry Level | Team 1,001-5,000 | Since 2001 | H1B Sponsor | Company Site | LinkedIn

Location

California

Posted

4 days ago

Salary

$15 - $24 / hour

Seniority

Entry Level

Associate Degree | English | Cloud

Job Description

  • Assist in creating and updating AutoCAD drawings, rack elevations, cabinet layouts, and structured cabling documentation.
  • Support data hall design documentation, asset tracking, and revision management activities.
  • Assist engineering teams with infrastructure inventory validation and basic capacity tracking.
  • Help maintain design standards, templates, and project documentation repositories.
  • Participate in engineering reviews, design walkthroughs, and quality assurance activities.
  • Coordinate with cross-functional teams to gather project inputs and update design records.
  • Support preparation of project reports, spreadsheets, diagrams, and technical documentation.
  • Learn data center infrastructure concepts including power, cooling, cabling, rack configurations, and AI cluster environments.
  • Follow established operational procedures, engineering standards, and safety requirements.
  • Assist with administrative and project coordination tasks related to infrastructure deployment activities.

Job Requirements

  • Currently pursuing or recently completed an Associate or Bachelor’s degree in Engineering, Information Technology, Computer Science, or related field.
  • Basic understanding of datacenter or IT infrastructure concepts preferred.
  • Familiarity with AutoCAD, Visio, or similar drafting/documentation tools is a plus.
  • Proficiency in Microsoft Office tools including Excel, PowerPoint, and Word.
  • Strong attention to detail, organization, and communication skills.
  • Ability to learn quickly and work in a collaborative team environment.
  • Interest in AI infrastructure, data centers, networking, or cloud technologies preferred.

Related Job Pages

More LLM Engineer Jobs

Full Time | Remote | Team 11-50 | Since 2023 | H1B No Sponsor

  • Define and evolve the long-term AI/ML research strategy and technical roadmap for Trase OS in alignment with product and platform direction.
  • Lead large-scale experimentation and prototyping efforts requiring significant compute infrastructure, translating frontier AI research into scalable, production-grade systems with measurable impact.
  • Drive original research and technical breakthroughs in agentic systems, autonomous execution, multi-agent orchestration, post-training and fine-tuning systems, SLM/LLM-based architectures, and applied AI infrastructure.
  • Design how models operate within long-lived execution environments, including agent workflows, tool use, planning, memory systems, reasoning, and human-in-the-loop controls.
  • Establish evaluation methodologies and reliability frameworks for autonomous systems, including benchmarking, regression testing, safety, controllability, and production behavior analysis.
  • Drive architecture decisions across orchestration, model serving, routing, inference, and infrastructure governance, including latency, reliability, and cost optimization.
  • Partner closely with engineering and product teams to operationalize research outcomes into deployable systems and enterprise workflows.
  • Build AI systems that operate reliably in regulated and constrained environments, including secure cloud, on-premise, and air-gapped deployments.
  • Contribute to the broader AI research community through technical papers, publications, conference participation, architecture proposals, and thought leadership.
  • Serve as a senior technical authority and mentor across the organization, influencing technical direction, research rigor, experimentation practices, and best practices across research, engineering, and product teams.

All locations: Virginia | Washington
$250K - $300K / year

Full Time | Remote | Team 11-50 | H1B Sponsor

  • Define and evolve the long-term AI/ML research strategy and technical roadmap for Trase OS in alignment with product and platform direction.
  • Lead large-scale experimentation and prototyping efforts requiring significant compute infrastructure, translating frontier AI research into scalable, production-grade systems with measurable impact.
  • Drive original research and technical breakthroughs in agentic systems, autonomous execution, multi-agent orchestration, post-training and fine-tuning systems, SLM/LLM-based architectures, and applied AI infrastructure.
  • Design how models operate within long-lived execution environments, including agent workflows, tool use, planning, memory systems, reasoning, and human-in-the-loop controls.
  • Establish evaluation methodologies and reliability frameworks for autonomous systems, including benchmarking, regression testing, safety, controllability, and production behavior analysis.
  • Drive architecture decisions across orchestration, model serving, routing, inference, and infrastructure governance, including latency, reliability, and cost optimization.
  • Partner closely with engineering and product teams to operationalize research outcomes into deployable systems and enterprise workflows.
  • Build AI systems that operate reliably in regulated and constrained environments, including secure cloud, on-premise, and air-gapped deployments.
  • Contribute to the broader AI research community through technical papers, publications, conference participation, architecture proposals, and thought leadership.
  • Serve as a senior technical authority and mentor across the organization, influencing technical direction, research rigor, experimentation practices, and best practices across research, engineering, and product teams.

All locations: Virginia | Washington
$250K - $300K / year

AI Infrastructure Datacenter Technical Project Manager II

Astreya

IT services that put people at the center of your business

LLM Engineer | 5 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2001 | H1B Sponsor

Role Description

The AI Infrastructure Datacenter Technical Project Manager Level 2 serves as a senior project leader responsible for managing large-scale AI infrastructure programs, complex technical deployments, and cross-functional strategic initiatives. This role drives execution excellence across compute, GPU, storage, networking, and data center infrastructure domains while ensuring alignment with business and operational objectives.

Key Responsibilities

- Lead large-scale AI infrastructure deployment programs across multiple sites, regions, or business units.
- Drive end-to-end project execution for GPU clusters, AI compute environments, storage platforms, high-speed networks, and data center infrastructure.
- Develop integrated project plans, implementation strategies, and operational readiness frameworks.
- Manage cross-functional coordination between engineering, operations, supply chain, vendors, and executive stakeholders.
- Identify and mitigate program risks, schedule impacts, technical dependencies, and operational constraints.
- Lead infrastructure migration, expansion, upgrade, and modernization initiatives.
- Drive governance reviews, project reporting, KPI tracking, and executive-level communications.
- Coordinate infrastructure acceptance testing, deployment validation, and production readiness activities.
- Mentor junior project managers and contribute to PMO process standardization and operational maturity.
- Support vendor negotiations, technical evaluations, and infrastructure planning initiatives.

Scope & Complexity

- Leads highly complex infrastructure programs with multiple concurrent workstreams.
- Manages enterprise-scale AI infrastructure deployments and operational initiatives.
- Influences program execution standards, governance models, and delivery methodologies.

Qualifications

- Advanced understanding of AI infrastructure technologies including GPU platforms, storage systems, networking, and data center operations.
- 8+ years of technical project or program management experience within infrastructure environments.
- Proven experience leading large-scale infrastructure deployment or transformation programs.
- Strong risk management, executive communication, and stakeholder alignment skills.
- Experience coordinating multi-vendor and cross-functional technical teams.
- Ability to manage complex schedules, budgets, and operational dependencies.
- Relevant certifications preferred (PMP, PgMP, ITIL, Agile, CCNA, etc.).

Requirements

- Salary Range: $72,960.00 - $100,800.00 USD (Salary)
- Please note that the salary information provided herein is base pay only (gross); it does not include other forms of compensation which may or may not apply to this specific position, namely, performance-based bonuses, benefits-related payments, or other general incentives - none of which are guaranteed, may be subject to specific eligibility requirements, and are wholly within the discretion of Astreya to remit.
- Further, the salary information noted above is a range that consists of a minimum and maximum rate of pay for this specific position. Where an applicant or employee is placed on this range will depend and be contingent on objective, documented work-related considerations like education, experience, certifications, licenses, preferred qualifications, among other factors.

Benefits

- Medical provided through UHC (PPO, HSA, Surest options)
- Medical provided through Kaiser (HMO option only) for California employees only
- Dental provided through UHC
- Nationwide Vision provided by UHC
- Flexible Spending Account for Health & Dependent Care
- Pre-Tax Account for Commuter Benefit/Parking & Transit (location-specific)
- Continuing Education and Professional Development via various integrated platforms, e.g. Udemy and Coursera
- Corporate Wellness Program provided by Goomi Group
- Employee Assistance Program
- Wellness Days
- 401k Plan
- Basic and Supplemental Life Insurance
- Short Term & Long Term Disability
- Critical Illness, Critical Hospital, and Voluntary Accident Insurance
- Tuition Reimbursement (available 6 months after start date, capped)
- Paid Time Off (accrued and prorated, maximum of 120 hours annually)
- Paid Holidays
- Any other statutory leaves, paid time, or other ancillary benefits required under state and federal law

United States
$73.0K - $100.8K / year

Full Time | Remote | Team 51-200 | H1B Sponsor

  • Identify, vet, and manage Tier 1/2 OEMs and regional distributors for high-density servers, network gear, and cabling.
  • Drive end-to-end contract lifecycles, including Master Purchase Agreements (MPAs), Service Level Agreements (SLAs), and complex warranty/support negotiations.
  • Monitor global semiconductor trends to mitigate long-lead-time risks. Support Solution Engineering by ensuring "just-in-time" inventory of mission-critical hardware (GPUs, NICs, Switches).
  • Partner with Systems Engineering and Architecture teams to translate technical specs into scalable, multi-year procurement roadmaps.
  • Oversee the procurement and delivery of integrated components, including NVIDIA Grace CPUs, NVLink, InfiniBand, and ConnectX-8 technologies.
  • Architect procurement workflows that satisfy stringent security, data residency, and national compliance requirements for Sovereign AI cloud deployments.

Washington
$130.8K - $163.5K / year