Nscale

Nscale is the Hyperscaler engineered for AI.

Principal Network Architect - AI Infrastructure

Solutions Engineer | Full Time | Remote | Lead | Team 201-500 | Since 2024 | H1B No Sponsor

Location

United States

Posted

5 days ago

Salary

0

Seniority

Lead

Job Description

Role Description

Nscale is seeking a Network Architect Engineer to lead the evolution, reliability, and operational excellence of our global AI networking infrastructure. This role sits at the core of Nscale’s platform, where network performance directly impacts AI training outcomes. You will act as a technical authority across large-scale RDMA / InfiniBand / RoCE fabrics, driving automation, availability improvements, and system-level design across a globally distributed GPU cloud. You will combine deep protocol-level networking expertise with strong software and automation skills to operate and scale one of the most demanding AI networking environments in the industry.

What You’ll Do

- Technical Leadership & Strategy
  - Own the technical direction and operational lifecycle management of Nscale’s high-performance RDMA network fabrics.
  - Define long-term architecture, reliability strategy, and operational standards for AI interconnect networks.
  - Lead availability and performance improvement initiatives across globally distributed GPU clusters.
  - Act as a technical authority (SME) across networking, influencing platform-wide decisions.
- Network Engineering & Operations
  - Support the design, build, and evolution of large-scale InfiniBand and RoCE fabrics.
  - Drive deep debugging and resolution of complex cross-layer issues (hardware, firmware, kernel, distributed workloads).
  - Lead incident response and postmortems, ensuring systemic fixes and long-term improvements.
  - Define and enforce standards across:
    - Congestion control and traffic engineering.
    - Routing (BGP, ECMP, fabric-level routing strategies).
    - Firmware lifecycle and change management.
    - Network observability and telemetry.
- Automation & Systems Development
  - Develop and scale automation frameworks for network provisioning, validation, and operations.
  - Build tooling to support high-reliability, low-touch network operations at scale.
  - Improve operational efficiency across hundreds of thousands of endpoints and high-throughput links.
- Cross-Functional Leadership
  - Lead complex technical initiatives across Network, SRE, Compute, and Platform teams.
  - Serve as technical lead on critical programs, coordinating engineers and stakeholders.
  - Influence product and infrastructure roadmaps based on operational insights and customer needs.
  - Mentor senior engineers and raise the bar for technical rigor and execution.

Qualifications

- 10+ years of experience in network engineering in hyperscale, AI, or HPC environments.
- Deep expertise in RDMA, InfiniBand, and/or large-scale RoCE fabrics.
- Strong understanding of:
  - RDMA internals and performance tuning.
  - Congestion control and fabric failure modes.
  - Distributed system communication patterns.

Requirements

- Expert-level knowledge of data center networking protocols (BGP, OSPF, ECMP).
- Proven ability to debug multi-layer issues across network, system, and application layers.
- Strong programming/scripting skills for automation (Python, Go, etc.).
- Experience designing high-scale, highly available network systems.

Leadership & Impact

- Demonstrated ability to lead complex technical programs without direct authority.
- Experience acting as a senior escalation point for critical production issues.
- Strong ability to drive cross-team alignment and execution.
- Systems-level thinking balancing performance, reliability, scalability, and cost.

Nice to Have

- Experience with NVIDIA / Mellanox networking platforms.
- Familiarity with distributed AI training frameworks and GPU communication patterns.
- Experience building network observability systems at scale.
- Background influencing infrastructure strategy in high-growth environments.

Equal Opportunities Statement

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds. If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.

More Solutions Engineer Jobs

Senior Mission Integration Engineer

Astrolab

We build rovers for the Moon & Mars.

Full Time | Remote | Team 11-50 | H1B No Sponsor

• The discovery, development, maturation, and eventual integration of rover-applicable technologies with government and commercial partners.
• Cross-discipline coordination and integration of design efforts.
• Supporting business development in proposal responses, customer development, and market exploration, leveraging a comprehensive overview of the rover's technical capabilities.
• Supporting the rover payloads customer team from a technical integration standpoint.

California

Senior Solutions Architect

GitLab

Build software faster. The One DevOps Platform enables your entire org to collaborate around your code. We're hiring.

Full Time | Remote | Team 1,001-5,000 | Since 2014 | H1B No Sponsor

• Guide technical discovery, product demonstrations, and validation activities, including proofs of value, to confirm technical fit, accelerate evaluation milestones, and improve technical win rates for GitLab’s AI-powered DevSecOps platform.
• Own the technical evaluation process for complex opportunities, including solution design, workshop facilitation, proof of concept or proof of value execution, and technical materials for tenders, audits, and assessments, with accountability for clear success criteria and documented evaluation outcomes.
• Develop end-to-end technical strategies for assigned accounts that expand platform adoption, reduce delivery risk, and enable multi-team and multi-year transformation milestones.
• Collaborate with Account Executives and regional sales teams in the East territory to shape account and territory plans, support qualification, and align technical strategy to customer priorities, opportunity progression, and business outcomes.
• Advise technical practitioners and business leaders on modern software development, continuous integration, continuous deployment, security, cloud, and platform adoption practices to improve delivery efficiency, strengthen security outcomes, and increase adoption of GitLab workflows.
• Drive competitive analysis and positioning for complex opportunities by using market, industry, and customer context to clarify GitLab’s differentiated approach and improve technical win readiness.
• Represent the voice of the customer by sharing product feedback, use cases, integration needs, and field insights with Product Management, Engineering, Sales, and Marketing to improve roadmap decisions, integration readiness, and field effectiveness.
• Mentor other Solutions Architects, contribute to team learning initiatives, improve technical collateral and documentation, and share subject matter expertise through GitLab’s common collaboration channels to increase team readiness, reuse of technical assets, and consistency across engagements.

New York
$137.4K - $231.2K / year

Partner Solution Architect

EDB

The leading Postgres data and AI company

Full Time | Remote | Team 501-1,000 | Since 2004 | H1B No Sponsor

• Act as the technical voice of the company for our partner network
• Design and deliver high-impact technical training programs
• Provide architectural guidance and hands-on support during complex POCs
• Serve as the trusted liaison between channel partners and internal Product Management

United States

Senior Solutions Engineer

Worth AI

AI Data-Driven Credit Score For Every Business 💸✨

Full Time | Remote | Team 11-50 | H1B No Sponsor

• Partner with the sales team as a technical expert during discovery, scoping, and solution design while owning the technical narrative from first call through close.
• Lead and deliver tailored product demonstrations and proof-of-concept engagements that map Worth's capabilities directly to a prospect's compliance, onboarding, and underwriting workflows.
• Develop and execute detailed implementation plans, timelines, and success criteria for new client deployments, ensuring smooth handoffs and a fast time-to-value.
• Serve as the primary technical point of contact during onboarding, coordinating across internal engineering, product, and data teams to resolve integration questions and unblock client teams.
• Design and deliver customer training programs and technical documentation to drive platform adoption and self-sufficiency.
• Capture and synthesize client feedback post-implementation, partnering with Product to translate real-world use cases into roadmap input.
• Collaborate with Customer Success Managers to identify expansion opportunities, close training gaps, and build enablement materials that support long-term account health.
• Stay current on FinTech, RegTech, and AI trends, particularly in KYB/KYC, AML, fraud detection, and SMB lending, and bring that context into client conversations and internal strategy discussions.

Florida