Nebius Group

Technical Director, Media & Entertainment AI Infrastructure

LLM Engineer | Machine Learning Engineer | Full Time | Remote | Lead | Team 1,001-5,000 | H1B No Sponsor

Location

California + 1 more

Posted

124 days ago

Salary

$295K - $365K / year

Seniority

Lead

12 yrs exp | English | Cloud | Distributed Systems | Kubernetes | Go

Job Description

• Own the Technical Blueprint: Personally architect the infrastructure solutions for our most strategic M&E partnerships, studio-scale content production pipelines, agency data consolidation plays, and generative AI model deployments.
• The Physics to P&L Narrative: Fluently demonstrate to executive stakeholders how infrastructure decisions (data lake locality, storage tiering, inference optimization) directly impact their business model and operability.
• Deconstruct the Bottleneck: Go beyond the stated problem to find the technical truth. Translate vague business goals (e.g., “We need lower rendering costs”) into precise engineering requirements (e.g., “We need to optimize the inference batch size on L40s to reduce cost-per-token by 30%”).
• Map the Transition: Identify exactly where a customer sits on the curve from legacy service bureau to AI-native tech platform and prescribe the specific infrastructure intervention needed to move them forward.
• Build and Validate the Integration Layer: Identify, engage, and technically validate relationships with the most critical ISVs in the media and entertainment landscape, from rendering and VFX toolchains to generative AI platforms.
• Define the Standard: It is not enough to support these tools. You will define the reference architectures for how they run best on Nebius infrastructure, and work directly with ISV engineering teams to build and publish those standards.
• Decide What’s Worth Doing: In partnership with the GM, evaluate ISV and partner opportunities on their technical merit and strategic leverage, and be equally rigorous about what not to pursue.
• Shape the M&E Roadmap: Use forensic evidence from the field to prioritize and justify the M&E vertical roadmap. You will work directly with Nebius’s global Head of Product and Head of Engineering to translate partner and customer needs into product direction.
• Lead the M&E Product Summit: Chair a quarterly summit with Core Engineering leadership, using field evidence to drive roadmap decisions and maintain vertical momentum.

Job Requirements

12+ years of experience in cloud infrastructure, platform engineering, distributed systems, or a closely related technical domain.
Executive Presence: Capable of commanding a room of engineers and presenting a layered technical roadmap to a C-Suite. You have operated at the top-to-top level; your counterparts are CTOs and VPs of Engineering.
Builder Mentality: This is an IC role. You build things. You are not here to manage a team or delegate to an implementation function; you are here to architect and ship solutions alongside partners, and to build the assets (reference architectures, integration playbooks, technical frameworks) that make Nebius’s M&E infrastructure strategy defensible and scalable.
Product-Minded: Experience defining a platform strategy, not just executing tickets. You are comfortable telling a customer “No” when a request creates technical debt, and proposing a better alternative.
Ambiguity Tolerance: You thrive in environments where requirements are evolving. You do not wait for a roadmap; you build it.
Forensic Mindset: You are not satisfied with surface-level answers. You dig into the kernel, the logs, and the P&L to find the truth.
Mastery of the Stack: Expert-level, production-grade knowledge of GPU architectures (H100, L40s), Kubernetes orchestration including Soperator, high-performance and parallel file systems (e.g., Lustre, WEKA), data lake architecture, and networking constraints (InfiniBand/Ethernet).
Inference Optimization: You understand the nuances of model serving (batch sizes, quantization, KV caching, latency tradeoffs) and can architect solutions for both massive throughput and real-time (sub-50ms) demands.

Benefits

Health Insurance: 100% company-paid medical, dental, and vision coverage for employees and families.
401(k) Plan: Up to 4% company match with immediate vesting.
Parental Leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers.
Remote Work Reimbursement: Up to $85/month for mobile and internet.
Disability & Life Insurance: Company-paid short-term, long-term, and life insurance coverage.

More LLM Engineer Jobs

AI/LLM Engineer

Ostro

Knowledge is the best medicine.

LLM Engineer | 124 days ago

Other | Remote | Team 51-200 | H1B No Sponsor

• Develop performant, scalable, and high-quality APIs and backend processes for Ostro's SaaS platform, with a strong emphasis on LLM integration.
• Collaborate with cross-functional teams to implement new features and refine existing ones, particularly those involving AI/LLM capabilities.
• Provide feedback on roadmap and features for your team, contributing to the strategic direction of Ostro’s AI/LLM initiatives.
• Ensure code quality and compliance through thorough reviews, unit testing, and adherence to best practices for LLM-powered applications and Ostro engineering.
• Optimize application performance and scalability to meet user demands, especially for LLM inference and data processing.
• Stay informed about emerging AI/LLM technologies, prompt engineering techniques, and industry trends.
• Troubleshoot and resolve production issues, ensuring performance, reliability, and scalability of LLM-driven features.

Django Python

United States

$159.3K - $202.4K / year

Job Closed

AI/LLM Engineering – Working Student

Cognitx

Empowering your vision with transformative AI solutions.

LLM Engineer | 131 days ago

Part Time | Remote | Team 51-200 | H1B No Sponsor

• Build and improve agentic workflows (tool/function calling, planning, self-checks) for analytics, summaries, visualizations, and task automation.
• Implement adapters and tools to connect LLMs with internal and external services.
• Contribute to our FastAPI backend with clean interfaces, Pydantic validation, and tests.
• Develop evaluation metrics to measure accuracy, latency, and cost.
• Optimize prompts, retrieval/contexting, and execution strategies for privacy, reliability, and performance.
• Ship services in containers (Docker) and collaborate on deployments (Kubernetes), CI, and observability.
• Document technical decisions and share learnings with the team.

Docker Kubernetes Microservices PostgreSQL Python React Redis TypeScript

Germany

Senior DGX Cloud AI Infrastructure Software Engineer

NVIDIA

LLM Engineer | 133 days ago

Other | Remote | Team 10,001+ | Since 1993 | H1B Sponsor

• Develop infrastructure software and tools for large-scale pre-training, post-training, and inference.
• Develop and optimize tools and libraries to improve infrastructure efficiency and resiliency.
• Co-design and implement APIs for integration with NVIDIA's resiliency stacks.
• Enhance infrastructure and products underpinning NVIDIA's AI platforms.
• Define meaningful and actionable reliability metrics to track and improve system and service reliability.
• Skilled in problem-solving, root cause analysis, and optimization.
• Root-cause, analyze, and triage failures from the application level to the hardware level.

Distributed Systems Prometheus Python

California + 3 more

$184K - $287.5K / year

Conversational AI Engineer

Zillow

Zillow is a leading online real estate marketplace covering the whole spectrum of purchasing, owning, and selling a home. In support of flexible work options an

LLM Engineer | 137 days ago

Other | Remote

• Design, build, and deploy intelligent chat agents and automated workflows to resolve common customer and frontline issues.
• Integrate core systems (such as Salesforce) with AI tools to create a unified, compliant user experience.
• Develop and optimize prompts to ensure the AI delivers accurate, relevant answers and help content.
• Evaluate, onboard, and manage AI/ML tools and emerging technologies to enhance system performance.
• Implement safeguards and monitoring to maintain accuracy, prevent misinformation, and build user trust.
• Collaborate with Product, Engineering, QA, Content, and Analytics teams to embed conversational AI into business strategy and track performance.
• Apply machine learning and large language models to improve natural language understanding and generation in our chat agents.

California + 15 more

$136.3K - $217.7K / year

Job Closed
