Job Closed

This listing is no longer active.

Red Hat

The leading provider of enterprise open source solutions.

Forward Deployed Engineer, AI Inference, vLLM, Kubernetes

Artificial Intelligence | Other | Remote | Lead | Team 10,001+ | Since 1993 | H1B Sponsor

Location

California | New York | Massachusetts | Washington

Posted

113 days ago

Salary

$184.9K - $305.1K / year

Seniority

Lead

Job Description

  • Orchestrate Distributed Inference: Deploy and configure LLM-D and vLLM on Kubernetes clusters.
  • Optimize for Production: Go beyond standard deployments by running performance benchmarks, tuning vLLM parameters, and configuring intelligent inference routing policies to meet SLOs for latency and throughput.
  • Code Side-by-Side: Work directly with customer engineers to write production-quality code (Python/Go/YAML) that integrates our inference engine into their existing Kubernetes ecosystem.
  • Solve the "Unsolvable": Debug complex interaction effects between specific model architectures (e.g., MoE, large context windows), hardware accelerators (NVIDIA GPUs, AMD GPUs, TPUs), and Kubernetes networking (Envoy/Istio).
  • Feedback Loop: Act as the "Customer Zero" for our core engineering teams. You will channel field learnings back to product development, influencing the roadmap for LLM-D and vLLM features.
  • Travel: Visit customers only as needed to present, demo, or help execute proofs of concept.

Job Requirements

  • 8+ Years of Engineering Experience: You have a track record of eight or more years in backend systems, SRE, or infrastructure engineering.
  • Customer Fluency: You speak both "Systems Engineering" and "Business Value".
  • Bias for Action: You prefer rapid prototyping and iteration over theoretical perfection. You are comfortable operating in ambiguity and taking ownership of the outcome.
  • Deep Kubernetes Expertise: You are fluent in K8s primitives, from defining custom resources (CRDs, Operators, Controllers) to configuring modern ingress via the Gateway API.
  • AI Inference Proficiency: You understand how an LLM forward pass works. You know what KV caching is, why prefill/decode disaggregation matters, why context length impacts performance, and how continuous batching works in vLLM.
  • Systems Programming: Proficiency in Python (for model interfaces) and Go (for Kubernetes controllers/scheduler logic).
  • Infrastructure as Code: Experience with Helm, Terraform, or similar tools for reproducible deployments.
  • Cloud & GPU Hardware Fluency: You are comfortable spinning up clusters and deploying LLMs on bare-metal and hyperscaler Kubernetes clusters.

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
