NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
Senior Manager, Engineering – AI Developer Tools
Location
California
Posted
5 days ago
Salary
$272K - $431.3K / year
Seniority
Senior
Job Description
NVIDIA
- Lead, mentor, and develop a team of 4-8 engineers
- Work closely with product and engineering partners to define roadmaps
- Collaborate with cross-organization teams to manage dependencies
- Improve engineering efficiency through internal tools and frameworks
- Ensure alignment with security and compliance standards
Job Requirements
- BS/MS in Computer Science or related technical field, or equivalent experience
- 8+ years of hands-on experience with AI or developer tools
- 4+ years of engineering leadership or management experience
- Experience in Agile software development
- Strong software engineering skills in Go, Python, and JavaScript
Benefits
- Health insurance
- 401(k) matching
- Flexible work arrangements
- Professional development
- Stock options
- Wellness programs
More AI Engineer Jobs
AI Developer Senior
Publicis Groupe Holdings B.V
Hi there! We’re Razorfish. We’ve been leading the marketing industry with our digital expertise since the start of the internet. But in 2020, we did a full reboot. What’s different? It all starts with people. Weird, wonderful, complex people with diverse backgrounds in strategy, creative and technology. But no matter how different we are, we all have one thing in common. We believe our differences are our strength. So we push for inclusion, challenge convention and bring in new perspectives, to inspire new ideas. Because when we connect by understanding what makes people different, we can create unforgettable experiences that enrich lives. Join us at razorfish.com.
Role Description
We are looking for an AI Developer Senior to play a key role in designing, shaping, and evolving the next generation of agent-based architectures. You will work closely with cross-functional teams to build innovative solutions that transform how brands operate, make decisions, and create value.
- Lead the design and development of the backend architecture powering conversational agents and LLM-based systems, ensuring scalability, robustness, and long-term evolution.
- Work on:
  - Orchestrating intelligent agents and multi-agent systems
  - Integrating foundation models (e.g. Google Gemini, Azure OpenAI)
  - Designing scalable APIs, including real-time streaming capabilities
  - Building cloud-ready infrastructure for production environments
- Your work will directly impact tools used by strategy, media, creative, and data teams.
- AI Agent Development:
  - Design and build AI agents using frameworks such as Google ADK, LangChain, or similar
  - Implement complex workflows with tool calling (search, retrieval, APIs, databases)
  - Optimize prompts and evaluate response quality
- Backend & APIs:
  - Develop asynchronous APIs using FastAPI
  - Design modular and scalable architectures
  - Implement real-time streaming endpoints (SSE, WebSockets)
- LLM Integration:
  - Integrate with APIs like Google Gemini or Azure OpenAI
  - Manage context, grounding, citations, and metadata
  - Optimize token usage and control costs
- Infrastructure & Cloud:
  - Containerize services using Docker
  - Deploy and manage applications in cloud environments (GCP, Azure, etc.)
  - Handle secrets securely (Key Vault, Secret Manager, etc.)
  - Implement advanced observability (logging, metrics, alerting, tracing)
- Data & Persistence:
  - Work with PostgreSQL and SQL-based systems
  - Use ORMs such as SQLAlchemy
  - Manage session history, traceability, and data flows
- Engineering Quality:
  - Write clean, maintainable, and well-documented code
  - Apply testing practices (unit, integration, E2E when needed)
  - Follow SOLID principles, Clean Architecture, and DDD when relevant
  - Ensure proper versioning (Git), branching strategies, and code reviews
  - Build resilient, fault-tolerant systems for production
Qualifications
- Strong product mindset, focused on business impact and end-user value
- Experience working in Agile environments with cross-functional teams
- Solid backend experience in Python, building scalable services
- Strong experience with asynchronous development (FastAPI or similar)
- Hands-on experience integrating LLMs into production systems
- Deep understanding of clean code, SOLID principles, and software design
- Experience with modern architectures (hexagonal, clean architecture, etc.)
- Strong knowledge of SQL databases and data modeling
- Ability to make technical decisions with a long-term architectural vision
Requirements
- Experience with agent frameworks (Google ADK, LangChain, LlamaIndex)
- Experience working with Google Cloud
- CI/CD experience (GitHub Actions, Azure DevOps)
- Real-time streaming implementations
- Knowledge of RAG (Retrieval-Augmented Generation)
- Experience in marketing, media, or data environments
Benefits
- Flexible Benefits (Coverflex): Enjoy more than just work with flexible compensation including meal vouchers, health insurance, transportation, and more.
- Growth Opportunities: You can advance in your career not only through the experience of working with major clients but also by accessing local and global training programs specialized according to your role, covering both technical and soft skills.
- Free Online Training: You can access unlimited courses from the LinkedIn Learning and Udemy catalogs through our artificial intelligence platform "Marcel".
- Partner Certifications: You'll have the opportunity to obtain certifications from industry giants such as Meta, Google, or Amazon.
- Work from anywhere: Telecommute up to 6 weeks from over 100 countries with our #WorkYourWorld program.
- Attractive holidays package, including your birthday and Advertising Day off plus some additional days off. Rest is also important!
- Well-being: We prioritize the well-being of our staff and organize various health initiatives such as daily meditation or yoga, among others.
- Serve as the technical reference for AI initiatives, guiding architectural decisions, implementation and the evolution of solutions.
- Mentor developers, promoting best practices, conducting solution reviews and supporting the team's technical growth.
- Design and evolve architectures for AI-based solutions, including LLMs, agents, RAG, integrations, asynchronous processing and caching.
- AI in production: technically lead the delivery, operation, monitoring, troubleshooting and continuous improvement of AI solutions in production environments.
- Build and enhance pipelines, automations and engineering practices for development, evaluation, deployment and monitoring of AI applications.
- Define technical standards related to code quality, security, observability, documentation and maintainability of solutions.
- Evaluate AI tools, frameworks and approaches to recommend the best technical paths for the company's context.
- Collaborate with Product, Data, Security and other teams to ensure technical feasibility and business impact of solutions.
- Contribute hands-on to solution development, standards definition and problem resolution.
- Run and optimize our self-hosted inference stack
  - Run the inference serving layer on our own GPU hardware: choose and tune the serving stack (vLLM, SGLang, TensorRT-LLM) for high throughput and low latency.
  - Optimize aggressively: tensor parallelism, quantization (FP8, AWQ, GPTQ), KV-cache and prefix caching, continuous batching, speculative decoding, concurrency tuning.
  - Serve multiple models and features off shared hardware: multi-LoRA, routing, and request scheduling that balances internal workloads against latency-sensitive product traffic.
- Keep our AI fast, efficient, and observable
  - Make our AI workloads efficient: improve latency, throughput, and GPU utilization so we get the most out of what we run.
  - Build the visibility: instrument performance and usage across our AI surfaces so there's clear data on how everything is running.
  - Surface the technical tradeoffs (performance, latency, efficiency) so the people making the calls have what they need to make them.
- Build AI features and proactive agents
  - Ship the in-app agent layer that helps families coordinate: proactive nudges, smart suggestions, agents that summarize, draft, schedule, and act for busy parents.
  - Build the substrate underneath: tools, memory, orchestration, guardrails, and evaluation harnesses, integrated cleanly with production APIs alongside our architecture team.
  - Work in nimble pairs with feature owners, standing up whatever's needed to test an idea, including a vibe-coded UI when that's the fastest path to a real customer. Ship rough, learn fast, harden what works.


