Senior DL Algorithms Engineer – Inference Performance
Location
California
Posted
112 days ago
Salary
$184K - $356.5K / year
Seniority
Senior
Job Description
NVIDIA
• Implement language and multimodal model inference as part of NVIDIA Inference Microservices (NIMs).
• Contribute new features, fix bugs, and deliver production code to TRT-LLM, NVIDIA’s open-source inference serving library.
• Profile and analyze bottlenecks across the full inference stack to push the boundaries of inference performance.
• Benchmark state-of-the-art offerings for inference across various DL models and perform competitive analysis of the NVIDIA SW/HW stack.
• Collaborate heavily with other SW/HW co-design teams to enable the creation of the next generation of AI-powered services.
Job Requirements
- PhD in CS, EE or CSEE or equivalent experience.
- 5+ years of experience.
- Strong background in deep learning and neural networks, in particular inference.
- Experience with performance profiling, analysis and optimization, especially for GPU-based applications.
- Proficient in C++, PyTorch or equivalent frameworks.
- Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture.
- Proven experience with processor and system-level performance optimization.
- Deep understanding of modern LLM architectures.
- Strong fundamentals in algorithms.
- GPU programming experience (CUDA or OpenCL) is a plus.
Benefits
- equity
- benefits