Job Closed
This listing is no longer active.
Deep Learning Software Engineer, LLM Performance
Location
California
Posted
66 days ago
Salary
$124K - $195.5K / year
Seniority
Mid Level
Job Description
NVIDIA
- Performance optimization, analysis, and tuning of LLM, VLM and GenAI models for DL inference, serving and deployment in NVIDIA/OSS LLM frameworks
- Scale performance of LLM models across different architectures and types of NVIDIA accelerators
- Scale performance for max throughput, minimum latency and throughput under latency constraints
- Contribute features and code to NVIDIA/OSS LLM frameworks, inference benchmarking frameworks, TensorRT, and Triton
- Work with cross-collaborative teams across generative AI, automotive, image understanding, and speech understanding to develop innovative solutions
Job Requirements
- Bachelor's, Master's, PhD, or equivalent experience in relevant fields (Computer Engineering, Computer Science, EECS, AI)
- 2+ years of relevant software development experience
- Excellent Python/C/C++ programming, software design and software engineering skills
- Experience with a DL framework like PyTorch, JAX, TensorFlow
- Prior experience with an LLM framework or a DL compiler in inference, deployment, algorithms, or implementation
- Prior experience with performance modeling, profiling, debugging, and code optimization of a DL/HPC/high-performance application
- Architectural knowledge of CPU and GPU
- GPU programming experience (CUDA or OpenCL)
Benefits
- Equity
- Benefits
More Full-stack Engineer Jobs
Senior Software Developer, Systems Software
DMI (Digital Management, LLC)
At the Intersection of Public and Private Sectors
- Designs, develops, and delivers enterprise-grade software solutions supporting TSA’s mission-critical applications
- Works within Agile and DevSecOps delivery frameworks to develop secure, scalable, and maintainable code
- Collaborates closely with systems engineers, cloud architects, and the O&M contractor
- Ensures software deliverables are transition-ready and fully documented for sustainment
- Conducts requirements analysis, software design, coding, unit and integration testing, and code review
- Develops technical documentation including API specifications, developer guides, and runbooks
- Supports software migrations and modernization efforts
- Participates in operational testing periods post-transition
- Available for post-deployment troubleshooting

- Directs day-to-day work prioritization
- Plans, organizes, and coordinates applications development
- Leads projects regarding application analysis, coding, testing and enhancement
- Provides guidance and mentorship to all engineers

- Develop and maintain responsive, user-friendly web applications using HTML, CSS, JavaScript, React, Next.js, and TypeScript.
- Implement state management with Redux and style components using CSS frameworks like Bootstrap and Tailwind.
- Build scalable microservices with Node.js, ensuring high availability and performance.
- Develop and integrate RESTful and GraphQL APIs for efficient and secure communication between services.
- Write and maintain unit tests for both backend services and frontend applications to ensure code quality.
- Utilize AWS services like Lambda to build, deploy, and manage serverless microservices, optimizing for performance and cost.
- Implement search capabilities with OpenSearch, including setting up indexes, managing queries, and optimizing performance.
- Create and execute queries with DynamoDB and relational databases.
- Ensure software meets performance and security requirements.
- Analyze logs, debug applications, and implement both immediate and long-term improvements.
- Review team members' code for adherence to coding standards, structure, and best practices.
- Assist in troubleshooting and resolving technical issues during development.
- Provide technical expertise, guidance, and mentorship to team members, helping them solve complex problems.
- Create and maintain comprehensive technical documentation.
- Collaborate closely with cross-functional teams, including DevOps, QA, and product management.
Staff Software Engineer
Everbridge
After 9/11, Everbridge was founded to improve the way people communicate and find one another in critical situations. Through its Software-as-a-Service-based communications platfor
- Designing, developing, and supporting software solutions for the company’s critical event management platform and various web and mobile applications built on top of the core platform.
- Collaborating directly with product management, QA, technical operations, and cross functional team leads to ensure the timely completion of projects.
- Creating and maintaining robust, high-volume, and scalable applications to meet performance and reliability standards.
- Designing and implementing microservices architectures that support modular, maintainable, and extensible systems.
- Defining and implementing automated tests to maintain software quality and accelerate development cycles.
- Building applications and infrastructure that run in AWS, following best practices for cloud-native development.
- Participating in code reviews to ensure code quality, maintainability, and alignment with team standards.
- Contributing as a scrum team member and technical leader, ensuring timely project delivery with high-quality output.
- Designing, implementing, and optimizing data pipelines and analytics solutions using tools like Snowflake and Looker to support data-driven decision-making.
- Monitoring and managing cloud infrastructure costs proactively, driving efficiency and implementing strategies for cost optimization.
- Developing and maintaining scalable, event-driven architectures using Kafka or similar queue-based messaging systems to ensure reliable and efficient data processing.




