Job Closed
This listing is no longer active.
Senior Software Engineer – Deep Learning Compiler Verification, Infrastructure
Location
California | Oregon | Texas | Washington
Posted
101 days ago
Salary
$140K - $224.3K / year
Seniority
Senior
Job Description
NVIDIA
• Drive CI and infrastructure capabilities that make deep learning compiler development fast, reliable, and scalable.
• This includes improving signal-to-noise (flake reduction, reproducibility, and richer diagnostics), accelerating iteration cycles, scaling capacity and coverage across models/hardware/software configurations, and building strong observability (metrics, logging, tracing, dashboards) so failures are easy to understand and fix.
• Explore practical uses of AI to enhance CI workflows (such as smarter test selection, automated triage/summarization, and faster issue isolation), ultimately increasing the quality and speed of deep learning compiler development, testing, and release.
Job Requirements
- BS, MS, or PhD (or equivalent experience) in Computer Science, Computer/Electrical Engineering, Mathematics, or related field
- 3+ years of professional experience designing and scaling CI/CD, build/release, or developer productivity infrastructure for DL/GPU software environments
- Strong software engineering skills (Python required) with ability to architect, implement, and debug complex systems end-to-end
- Hands-on experience building CI/MLOps platform capabilities—pipeline orchestration, artifact/package management, and production-grade observability (logs/metrics/dashboards)—with strong reliability and maintainability
- Experience with deep learning frameworks/runtime stacks (e.g., PyTorch, JAX, vLLM, SGLang, TensorRT, NeMo) and running real workloads in production-like environments
- Working knowledge of Linux-based development and debugging across complex software/hardware stacks (drivers, CUDA libraries, containers, cluster schedulers, etc.)
Benefits
- equity
- benefits




