Software Engineer, GPU Performance Tools

Full-stack Engineer | Software Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1993 | H1B Sponsor

Location

California | Oregon

Posted

2 days ago

Salary

$124K - $195.5K / year

Seniority

Senior

Postgraduate Degree | 3 yrs exp | English | Python

Job Description

NVIDIA

  • Build innovative features for NVIDIA's GPU profiling tools from inception to execution
  • Incorporate new hardware profiling capabilities into tools and workflows
  • Work independently based on high-level requirements, filling in build details and making sound engineering decisions
  • Collaborate with architects, performance engineers, and other software teams to understand requirements and deliver solutions
  • Improve and maintain a large, evolving codebase with high standards for quality and reliability

Job Requirements

  • B.S., M.S., or PhD in Computer Science, Computer Engineering, or a related field (or equivalent experience)
  • 3 or more years of experience writing production software in Python and C++
  • Proven foundation in computer architecture and performance analysis
  • Experience in parallel programming or accelerated computing
  • Track record of building tools or infrastructure for other engineers, with a strong sense of what makes a great developer experience
  • Up to date with the latest software engineering practices including AI-enabled development tooling
  • Contributions to open-source performance analysis tooling preferred
  • Experience as a user or creator of CPU or GPU profiling tools preferred
  • Experience in GPU computing or accelerated computing platforms preferred
  • Background in building software tools on top of hardware capabilities preferred
  • Familiarity with AI workloads and their performance characteristics preferred

Benefits

  • Equity
  • Benefits
