Principal Platform Software Engineer – RAS

Full-stack Engineer | Software Engineer | Other | Remote | Lead | Team 10,001+ | Since 1993 | H1B Sponsor

Location

California

Posted

138 days ago

Salary

$272K - $431.3K / year

Seniority

Lead

Bachelor Degree | 15 yrs exp | English | Grafana | Prometheus | Python

Job Description

NVIDIA

  • Drive next-generation fleet management solutions for scaling AI infrastructure built on NVIDIA GPUs and Grace
  • Work with customers, product management, and other architects to narrow down requirements for implementation
  • Bring clarity to the architecture for a fleet health monitoring and fault-remediation solution at scale
  • Work with customers and other architects to understand their health monitoring requirements
  • Detail the architecture and build POCs to validate it
  • Educate customers about the product architecture and gather their feedback
  • Write architecture specs and design documents, and own end-to-end delivery of the product
  • Review the code produced from the architecture specs
  • Work with the development team to ensure the product is properly tested
  • Drive product life cycles with QA teams to productize the code, acting as product owner
  • Articulate requirements in Jira and bug management tools, and work out an end-to-end execution plan
  • Contribute to all phases of product development, from product definition, architecture, and design through implementation, debugging, testing, and early customer support

Job Requirements

  • BS, MS, or PhD in EE/CS or related field of education (or equivalent experience)
  • 15+ years of hands-on coding experience
  • Strong knowledge of time series databases like InfluxDB & Prometheus
  • Strong knowledge of building and consuming REST APIs (Redfish is a big plus)
  • Strong knowledge of telemetry visualization solutions like Grafana & Influx
  • Strong knowledge of firmware architecture, including optimizing firmware for low-latency APIs
  • Strong knowledge of analyzing algorithms for time & space complexity and projecting system resource requirements
  • Proven record of solutions for scalability
  • Strong and demonstrable skill in C/C++ and Python
  • Programming and debugging experience on server platforms
  • Experience in SCM (e.g., Git, Perforce) and project management tools like Jira.

Benefits

  • Equity
  • Benefits
