Job Closed

This listing is no longer active.

Senior CUDA Driver, DevOps Engineer

DevOps Engineer | Full Time | Remote | Senior | Team 10,001+ | Since 1993 | H1B Sponsor

Location

India

Posted

167 days ago

Salary

Not disclosed

Seniority

Senior

Bachelor Degree | 8 yrs exp | English | Docker | Kubernetes | Linux

Job Description

NVIDIA

  • Decomposing and modularizing build processes for reusability across multiple projects
  • Debugging GitHub Actions/GitLab pipelines to ensure timely and efficient CI execution
  • Working on scripting and infrastructure to handle dependencies across various environments and build systems
  • Bringing up builds and CI across platforms (x64/arm64) and OSes (Linux/Windows/Mac) and other unreleased hardware and software
  • Working with engineering leadership to identify the support matrix and define the scope of the build matrix
  • Crafting and updating documentation and coordinating with partners to scope and take on multi-functional projects
  • Automating scheduled work for all of the above

Job Requirements

  • Bachelor’s Degree in Systems/Software/Computer Engineering, CS or equivalent experience
  • 8+ years of relevant industry experience or equivalent academic experience after BS
  • Experience working across multiple highly-coupled projects (in Git or another VCS)
  • Experience collaborating with cloud providers, Kubernetes, GitHub Actions, and other systems
  • Familiarity with automating container builds, updates, and debugging multi-container workflows
  • Background with CI/CD systems including GitHub and GitLab
  • Understanding of testing principles and how to quantify and improve coverage and developer velocity
  • Knowledge of release management practices
  • Strong analytical, debugging, and problem-solving skills
  • Familiarity with containerization technologies (e.g. Docker)

Benefits

  • Competitive salaries
  • Comprehensive benefits package

More DevOps Engineer Jobs

Site Reliability Engineer II

Agile Lab

Harvest the power of your data

DevOps Engineer | 168 days ago
Full Time | Remote | Team 51-200 | Since 2014 | H1B No Sponsor

  • Ensure high reliability of microservices running in OpenShift environments
  • Lead and coordinate a technical team of 3–4 engineers for operational excellence
  • Manage incident resolution and ticketing workflows via ServiceNow
  • Collaborate with development teams to drive performance optimization and tuning
  • Design, configure and maintain monitoring dashboards (Grafana, Prometheus, etc.)
  • Coordinate with Service Control Room to maintain effective alerting and response
  • Oversee release processes of new features, hotfixes, and updates in production

Italy
€38.5K - €48.5K / year

Head of DevOps, Cloud & Infrastructure

EnterpriseAlumni

Corporate Alumni Engagement & Management Platform For The Enterprise

DevOps Engineer | 168 days ago
Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Architect, build, and maintain scalable, secure, multi-regional cloud infrastructure on AWS
  • Own our Infrastructure as Code practices using Terraform, ensuring reproducibility and auditability
  • Design and optimize CI/CD pipelines across Jenkins and CircleCI, including iOS and Android build systems
  • Manage container orchestration via EC2/ECS/ECR and Kubernetes as well as ingress/routing through Traefik
  • Lead observability strategy using Grafana and Prometheus, ensuring comprehensive monitoring, alerting, and incident response capabilities
  • Drive high availability and disaster recovery planning across regions
  • Ensure infrastructure meets SOC 2, ISO 27001, and Cyber Essentials+ requirements
  • Implement and maintain robust security practices, including encryption at rest, in transit, and in use
  • Stay current on evolving compliance requirements for banking and professional services clients
  • Lead security audits and remediation efforts
  • Continuously monitor and optimize cloud spend, staying ahead of AWS pricing changes and leveraging reserved instances, savings plans, and right-sizing strategies
  • Establish cost visibility and accountability across teams
  • Present regular cost analyses and recommendations to leadership
  • Build, mentor, and lead the DevOps and infrastructure team
  • Set clear goals, provide regular feedback, and support career development
  • Foster a culture of ownership, collaboration, and continuous improvement
  • Manage vendor relationships and negotiate contracts where applicable
  • Partner closely with development teams to ensure infrastructure supports application needs
  • Communicate infrastructure strategy, risks, and trade-offs clearly to non-technical stakeholders
  • Participate in incident response and establish on-call practices that balance reliability with team well-being

Brazil
Job Closed
Other | Remote | Team 201-500 | H1B No Sponsor

  • Implement and maintain observability tools and dashboards using [e.g., AWS CloudWatch, Datadog, Sentry, OpenTelemetry].
  • Go beyond basic CPU/memory metrics; instrument applications for high-value Application Performance Monitoring (APM) traces, custom business metrics, and real-user monitoring (RUM).
  • Enhance security monitoring in our observability stack. Implement automated alerts for anomalous behavior, access pattern violations, and potential security threats.
  • Implement logging and retention configurations to meet defined data retention policies and relevant standards (e.g., GDPR, CCPA, SOC2) and ensure PII is appropriately redacted or handled.
  • Assist with cloud cost visibility and optimization.
  • Analyze infrastructure usage patterns to identify waste, implement aggressive tagging strategies, and recommend rightsizing adjustments to reduce spend.
  • Manage Reserved Instances, Savings Plans, and Spot Instance usage to maximize value.
  • Manage and enhance our CI/CD pipelines (using [e.g., GitHub Actions, GitLab CI, Jenkins]). Your goal is to optimize for speed, reliability, and ease of use for developers.
  • Integrate security scanning (SAST/DAST/container scanning) and compliance checks directly into the CI pipeline.
  • Manage the tooling and processes for deploying applications to AWS EKS / Kubernetes / ECS / Serverless.
  • Facilitate modern deployment strategies, such as Blue/Green deployments, Canary releases, and feature-flag rollouts, to minimize blast radius during releases.
  • Maintain and evolve our Infrastructure as Code (IaC) base using [Terraform / OpenTofu / CloudFormation / Pulumi].

United States
Job Closed

Staff Site Reliability Engineer

Bugcrowd

See Security Differently™

DevOps Engineer | 168 days ago
Other | Remote | Team 201-500 | Since 2012 | H1B No Sponsor

  • Define and drive the technical vision for infrastructure reliability across the organization
  • Architect large-scale, fault-tolerant systems on AWS using Terraform
  • Lead cross-functional initiatives to improve system reliability, scalability, and efficiency
  • Establish standards for infrastructure-as-code, CI/CD, and deployment practices
  • Design and implement solutions for our most complex operational challenges
  • Lead incident response for critical outages and drive systemic improvements
  • Mentor senior engineers and help grow the SRE team’s capabilities
  • Evaluate and introduce new technologies that improve operational excellence
  • Influence engineering culture around reliability, observability, and operational maturity

All locations: California | New Hampshire
$151.0K - $188.8K / year
Job Closed