Quantiphi

Pioneering AI-first solutions, solving complex business challenges through expertise, cloud, data engineering, and AI.

Architect – Platform Engineer

Platform Engineer | Full Time | Remote | Lead | Team 1,001-5,000 | Since 2013 | H1B Sponsor

Location

California

Posted

5 days ago

Seniority

Lead

Job Description

  • Design and implement scalable infrastructure for LLM and GenAI workloads across multi-GPU environments.
  • Perform GPU profiling, benchmarking, and performance optimization for distributed training workloads.
  • Manage and schedule compute-intensive jobs using Slurm-based clusters and OpenShift/Kubernetes environments.
  • Enable and optimize the NVIDIA GPU stack (CUDA, cuDNN, NCCL, Triton, RAPIDS, etc.).
  • Collaborate with cross-functional teams to deploy models in research and production environments.
  • Build and support GenAI pipelines (fine-tuning, RAG, multi-modal inferencing, LLMOps).
  • Develop reusable infrastructure templates using tools like Terraform and Helm.

Job Requirements

  • 10+ years of experience in a relevant field.
  • Strong experience with Slurm and distributed training environments.
  • Hands-on expertise with Red Hat OpenShift and/or Kubernetes.
  • Deep knowledge of the NVIDIA GPU ecosystem (CUDA, cuDNN, NCCL, Nsight, Triton/TensorRT).
  • Strong foundation in Linux systems, performance tuning, and multi-GPU optimization.
  • Experience deploying GenAI workloads (LLM fine-tuning, RAG pipelines, multi-modal systems).
  • Familiarity with Infrastructure-as-Code tools (Terraform, Ansible).
  • Experience with cloud GPU environments (GCP, Azure, AWS, OCI) and/or on-prem GPU clusters.

Benefits

  • Up-skill and discover your potential as you solve complex challenges in cutting-edge areas of technology alongside passionate, talented colleagues.
  • Work where innovation happens - work with disruptive innovators in a research-focused organization with 60+ patents filed across various disciplines.
  • Stay ahead of the curve: immerse yourself in breakthrough AI, ML, data, and cloud technologies and gain exposure working with Fortune 500 companies.
  • If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!

More Platform Engineer Jobs

Principal Platform Engineer

Feathr

Digital marketing tools for events & nonprofits

Full Time | Remote | Team 51-200 | H1B No Sponsor

  • Serve as a technical anchor on the Platform Engineering Team, driving architectural decisions, leading complex initiatives, and raising the bar for how we build and operate infrastructure.
  • Sit at the intersection of deep individual contribution and technical leadership: you're hands-on in the work while also shaping how the team approaches problems, mentoring engineers, and partnering cross-functionally to ensure our systems are reliable, secure, and built to scale.
  • Lead the design and implementation of scalable, reliable, and secure system architecture across the platform.
  • Identify and drive systemic improvements, proactively surfacing technical debt, reliability gaps, and opportunities to modernize.
  • Own and optimize cloud infrastructure, networking, and database systems for high availability and performance.
  • Implement and monitor security protocols to protect systems against vulnerabilities and maintain compliance with applicable standards.
  • Manage and continuously refine CI/CD pipelines to support reliable, efficient software deployments.
  • Collaborate with cross-functional teams to maintain consistent and reproducible development environments.

United States
$145K - $165K / year

Platform Engineer

Ometria

Headquartered in London, England, United Kingdom, Ometria is an internet company developing a retention marketing platform designed for retailers. With its platform, the company wo

Role Description

We have an interesting new opportunity for a Platform Engineer to join our team! As we are experiencing rapid growth, this role has been created to help build, scale and maintain the fully cloud-based backbone of our platform. Your role will be key in helping to scale our infrastructure, build out the underlying platform, evolve our systems and build new ones.

Your day to day will be:

  • Building and operating Kubernetes across development, test and production environments
  • Working with developers to improve the operability, availability and observability of our systems
  • Building and improving Continuous Delivery pipelines
  • Defining and evangelizing SRE and DevSecOps best practices within product teams
  • Championing application and infrastructure security as well as cost management
  • Helping us to scale our system and services to handle client accounts of 30 million+ contacts

Qualifications

  • Like the sound of how we like to work, the technologies we use and our product
  • Enjoy learning new things, and solving real world problems
  • Care about infrastructure as code and reproducibility
  • Understand the importance and value of DevOps and SRE principles
  • Enjoy working in a platform team and in cross functional product teams
  • Believe in Continuous Delivery

Requirements

  • Experience in building and deploying infrastructure in high traffic environments (not essential, but useful)
  • Experience in operating highly available cloud infrastructure (not essential, but useful)
  • Experience with or knowledge of deploying applications to and operating Kubernetes in production (not essential, but useful)
  • Enjoy measuring anything and everything, and driving insight into systems and application performance (not essential, but useful)

Benefits

The amazing people of Ometria are the core of our business. We believe in making it awesome to be here for all Ometrians and place a continued focus on making Ometria an inclusive, respectful and diverse environment.
We're an equal opportunity employer and all applicants will be considered for employment without attention to ethnicity, religion, sexual orientation, gender identity, age, family or parental status, national origin, veteran, neurodiversity status or disability status.

Portugal

Role Description

We are seeking a seasoned SAP Basis / SAP Platform Engineer responsible for the architecture, installation, configuration, performance tuning, and day-to-day operation of complex SAP landscapes that may span on-premises, hybrid, and cloud-hosted environments. This role is critical to keeping mission-critical SAP systems available, performant, secure, and compliant. The ideal candidate will bring deep expertise across SAP NetWeaver, S/4HANA, HANA database administration, system copies, upgrades, and migrations, and will partner with infrastructure, security, and SAP functional teams to deliver a stable, well-governed SAP platform.

Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.
  • Five or more years of SAP Basis / platform engineering experience in large enterprise environments.
  • Strong hands-on experience administering SAP HANA databases in production.
  • Experience operating S/4HANA, ECC, BW/4HANA, Solution Manager, and other NetWeaver systems.
  • Hands-on experience with SAP migrations to cloud platforms such as AWS, Azure, or GCP.
  • Strong experience with SAP upgrades, EHP/SP application, and system conversions.
  • Solid experience with SAP HA/DR strategies, including HANA System Replication and clustering.
  • Working knowledge of SAP transport management and Solution Manager ChaRM.
  • Scripting skills in Bash, PowerShell, Python, or Ansible.
  • Strong troubleshooting, communication, and documentation skills.

Requirements

  • Architect, install, configure, and operate SAP landscapes including S/4HANA, ECC, BW/4HANA, Solution Manager, and other NetWeaver-based systems across on-prem and cloud environments.
  • Administer SAP HANA databases, including installation, sizing, replication, backup, recovery, and performance tuning, to meet demanding availability and performance SLAs.
  • Plan and execute SAP system copies, refreshes, and post-copy automation activities to support development, testing, training, and migration scenarios.
  • Lead SAP upgrades, EHP installations, Support Package application, kernel upgrades, and S/4HANA conversions with minimal disruption to business operations.
  • Design and implement SAP migrations to cloud platforms (AWS, Azure, GCP) and SAP RISE/Grow environments, including network and identity integration.
  • Configure and operate high-availability and disaster-recovery solutions for SAP, including HANA System Replication, cluster-based failover, and validated DR runbooks.
  • Implement and maintain SAP transport management strategies, including Solution Manager ChaRM and CTS+ workflows.
  • Manage SAP user provisioning, role management, and audit support in close collaboration with the SAP Security/GRC team.
  • Monitor SAP system performance, capacity, and health using Solution Manager, Focused Run, native HANA tools, and external APM platforms.
  • Develop automation scripts and runbooks using Ansible, PowerShell, or Python to reduce operational toil and accelerate routine tasks.
  • Manage and remediate SAP security notes, kernel patches, and database patches in accordance with internal security and compliance policies.
  • Provide 24x7 on-call support for production SAP systems and lead structured incident response and post-incident reviews.
  • Document SAP architecture, operations procedures, and standard operating processes to support knowledge sharing and team scalability.
  • Mentor junior Basis engineers and contribute to the broader SAP platform roadmap.

Benefits

  • Competitive base salary commensurate with experience, plus benefits.
  • 100% remote work opportunity.
  • Long-term, multi-year engagement aligned to the Bright Vision SOW delivery roadmap.

United States
$100K - $150K / year
Job Closed

Platform Support Engineer

Scalingo

Scalingo's mission is to build the best European cloud platform for developers.

Role Description

As a Platform Support Engineer, you play a central role in the user experience of the Scalingo platform and in customer support. Your role is at once:

  • Technical: analyzing, diagnosing, and resolving customer issues of varying complexity.
  • Structuring: contributing to the continuous improvement of support, the product, and the platform.
  • Cross-functional: collaborating with the support, engineering, product, and SRE teams.

You act as a point of entry or a point of escalation depending on the nature of the request, from simple issues to incidents that require in-depth analysis. Working directly with customers and internal teams, you help deliver clear, reliable, and relevant answers.

Qualifications

Technical skills:

  • Technical support / Ops profile: an appetite for production environments, customer issues, and solving complex technical problems.
  • Diagnosis and troubleshooting: ability to analyze system or application behavior, form hypotheses, and identify root causes.
  • Cloud environments and web architectures: understanding of the building blocks of a cloud provider, modern web architectures, APIs, DNS, caching, and databases.
  • Databases and web tooling: comfortable with relational or NoSQL databases, Git, CI/CD tools, and web deployment environments.
  • Security, compliance, and documentation: awareness of security best practices, and the ability to write up clear solutions and capture lessons learned.
  • AI-augmented efficiency: comfortable using artificial intelligence tools to work more efficiently day to day.

Soft skills:

  • Prioritization and rigor: ability to handle emergencies, shifting priorities, and uncertain situations.
  • Communication and cross-functional collaboration: clear, structured exchanges adapted to customers and internal teams alike.
  • Professional attitude: curiosity, perseverance, composure, a blameless approach, and a sense of user impact.
  • Listening and teaching: ability to understand customer needs, build trust, and make technical topics accessible.

Requirements

  • Technical support, levels 1 to 3.
  • Investigate and diagnose technical problems using debugging tools, system logs, and other analysis methods.
  • Ask customers targeted questions to quickly understand the origin of the problems they encounter.
  • Carry out advanced technical investigations: logs, metrics, traces, debugging.
  • Identify root causes and work with development teams to define and implement lasting solutions.
  • Follow up after resolution with customers and internal teams.
  • Analyze incidents and recurring issues to identify product or platform improvements.
  • Collect and structure customer feedback to feed discussions with the Product and Engineering teams.
  • Contribute to prioritizing technical or functional changes related to reliability, performance, or security.
  • Take part in the continuous improvement of the support team's tools, processes, and metrics.
  • Document investigations, solutions, and lessons learned in internal and external documentation.
  • Enrich and maintain the knowledge base.
  • Regularly share key information: incidents, trends, customer feedback, and best practices.
  • Help spread a culture of quality, reliability, and collaboration within Scalingo.

Benefits

  • Fully remote.
  • Company events: annual offsite and regular afterworks.
  • Monthly remote-work allowance (€57.60).
  • Meal vouchers (€11.52) and Swile card.
  • Advantageous company health insurance (BENEFIZ).
  • Linux laptop.
  • Budget for additional equipment (flat-rate contribution).

France