We get to the heart of the matter... real people... real solutions

AI Platform Engineer

Platform Engineer; Full Time; Remote; Senior; Team 1-10; H1B No Sponsor

Location

India

Posted

135 days ago

Salary

Seniority

Senior

Bachelor Degree; 4 yrs exp; English

Ansible AWS Azure GCP Kubernetes Python PyTorch TensorFlow Terraform

Job Description

• Architect and manage Kubernetes clusters tailored to AI/ML workloads.
• Implement Run:ai and operators for GPU resource orchestration and workload scheduling.
• Develop and maintain Python-based automation scripts and ML pipelines; automate infrastructure provisioning with Terraform and configuration management with Ansible.
• Create and manage Jupyter Notebooks for experimentation and collaboration.
• Integrate and optimize NVIDIA Enterprise Suite components (CUDA, NeMo Framework, Triton, TensorRT, GPU drivers) for accelerated computing.
• Establish and maintain MLOps best practices for model lifecycle management, CI/CD, and monitoring (e.g., MLflow, Kubeflow).
• Work closely with data scientists and platform engineers to ensure efficient resource utilization and scalability across environments.

Job Requirements

4+ years in platform architecture or solutions architecture, with 2+ years focused on AI/ML workloads.
Experience with high-performance computing (HPC) environments.
Familiarity with distributed training and model optimization techniques.
Certification in Kubernetes or cloud platforms (AWS, Azure, GCP).
Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch).
Hands-on experience with Kubernetes and container orchestration.
Familiarity with Run:ai or similar GPU scheduling platforms.
Expertise in Terraform and Ansible for infrastructure automation.
Experience with Jupyter Notebooks for ML development.
Knowledge of NVIDIA Enterprise Suite (CUDA, NeMo Framework, Triton, GPU drivers).
Solid understanding of MLOps principles and tools (e.g., MLflow, Kubeflow).
Background in deploying and scaling AI workloads in cloud or hybrid environments.

Benefits

India Employment Benefits include:

More Platform Engineer Jobs

Senior Manager, AI Platform Engineering

Socure

The leading provider of digital identity verification and fraud solutions. Salesinfo@socure.com

Platform Engineer; 136 days ago

Other; Remote; Team 501-1,000; Since 2012; H1B Sponsor

• Develop and own the roadmap for Socure’s AI/ML platform, including data and feature engineering workflows, training infrastructure, experimentation tooling, model deployment/serving, monitoring, and governance.
• Define architecture and standards that create clear, scalable, and secure paths for building and operating AI systems.
• Assess technology options and drive consolidation across the company to reduce fragmentation and improve consistency across the ML toolchain.
• Partner with Data Science, Engineering, Product, and Sales-Enablement teams to develop AI infrastructure that delights Customers.
• Lead the design and operation of the end-to-end ML lifecycle: data ingestion, feature engineering, experimentation, training, model registry, deployment, and continuous monitoring.
• Guide the team to deliver high-quality platform capabilities with predictable timelines and strong technical rigor.
• Implement and enforce best practices around model versioning, auditability, lineage tracking, data governance, and security controls.
• Lead, mentor, and grow both senior and junior ICs across ML infrastructure, MLOps, and distributed systems.

Distributed Systems

United States

$190K - $210K / year

Job Closed

Lead AI Platform Engineer

Prolific

Building a better world with better data.

Platform Engineer; 136 days ago

Full Time; Remote; Team 51-200; Since 2014; H1B Sponsor

• Bridge the gap between research and real-world application.
• Ensure high-performance infrastructure, automated pipelines, and deployment strategies.
• Design and maintain scalable cloud environments (GCP/AWS) using Terraform.
• Manage GPU/TPU resource allocation for training, fine-tuning, and interactive notebooks.
• Build internal services and CLI tools for the AI team.
• Design CI/CD and training pipelines using tools such as GitHub Actions, MLflow, and Vertex AI Pipelines.
• Develop reusable patterns for model serving and manage service deployments to Kubernetes.
• Manage and optimize vector databases and embedding pipelines for RAG-based systems.
• Implement techniques to reduce latency and increase throughput.

AWS Cloud Google Cloud Platform Kubernetes Terraform

United Kingdom

AI Platform Engineer – Lead

Kayzen

Kayzen powers the world's best mobile marketing teams to take programmatic in-house.

Platform Engineer; 136 days ago

Full Time; Remote; Team 51-200; H1B No Sponsor

• Design and build internal AI frameworks, SDKs, and shared libraries
• Enable teams to integrate AI features with minimal friction
• Set up standardized patterns for using LLMs, embeddings, agents, and workflows
• Build reusable components for prompt management, evaluation, observability, and safety
• Define best practices for AI usage, cost control, and reliability
• Evangelise AI internally through documentation, examples, and hands-on guidance
• Rapidly prototype AI-powered features and turn them into reusable building blocks
• Own AI tooling from experimentation to production

AWS Azure ETL GCP Python

Worldwide

Senior Platform Engineer

vCluster Labs

vCluster Labs is a venture-backed tech startup headquartered in San Francisco, California, with a distributed, remote-first team spanning eight time zones. Foun

Platform Engineer; 137 days ago

Full Time; Remote

• Infrastructure Management: Own and improve our multi-cloud infrastructure spread across AWS, GCP, and Digital Ocean. You will manage Kubernetes clusters, handle patching, manage access, and enhance our tooling to ensure it has robust alerts and metrics.
• CI/CD Optimization: Drive the improvement of GitHub CI pipelines. You will be responsible for creating secure, repeatable testing environments and automating pipeline updates to streamline the developer experience.
• Internal Services Architecture: Architect and host infrastructure for engineering development, including internal services and vCluster-specific platforms (e.g., loft.rocks, vCluster Cloud). You will empower engineers to build pipelines securely through education and tooling.
• Customer Zero: Act as the first and most critical user of our products. You will push vCluster features to their limits to create useful internal tools, discovering bugs and providing feedback to Engineering to shape the future of our software.
• Terraform Automation: Focus on automating updates and managing infrastructure as code using Terraform Spacelift. You will give the team the ability to create infrastructure on demand, ensuring scalability and consistency.
• Execution: Manage a variety of Kanban tasks via Linear, ranging from improving observability to handling GitHub policy requests, release engineering, and access management.

AWS GCP Kubernetes Terraform

Germany

€80K - €110K / year

Job Closed
