Job Closed

This listing is no longer active.

At Ford Motor Company, we believe freedom of movement drives human progress. We also believe in providing you with the freedom to define and realize your dreams. With our incredible plans for the future of mobility, we have a wide variety of opportunities for you to accelerate your career potential as you help us define tomorrow’s transportation.

HPC - AI/ML Platform Engineer

Platform Engineer | Other | Remote | Mid Level | Team 10,001+ | Since 1903 | H1B Sponsor

Location

United States

Posted

79 days ago

Salary

$113K - $190K / year

Seniority

Mid Level

Linux Kubernetes OpenShift CI/CD Python Shell Prometheus Grafana Ansible Terraform

Job Description

The selected candidate will join the team responsible for engineering and operating large-scale GPU and compute platforms that power AI/ML and high performance computing workloads across multiple datacenters. The team manages Kubernetes-based GPU environments, cluster infrastructure, and the supporting systems that enable internal engineering teams to train models, run simulations, and develop advanced software at scale. This role focuses on building reliable, scalable GPU platforms and helping internal users successfully run AI/ML and high-performance workloads on Kubernetes and related compute infrastructure.

- Design, implement, and support GPU/Kubernetes clusters and supporting infrastructure
- Support AI/ML training, simulation, and HPC workload customers
- Develop automation and tooling for cluster provisioning, configuration management, and platform operations
- Collaborate with application and research teams to optimize workloads running on GPU infrastructure
- Implement monitoring, observability, and performance tuning across GPU and compute platforms
- Troubleshoot infrastructure issues across compute, networking, and container platforms (occasional on-call support)
- Contribute to platform reliability, scalability, and operational best practices
- Produce clear technical documentation and operational runbooks

Must Have:

- 5+ years of Linux systems engineering or infrastructure experience
- 2+ years working with container platforms such as Kubernetes or OpenShift
- Familiarity with Kubernetes GPU scheduling and related tooling
- Familiarity with CI/CD pipelines and platform engineering practices
- Experience operating compute infrastructure for high-performance workloads or large distributed systems
- Strong scripting or programming skills (Python, Bash, or similar)
- Experience building infrastructure automation and operational tooling
- Strong troubleshooting and problem-solving skills across complex infrastructure systems
- Ability to communicate clearly with both platform engineers and application teams
- Demonstrated ability to manage multiple technical initiatives simultaneously

Nice to Have:

- Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience
- Experience with observability platforms such as Prometheus, Grafana, or similar
- Experience with infrastructure automation tools (Ansible, Terraform, etc.)
- Experience with high-speed networking technologies such as InfiniBand or RDMA

You may not check every box, or your experience may look a little different from what we've outlined, but if you think you can bring value to Ford Motor Company, we encourage you to apply!

As an established global company, we offer the benefit of choice. You can choose what your Ford future will look like: will your story span the globe, or keep you close to home? Will your career be a deep dive into what you love, or a series of new teams and new skills? Will you be a leader, a changemaker, a technical expert, a culture builder…or all of the above? No matter what you choose, we offer a work life that works for you, including:

- Immediate medical, dental, and prescription drug coverage
- Flexible family care, parental leave, new parent ramp-up programs, subsidized back-up child care and more
- Vehicle discount program for employees and family members, and management leases
- Tuition assistance
- Established and active employee resource groups
- Paid time off for individual and team community service
- A generous schedule of paid holidays, including the week between Christmas and New Year’s Day
- Paid time off and the option to purchase additional vacation time

For a detailed look at our benefits, see the Benefit Summary.

This position is a salary grade 8 and ranges from $113,580 to $190,500.

*Visa Sponsorship is not provided for this role*

Candidates for positions with Ford Motor Company must be legally authorized to work in the United States. Verification of employment eligibility will be required at the time of hire.

We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status. In the United States, if you need a reasonable accommodation for the online application process due to a disability, please call 1-888-336-0660.

#LI-Remote #LI-GH2

Job Requirements

5+ years of Linux systems engineering or infrastructure experience
2+ years working with container platforms such as Kubernetes or OpenShift
Familiarity with Kubernetes GPU scheduling and related tooling
Familiarity with CI/CD pipelines and platform engineering practices
Experience operating compute infrastructure for high-performance workloads or large distributed systems
Strong scripting or programming skills (Python, Bash, or similar)
Experience building infrastructure automation and operational tooling
Strong troubleshooting and problem-solving skills across complex infrastructure systems
Ability to communicate clearly with both platform engineers and application teams
Demonstrated ability to manage multiple technical initiatives simultaneously
Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience (Nice to Have)
Experience with observability platforms such as Prometheus, Grafana, or similar (Nice to Have)
Experience with infrastructure automation tools (Ansible, Terraform, etc.) (Nice to Have)
Experience with high-speed networking technologies such as InfiniBand or RDMA (Nice to Have)

Benefits

Immediate medical, dental, and prescription drug coverage
Flexible family care, parental leave, new parent ramp-up programs, subsidized back-up child care and more
Vehicle discount program for employees and family members, and management leases
Tuition assistance
Established and active employee resource groups
Paid time off for individual and team community service
A generous schedule of paid holidays, including the week between Christmas and New Year’s Day
Paid time off and the option to purchase additional vacation time

Related Categories

Platform Engineer

Related Job Pages

Remote Python Jobs (US)

More Remote Jobs

More Platform Engineer Jobs

Delivery Platform Engineer

Five9

Helping Companies Bring Joy to CX.

Platform Engineer | 79 days ago

Full Time | Remote | Team 1,001-5,000 | Since 2001 | H1B Sponsor

• Provide customers with configuration advice, training, and problem resolution throughout the setup and installation process of Five9’s call center software.
• Design and configure Five9’s platform for each customer’s unique requirements.
• Troubleshoot software solutions in a wide array of configurations and customer environments both remotely and on-site.
• Provide customized training to ensure customers have a thorough understanding of these solutions.
• Articulate the value of Five9’s Professional Services through demonstrations and open discussion with customers and prospects.
• Effectively communicate with internal and external stakeholders.

VoIP

View details: Delivery Platform Engineer

California

$70.4K - $196.3K / year


Job Closed

Platform Engineering Manager

Onebrief

Software for rapid military planning: make planning fast enough for today's environment

Platform Engineer | 79 days ago

Full Time | Remote | Team 1-10 | Since 2019 | H1B No Sponsor

• Lead, mentor, and grow the team of Platform Engineers by providing direction, removing blockers, and empowering them to own core platform infrastructure and tooling.
• Partner with Cybersecurity, Product, and Engineering teams to ensure the platform supports secure, reliable, high-velocity delivery into cloud-native and air-gapped environments.
• Drive standards and frameworks for Infrastructure as Code, service onboarding, environment management, GitOps workflows, and overall platform releases.
• Ensure observability, reliability, scalability, and cost-effectiveness of the platform; continuously monitor and improve service health, developer feedback, and performance metrics.
• Own the operational aspects of the platform team: incidents, responsiveness, After-Action Reviews (AARs) / post-mortems, runbooks, and continuous improvement of operational maturity.
• Act as a key stakeholder in architecture, influencing decisions about how the platform integrates with product, engineering, and operations, and ensuring the platform meets mission needs.

Ansible AWS Cloud Cyber Security Kubernetes Terraform

View details: Platform Engineering Manager

United States

$205K - $255K / year


Power Platform Developer

Stefanini LATAM

Co-creating solutions for a better future

Platform Engineer | 79 days ago

Full Time | Remote | Team 10,001+ | Since 1987 | H1B No Sponsor

• The Power Platform Developer will be responsible for designing, developing, and implementing high-impact business solutions using the full Microsoft Power Platform suite.
• This role serves as a technical reference within the team, leading initiatives in automation, process digitalization, and data-driven value creation, and aligning technology solutions with the strategic objectives of the business.
• The professional is expected to adapt to changing environments, work both autonomously and collaboratively, and contribute actively to the organization's digital maturity.
• Design and develop business applications with Power Apps (Canvas and Model-Driven).
• Create and optimize complex workflows in Power Automate, including integrations with external systems via connectors and REST APIs.
• Implement chatbots and conversational agents with Copilot Studio (Power Virtual Agents).
• Manage and configure external portals with Power Pages.
• Integrate Power Platform solutions with SharePoint Online, Teams, Dataverse, Azure AD, and Azure services.
• Administer environments, managed/unmanaged solutions, and platform governance in the Power Platform Admin Center.
• Apply ALM (Application Lifecycle Management) best practices with CI/CD pipelines and version control.
• Participate actively in requirements gathering with business stakeholders.
• Propose continuous improvements, development standards, and technical documentation.
• Mentor and support junior team members.
• Assess and manage technical risks in the solutions under their responsibility.

Azure JavaScript RPA

View details: Power Platform Developer

Colombia


Job Closed

Senior Platform Engineer, Data Persistence

Ridgeline, Inc.

Platform Engineer | 79 days ago

Other | Hybrid

Senior Platform Engineer, Data Persistence
New York, NY

Are you a platform engineer who thrives on designing scalable, resilient data systems that support real-world, enterprise workloads? Do you enjoy owning complex persistence challenges (balancing performance, reliability, and cost) while enabling product teams to move faster with confidence? Are you excited to shape the foundation of a cloud-native SaaS platform by building durable, high-availability data architectures that stand up to growth and change? If so, we invite you to be a part of our innovative team.

As a Senior Platform Engineer on the Data Persistence team, you will play a critical role in designing, building, and optimizing the data systems that power Ridgeline’s enterprise SaaS platform. You’ll lead the evolution of our cloud-native, distributed data architecture, solving challenges around scale, performance, availability, and cost, while delivering a best-in-class developer experience.

In this role, you will collaborate closely with application and infrastructure teams to develop efficient, resilient, and high-performance data solutions that directly support customer-facing workloads. You’ll leverage cutting-edge technologies, including AI tools like GitHub Copilot and ChatGPT, to enhance productivity, accelerate problem-solving, and continuously improve how we design and operate our data platforms.

At Ridgeline, how we work matters as much as what we build. Ridgeliners act like owners, choose growth over comfort, and communicate with transparency. We assume positive intent, bias toward action, and bring solutions, not just problems. We celebrate wins, learn from setbacks, and thrive in a resilient, collaborative, high-performing culture. If this excites you, we’d love to meet you.

You must be work authorized in the United States without the need for employer sponsorship.

The impact you will have:

- Design and develop cloud-native data storage solutions that deliver scalable, reliable, and performant persistence across a multi-tenant SaaS platform
- Optimize data access patterns and collaborate with application teams to improve end-to-end performance at the application layer
- Drive RPO and RTO improvements by establishing high-availability architectures, failover strategies, and disaster recovery plans
- Support high-throughput, low-latency workloads through effective data partitioning, caching, and indexing strategies
- Improve observability by investing in database monitoring, automation, and performance telemetry
- Balance performance and efficiency by identifying and implementing cost optimizations across database infrastructure
- Collaborate transparently with cross-functional teams to solve complex data challenges and share best practices
- Mentor engineers, foster technical growth, and contribute to a resilient, inclusive culture of engineering excellence
- Take ownership of critical systems, acting with accountability and a long-term mindset aligned with Ridgeline’s values

What we look for:

- 5+ years of experience in software or infrastructure engineering, with deep expertise in data persistence, distributed systems, or database engineering
- Proven experience building and operating distributed, multi-writer OLTP SQL systems (e.g., SingleStore, CockroachDB) and single-writer systems (e.g., Postgres), with a strong understanding of replication, sharding, and consistency tradeoffs
- Hands-on experience with OLAP or analytical data stores such as ClickHouse and/or Apache Iceberg on S3
- Strong background in high availability and disaster recovery strategies, including cross-region replication, backups and point-in-time recovery, and clearly defined RPO/RTO targets
- Solid experience with AWS services such as Aurora RDS, S3, ECS, Lambda, and related infrastructure
- Proficiency in at least one programming language such as Kotlin, Java, or TypeScript
- Experience using Datadog or similar tooling for database and storage observability
- Proficiency with AI-assisted development tools such as Cursor, Claude, or GitHub Copilot
- Strong problem-solving skills, clear communication, and a genuine interest in learning and continuous improvement

Bonus:

- Experience designing and operating high-throughput JVM services and data-access libraries, with deep knowledge of threading, connection pooling, and saturation behavior (e.g., HikariCP)
- Background in event-driven architectures using Kafka or Pub/Sub, including familiarity with Debezium, Kafka Connect, schema registries, or CDC workflows
- Experience in fintech, investment management, or other highly regulated data environments

The typical starting salary range for new hires in this role is targeted at $146,000 - $172,000. Final compensation amounts are determined by multiple factors, including candidate experience and expertise, and may vary from the amount listed above.

As an employee at Ridgeline, you’ll have many opportunities for advancement in your career and can make a true impact on the product. In addition to the base salary, 100% of Ridgeline employees can participate in our Company Stock Plan subject to the applicable Stock Option Agreement. We also offer rich benefits that reflect the kind of organization we want to be: one in which our employees feel valued and are inspired to bring their best selves to work. These include unlimited vacation, educational and wellness reimbursements, and $0 cost employee insurance plans.

#LI-Hybrid

PostgreSQL Apache Iceberg Amazon S3 AWS Amazon ECS Amazon Lambda Datadog Kotlin Java TypeScript GitHub Apache Kafka Debezium

View details: Senior Platform Engineer, Data Persistence

New York

