Job Closed

This listing is no longer active.

Platform Engineer

Full Time | Remote | Lead | Team 1,001-5,000 | Since 1939 | H1B Sponsor

Location

United States

Posted

88 days ago

Salary

$100K - $125K / year

Seniority

Lead

Bachelor Degree | 8 yrs exp | English | AWS | Grafana | Linux | Python | Splunk | Terraform

Job Description

Platform Engineer

Jensen Hughes

  • Build and operate a cloud platform on AWS using Terraform
  • Design and implement infrastructure CI/CD and PR-driven infrastructure delivery
  • Own platform-grade observability
  • Enable secure, production-ready agentic AI capabilities

Job Requirements

  • 8–10 years of experience in Platform Engineering / SRE / DevOps (or equivalent)
  • AWS expertise, including multi-account patterns (AWS Organizations / Control Tower preferred), networking, IAM/security, and operations
  • Terraform expert with proven ownership of org-scale infrastructure-as-code (modules, state, CI controls, large refactors)
  • Proven experience designing Infrastructure CI/CD and PR-driven infrastructure delivery (GitOps principles)
  • Strong production experience with observability platforms such as Splunk, Datadog, Grafana, or Dynatrace
  • Strong Linux and troubleshooting skills; proficiency in automation (Python or Go preferred)

Benefits

  • Competitive total rewards package
  • Retirement plan
  • Healthcare coverage
  • Broad range of other benefits

More Platform Engineer Jobs

Platform Engineer – Kubernetes, GitOps, AI interest

Cloudiax

Global Business Cloud provider for SAP B1, Cloud Infrastructure, AI Server & more - made in Germany, available worldwide

Full Time | Remote | Team 11-50 | Since 2002 | H1B No Sponsor

  • You act as the interface between development and cloud administration
  • Your responsibility is to design our Kubernetes platform and deployment processes so developers can work efficiently, reproducibly, and with standardized workflows
  • You work closely with development, infrastructure, and security teams to provide a reliable platform
  • Enhance and operate our Kubernetes platform in on‑premise data centers, ensuring stability, scalability, and performance
  • Build and optimize self‑service processes that enable developers to perform efficient, standardized deployments
  • Optimize resource usage (CPU, RAM, storage, GPU) and develop fair workload strategies
  • Operate and evolve our PostgreSQL platform, including high availability, backup/recovery, and performance tuning
  • Automate infrastructure and deployment tasks using Terraform, Ansible, or similar tools
  • Build and run modern CI/CD and GitOps processes (e.g., Argo CD, Flux)
  • Define platform standards, templates, and best practices to ensure a consistent developer experience
  • Integrate and operate authentication and authorization systems (Keycloak) as well as API gateways (Kong)
  • Ensure multi‑tenancy, isolation, and compliance with security requirements
  • Implement and operate monitoring, logging, and tracing solutions (Prometheus, Grafana, OpenTelemetry)
  • Develop intelligent scaling mechanisms (e.g., HPA, custom metrics) beyond classical CPU/RAM metrics
  • Support development teams in using the platform and continuously improve the developer experience
  • Serve as the interface between development and cloud administration: gather requirements, improve workflows, and provide feedback
  • Bring your interest in AI topics and explore automation and orchestration opportunities

Germany
Job Closed

Senior Platform Engineer, CVML

Blue River Technology

Our mission is to create intelligent machinery that solves monumental challenges for our customers.

Full Time | Remote | Team 201-500 | Since 2011 | H1B Sponsor

  • Design, build, and evolve platform capabilities that support ML training, batch inference, and model deployment workflows at scale.
  • Own and improve core platform components (e.g., compute orchestration, data pipelines, inference systems) used by multiple teams across Blue River and John Deere.
  • Continuously enhance platform reliability, scalability, and performance, with a focus on real-world ML workloads.
  • Enable ML engineers to move faster by building intuitive, well-documented platform tools and workflows across the model lifecycle (experimentation, deployment, and iteration).
  • Improve model inference performance and throughput while balancing trade-offs among cost, latency, and reliability.
  • Support and scale distributed training and inference systems, including frameworks such as Ray and related tooling.
  • Develop and optimize hybrid compute environments (cloud + on-prem/GPU infrastructure) to support large-scale ML workloads.
  • Build and maintain infrastructure leveraging Kubernetes, Slurm, and cloud platforms (AWS preferred).
  • Identify and resolve bottlenecks in compute, storage, and data movement pipelines.
  • Evaluate existing platform systems and make thoughtful decisions on when to extend, refactor, or rebuild components.
  • Drive improvements in system architecture, balancing short-term delivery with long-term platform health.
  • Contribute to shaping the platform roadmap and technical direction in response to evolving business and ML needs.
  • Partner closely with ML engineers, robotics teams, infrastructure teams, and product stakeholders to translate requirements into scalable platform solutions.
  • Act as a technical bridge between teams, ensuring platform capabilities align with real-world use cases and constraints.
  • Influence platform adoption and best practices across multiple teams.
  • Support platform capabilities that enable simulation-based testing and validation of ML systems, including synthetic data workflows.
  • Improve tooling that allows teams to test and validate models before production deployment.
  • Provide technical guidance and mentorship to junior engineers on platform and systems design.
  • Lead implementation efforts for key platform initiatives and ensure high-quality execution.
  • Demonstrate strong ownership and accountability for delivering impactful platform improvements.

United States
$160K - $287K / year
Job Closed

Sr Platform Engineer, CVML

Blue River Technology

Our mission is to create intelligent machinery that solves monumental challenges for our customers.

Other | Remote | Team 201-500 | Since 2011 | H1B Sponsor

We’re Blue River, a team of innovators driven to create intelligent machinery that solves monumental problems for our customers. We empower our customers – farmers, construction crews, and foresters - to implement safer and more sustainable solutions, driving increased profitability with less reliance on scarce labor. We believe that focusing on the small stuff – pixel-by-pixel and task-by-task - leads to big gains.

Blue River Technology aligns with John Deere’s vision to “innovate on behalf of humanity” by quickly identifying and solving high-value, high-uncertainty challenges in AI, machine learning, computer vision, and robotics. BRT acts as a research and development flywheel, building not only new products but also new platforms that reliably create value for both Deere and its customers. From fully autonomous machines to highly precise farming equipment, BRT and Deere are partnering to create technical breakthroughs in industries like agriculture and construction.

Summary

We are seeking a Senior CVML Platform Engineer to help design, build, and evolve the platforms that support computer vision and ML workloads at scale. This role focuses on enabling ML teams through well-designed infrastructure, tooling, and workflows, rather than developing models or conducting ML research. The ideal candidate brings strong technical judgment, is comfortable navigating existing and evolving platforms, and can incrementally improve systems while maintaining reliability. We strongly prefer engineers with a DevOps or platform engineering background who have moved into ML-adjacent systems and are motivated by building durable foundations that other teams rely on. This role requires both hands-on engineering and the ability to influence platform direction through collaboration and thoughtful design.

  • Employment Type: Full-Time
  • Work Location: Remote in the United States
  • Visa sponsorship will be considered on a case-by-case basis.

Job Responsibilities

A combination, not necessarily all-inclusive, of the following:

Platform Development & Ownership

  • Design, build, and evolve platform capabilities that support ML training, batch inference, and model deployment workflows at scale.
  • Own and improve core platform components (e.g., compute orchestration, data pipelines, inference systems) used by multiple teams across Blue River and John Deere.
  • Continuously enhance platform reliability, scalability, and performance, with a focus on real-world ML workloads.

ML Systems Enablement

  • Enable ML engineers to move faster by building intuitive, well-documented platform tools and workflows across the model lifecycle (experimentation, deployment, and iteration).
  • Improve model inference performance and throughput while balancing trade-offs among cost, latency, and reliability.
  • Support and scale distributed training and inference systems, including frameworks such as Ray and related tooling.

Infrastructure & Compute Systems

  • Develop and optimize hybrid compute environments (cloud + on-prem/GPU infrastructure) to support large-scale ML workloads.
  • Build and maintain infrastructure leveraging Kubernetes, Slurm, and cloud platforms (AWS preferred).
  • Identify and resolve bottlenecks in compute, storage, and data movement pipelines.

System Evolution & Technical Judgment

  • Evaluate existing platform systems and make thoughtful decisions on when to extend, refactor, or rebuild components.
  • Drive improvements in system architecture, balancing short-term delivery with long-term platform health.
  • Contribute to shaping the platform roadmap and technical direction in response to evolving business and ML needs.

Cross-Functional Collaboration

  • Partner closely with ML engineers, robotics teams, infrastructure teams, and product stakeholders to translate requirements into scalable platform solutions.
  • Act as a technical bridge between teams, ensuring platform capabilities align with real-world use cases and constraints.
  • Influence platform adoption and best practices across multiple teams.

Testing, Simulation & Validation

  • Support platform capabilities that enable simulation-based testing and validation of ML systems, including synthetic data workflows.
  • Improve tooling that allows teams to test and validate models before production deployment.

Technical Leadership

  • Provide technical guidance and mentorship to junior engineers on platform and systems design.
  • Lead implementation efforts for key platform initiatives and ensure high-quality execution.
  • Demonstrate strong ownership and accountability for delivering impactful platform improvements.

Required Experience and Skills

  • 5+ years of professional engineering experience, with a focus on platform, infrastructure, or systems engineering.
  • Strong technical judgment, balancing the evolution of legacy platforms with the design and delivery of new, greenfield components shared across multiple teams and workloads.
  • Excellent Python skills, used in production systems, tooling, and platform components.
  • Solid understanding of ML systems and the end-to-end model development lifecycle, from experimentation to deployment and iteration.
  • Hands-on experience or strong familiarity with cloud platforms (AWS preferred) and container orchestration systems such as Kubernetes and Slurm.
  • Ability to partner effectively with ML engineers, infra teams, and product stakeholders to translate requirements into platform capabilities.
  • Ability to quickly ramp up on new domains, tools, and complex existing systems.

Preferred Experience and Skills

  • Golang experience, particularly for platform or infrastructure components.
  • Experience building or integrating ML pipelines using tools such as Kubeflow and/or Airflow.
  • Understanding of model inference architectures, including performance, scalability, reliability, and cost considerations.
  • Experience enabling distributed training and inference through platforms and frameworks such as Ray.
  • Experience supporting ML systems in computer vision or robotics environments.

Only individual applicants will be considered. We do not work with unsolicited third-party agencies or proxy interview services.

At Blue River, we’re passionate about creating an inclusive workplace that promotes and values diversity. While we have more work to do to advance diversity and inclusion, we’re investing in our programs, including recruiting, mentorship, career development, and learning & development to ensure they support our Diversity, Equity, and Inclusion goals. We support each employee in living a full life, enabling a thriving career, and accomplishing a meaningful, challenging mission while collaborating with incredible people. We are dedicated to building a diverse and inclusive workplace, so if you’re excited about this role but your experience doesn’t align completely with the job description, we encourage you to apply anyway.

We are an equal-opportunity employer and do not discriminate based on race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment. Please contact us to request an accommodation.

The US annual base salary range for this position is $160,000 - $287,000, along with eligibility for Blue River’s bonus and benefit programs. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your location during the hiring process. During the recruitment process, we may identify an alternative role or level to which you are more suited. If your ideal role at Blue River differs from the advertised position, we will provide an updated pay range as soon as possible during the hiring process.

United States
$160K - $287K / year

Senior Cloud Platform Operations Engineer

Millicom (Tigo)

We build the digital highways that connect people, improve lives and develop the communities we proudly serve.

Full Time | Remote | Team 10,001+ | Since 1992 | H1B No Sponsor

  • Operate, maintain, manage, improve, and enforce the On-Prem Cloud IaaS services to guarantee service and availability.
  • Participate in the on-call rotation to provide L2 support and maintain high availability of the Cloud Platforms.
  • Review the list of assigned tickets daily and ensure they are resolved on time and with the right quality.
  • Identify any misassigned tickets unrelated to Cloud Platform Operations activities and escalate them to the lead so they are transferred to the Cloud Service Delivery team as needed.
  • When a ticket is assigned, the Cloud Platform Engineer is responsible for investigation, configuration, and follow-up through full closure of the ticket in the Service Now system.
  • Communicate the status of assigned incidents and tickets.
  • Ensure that all information related to any incident or request is well documented in a shared knowledge repository accessible to the entire e2e Cloud team.
  • Monitor security compliance in accordance with standards, policies, and procedures.

Panama
Job Closed