Job Closed

This listing is no longer active.

Stitch Fix

Changing the way people find what they love.

Manager, Data – AI Platform Engineering

Platform Engineer | Other | Remote | Senior | Team 5,001-10,000 | Since 2011 | H1B Sponsor

Location

United States

Posted

144 days ago

Salary

$146.3K - $195K / year

Seniority

Senior

Bachelor Degree | 5 yrs exp | English | Distributed Systems | Apache Kafka | Kubernetes | PyTorch | Ray | Apache Spark

Job Description

• Lead in a player-coach capacity in execution for Stitch Fix’s next-gen Data, ML, and GenAI platforms.
• Contribute towards modernization of data and ML foundations to support unified signals, adaptive models, experimentation velocity, and scalable AI/ML workloads.
• Provide foundational APIs, SDKs, frameworks, and self-service tools that make it easy for data scientists, ML engineers, analysts, and application teams to build and deploy AI solutions quickly, safely, and at scale.
• Partner with Data Science, Engineering, and Product teams to translate Data/ML/GenAI platform capabilities into production-grade features and intelligent experiences that deliver measurable business value.
• Drive responsible AI and data adoption by creating reusable templates, documentation, and enablement programs.
• Contribute towards improving governance practices including data contracts, lineage, metric definitions, access policies, and responsible AI guardrails - for trust, safety, and compliance.
• Ensure operational excellence through platform reliability, performance, observability, cost efficiency, and simplification of legacy systems.
• Lead and develop high-performing engineering teams fostering a culture of clarity, excellence, and trust.
• Balance speed of innovation with platform stability, ensuring engineering efforts are tightly aligned to business priorities and long-term client value.

Job Requirements

5+ years in software, data, ML, or platform engineering; 1+ years leading engineering individual contributors is a plus.
Demonstrated success contributing towards large-scale data platforms, ML platforms, or AI/GenAI platforms in cloud environments.
Experience delivering platform modernization, unification, and multi-year architectural transformation.
Strong software engineering foundation, with experience designing and building large-scale distributed systems and resilient, high-quality APIs and services using modern programming languages and cloud-native architectures.
Track record operating and evolving modern data infrastructure, including some of the following: distributed compute and storage technologies (Spark, Trino, Iceberg), real-time processing frameworks (Kafka/Flink), metadata / catalog systems, and Kubernetes-based orchestration.
Expertise across the ML lifecycle - feature engineering, training pipelines, model deployment and serving, monitoring, validation, fine-tuning, and MLOps best practices.
Proven capability in building self-service platform abstractions and tooling that enable teams to develop, experiment, and deploy data and ML products efficiently.
Experience with modern GenAI architectures - semantic retrieval, knowledge-grounded indexing, LLM orchestration, agent workflows, and evaluation frameworks.
Familiarity with modern ML frameworks like PyTorch and Ray is a plus.
Strategic thinker able to align platform investments with business priorities and emerging AI opportunities.
Potential to be a strong people leader, with a track record of contributing to inclusive, high-performing engineering teams.
Excellent communicator who can influence both technical and business stakeholders across domains.

Benefits

This position is eligible for an annual bonus
Eligible for medical, dental, vision, and other benefits

More Platform Engineer Jobs

Platform Engineer

Qodea

Qodea (formerly Appsbroker CTS) is Europe's largest Google Premier only transformation partner.

Platform Engineer | 144 days ago

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Design and implement scalable, reliable, and cost-effective cloud solutions using Google Cloud Platform services and products.
• Analyse business requirements and recommend appropriate cloud technologies and architectures.
• Develop and maintain cloud infrastructure code using tools such as Terraform or Google Cloud Deployment Manager.
• Manage and automate the deployment, scaling, and management of cloud infrastructure using tools such as Kubernetes or Google Cloud Deployment Manager.
• Optimise infrastructure costs by analysing usage and implementing cost-saving measures such as reserved instances or instance rightsizing.
• Implement infrastructure-as-code practices to ensure consistency, repeatability, and version control of cloud infrastructure.
• Implement and maintain security controls and policies to ensure the confidentiality, integrity, and availability of cloud infrastructure and data.
• Monitor and respond to security incidents and vulnerabilities, and perform regular security assessments and audits.
• Ensure compliance with industry standards and regulations such as GDPR, HIPAA, and PCI DSS.
• Monitor cloud infrastructure and services for performance, availability, and security issues using tools such as Stackdriver or Prometheus.
• Perform root cause analysis and troubleshoot issues related to cloud infrastructure and services, and implement corrective actions and preventive measures.
• Continuously improve the reliability and resilience of cloud infrastructure by implementing best practices such as fault-tolerance, redundancy, and disaster recovery.
• Proactive management of customer related tasks and projects, by applying the learned skills and knowledge.
• Proactiveness in: identifying and suggesting improvements where applicable, detecting missing documentation and filling in the gaps.
• Providing technical support whilst attending meetings with stakeholders as needed.
• Acting as a mentor for the more junior colleagues by sharing your knowledge and expertise, and guiding them through the successful resolution of customer related tasks.

Ansible Docker GCP Kubernetes Linux Prometheus Python Terraform

Romania

Senior Platform Engineer

Qodea

Qodea (formerly Appsbroker CTS) is Europe's largest Google Premier only transformation partner.

Platform Engineer | 144 days ago

Full Time | Remote | Team 201-500 | H1B No Sponsor

• Design and implement scalable, reliable, and cost-effective cloud solutions using Google Cloud Platform services and products.
• Analyse business requirements and recommend appropriate cloud technologies and architectures.
• Develop and maintain cloud infrastructure code using tools such as Terraform or Google Cloud Deployment Manager.
• Manage and automate the deployment, scaling, and management of cloud infrastructure using tools such as Kubernetes or Google Cloud Deployment Manager.
• Optimise infrastructure costs by analysing usage and implementing cost-saving measures such as reserved instances or instance rightsizing.
• Implement infrastructure-as-code practices to ensure consistency, repeatability, and version control of cloud infrastructure.
• Implement and maintain security controls and policies to ensure the confidentiality, integrity, and availability of cloud infrastructure and data.
• Monitor and respond to security incidents and vulnerabilities, and perform regular security assessments and audits.
• Ensure compliance with industry standards and regulations such as GDPR, HIPAA, and PCI DSS.
• Monitor cloud infrastructure and services for performance, availability, and security issues using tools such as Stackdriver or Prometheus.
• Perform root cause analysis and troubleshoot issues related to cloud infrastructure and services, and implement corrective actions and preventive measures.
• Continuously improve the reliability and resilience of cloud infrastructure by implementing best practices such as fault-tolerance, redundancy, and disaster recovery.
• Leading and managing potential new projects for our customers by gathering and understanding the requirements, providing estimations on effort required and ETAs of deliverables, attending regular customer meetings, and supporting the CSM with technical input where needed.
• Acting as a mentor for the more junior colleagues by sharing your knowledge and expertise, and guiding them through the successful resolution of customer related tasks.

Ansible Docker GCP Kubernetes Linux Prometheus Python Terraform

Romania

Job Closed

Container Platform Operations Engineer – Kubernetes

Owens & Minor

Empowering Our Customers To Advance Healthcare

Platform Engineer | 144 days ago

Full Time | Remote | Team 10,001+ | Since 1882 | H1B Sponsor

• Provide advanced operational support and troubleshoot complex issues to minimize downtime.
• Collaborate with development and operations teams to implement features and integrate user-facing elements.
• Optimize container platform environments for performance, scalability, and security, following best practices.
• Monitor system performance and resolve issues to ensure uptime and reliability.
• Deploy and manage Kubernetes-based clusters, ensuring high availability, scalability, and security.
• Configure infrastructure components for optimal performance and reliability.
• Administer and maintain CI/CD pipelines using GitOps tools (e.g., Argo CD), ensuring security and compliance.
• Perform advanced Linux administration, including installation and configuration, across cloud environments.
• Document operational, maintenance, and upgrade procedures for clarity and accessibility.
• Support knowledge sharing and assist team members in troubleshooting and best practices.

Azure DNS Grafana Kubernetes Linux OpenShift Prometheus

India

Job Closed

Senior Director of Engineering, Platform Engineering

Zeitview

At Zeitview, we deliver advanced inspection software for high-value infrastructure.

Platform Engineer | 144 days ago

Other | Remote | Team 51-200 | H1B No Sponsor

• Lead, mentor, and grow global, distributed engineering teams, including tech leads, engineers and potential managers.
• Build a high-performance engineering culture centered on ownership, accountability, quality, and continuous improvement.
• Provide regular coaching, feedback, performance evaluations, and career development guidance.
• Drive effective hiring, onboarding, succession planning, and retention strategies.
• Own the technical strategy and long-term vision for platform engineering.
• Partner closely with architects and tech leads to define platform standards, architectural patterns, and best practices.
• Ensure the platform serves as a strong foundation enabling rapid and safe product innovation.
• Stay sufficiently hands-on to understand the complexity and risk of initiatives and guide execution and delivery.
• Act as a key partner to Product, Data, Infrastructure, Security, and other cross-functional leaders.

California
