Solventum

Enabling better, smarter, safer healthcare to improve lives.

Principal MLOps Engineer

Machine Learning Engineer | Full Time | Remote | Lead | Team 10,001+ | H1B No Sponsor

Location

Pennsylvania

Posted

7 days ago

Salary

$142.8K - $196.4K / year

Seniority

Lead

Job Description

  • Lead the operational architecture, deployment strategy, and reliability engineering for integrating AI into high-stakes Healthcare Information Systems (HIS)
  • Define the enterprise operational standards, govern the release processes, and build the resilient infrastructure required to maintain models in mission-critical clinical environments
  • Architect and govern the comprehensive release process, defining enterprise checklists, automated approval gates, release notes, and deployment readiness standards
  • Establish the deployment execution standards for promoting AI across all environments and ensure customer deployments adhere to strict internal production discipline
  • Architect and oversee the enterprise model registry, ensuring seamless integration with CI/CD pipelines and full version control traceability
  • Define and enforce monitoring standards, establishing critical SLAs/SLOs, service health metrics, and comprehensive dashboards across the AI ecosystem
  • Architect automated checks for input/output data quality and model drift, ensuring proactive detection of system degradation
  • Establish and lead the production incident process, including rigorous triage workflows, severity escalation paths, postmortems, rollback mechanisms, and recovery infrastructure
  • Partner with Platform teams to provide essential ATO (Authority to Operate) and compliance support, ensuring complete deployment traceability and strict operational controls
  • Oversee comprehensive operational reporting, providing leadership with status updates across production systems, pre-prod testing, customer rollouts, and incident metrics
  • Foster a culture of production discipline, guiding junior engineers in maintaining operational runbooks and reliable deployment pipelines

Job Requirements

  • Bachelor's Degree or Higher in Computer Science, Software Engineering, or related technical field
  • 10+ years of experience in software engineering, with at least 6 years dedicated to deploying and maintaining large-scale ML systems in production
  • Expert-level experience with Cloud Providers (AWS/GCP/Azure) and orchestration tools (Kubernetes, Kubeflow, or Airflow)
  • Expert-level Python and Java/Go (or similar)
  • Deep proficiency in backend frameworks, microservices, and system design patterns
  • Expert knowledge of monitoring stacks (Prometheus, Grafana, Datadog) and establishing enterprise SLAs/SLOs for AI services
  • Proven track record of designing automated deployment pipelines, managing complex rollback procedures, and enforcing model registry governance at scale

Benefits

  • Medical
  • Dental & Vision
  • Health Savings Accounts
  • Health Care & Dependent Care Flexible Spending Accounts
  • Disability Benefits
  • Life Insurance
  • Voluntary Benefits
  • Paid Absences
  • Retirement Benefits

Related Job Pages

More Machine Learning Engineer Jobs

Full Time | Remote | Team 5,001-10,000 | Since 1995 | H1B No Sponsor

  • ML Model Development: Design, develop, and implement scalable machine learning models to solve complex business problems
  • MLOps & Production: Build and maintain robust ML pipelines, ensuring deployment, monitoring, and maintenance of models in production
  • Feature Engineering: Create and optimize features using dbt and PySpark, working with large volumes of data
  • Workflow Orchestration: Develop and manage data and ML pipelines using Apache Airflow
  • Data Processing: Perform large-scale distributed data processing with PySpark
  • Collaboration: Work closely with data scientists, data engineers, and product teams to deliver end-to-end solutions
  • Optimization: Monitor model performance, identify degradation, and implement continuous improvements
  • Documentation: Maintain clear technical documentation of architecture, models, and processes

Brazil

MLOps Engineer

Arbitration Forums Inc.

AF is a remote working environment.

Role Description

This role at Arbitration Forums is as unique as it is rewarding because of the AF IPAAL Values (Integrity, Passion, Accountability, Achievement, Leadership) and TRI Model (Trust, Respect, Inclusion). The MLOps Engineer is responsible for closing the gap between machine learning model development and operational deployment. This role ensures that machine learning models run efficiently in the production environment and are continuously monitored for performance. The MLOps Engineer contributes to Arbitration Forums' AI-powered portfolio of products and services by enhancing the scalability and reliability of machine learning applications. This role works closely with data scientists, AI engineers, software development, and DevOps teams to automate and streamline the model lifecycle, from development to deployment and monitoring.

Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Information Systems, Data Science, or a related field.
  • Minimum of 6 years of experience in data science, machine learning, data management, data governance, or a related role.
  • Minimum of 6 years as an MLOps Engineer or in a similar role.

Technical Skills

  • Working knowledge of cloud services (e.g., MS Azure, AWS, Google Cloud).
  • Experience with AI tools, such as MS Azure ML, Snowflake, Databricks, CortexAI, Dataiku.
  • Deep understanding of data science principles, algorithms, and tools.
  • Strong knowledge of data governance, data security, and compliance practices.
  • Proficiency in programming languages such as Python, R, or Java.
  • Experience with containerization tools like Docker and orchestration tools like Kubernetes.
  • Proficiency in ML frameworks such as TensorFlow, PyTorch, or Scikit-learn.
  • Working knowledge of CI/CD pipelines, DevOps practices, and automation frameworks.
  • Deep understanding of data engineering concepts and tools.
  • Familiarity with data visualization and reporting tools (e.g., Webfocus, Power BI).

Soft Skills

  • Excellent analytical and problem-solving abilities.
  • Strong communication and interpersonal skills to collaborate with cross-functional teams.
  • Ability to lead projects and mentor junior staff.
  • Auto Insurance claims industry experience preferred.

Requirements

  • Design, implement, and maintain machine learning pipelines and workflows for the continuous deployment and integration of machine learning models.
  • Optimize the pipelines for scalability, efficiency, and cost-effectiveness.
  • Collaborate with data scientists and AI engineers to understand model requirements and optimize deployment processes.
  • Automate the training, testing, and deployment processes for machine learning models.
  • Establish and enforce best practices for version control, documentation, and code quality in ML projects.
  • Monitor model performance and optimize algorithms for efficiency.
  • Conduct regular maintenance and updates to deployed models.
  • Collaborate with cross-functional teams to integrate machine learning solutions into business processes and applications.
  • Work with go-to-market, product management, and IT functions as well as stakeholders in AF and its members to identify the optimal methods for model rollout and adoption.
  • Maintain and optimize the cloud-based machine learning infrastructure and make recommendations for improvements.
  • Manage and allocate resources effectively, including computer power and storage for model inference.
  • Develop practices and utilize tools for data validation, model testing, and versioning.
  • Troubleshoot and resolve machine learning operational issues.
  • Document processes, workflows, and best practices for ML Operations.
  • Provide technical leadership and mentorship to junior data team members.

Benefits

  • Support data observability efforts to ensure the data continuum and enforce governance standards.
  • Other duties as assigned by manager or project needs.

Americans with Disability Specifications

  • PHYSICAL DEMANDS: The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. While performing the duties of this job, the employee is occasionally required to stand; walk; sit; use hands to finger, handle, or feel objects, tools, or controls; reach with hands and arms; climb stairs; balance; stoop, kneel, crouch or crawl; talk or hear; taste or smell. The employee must occasionally lift and/or move up to 25 pounds. Specific vision abilities required by the job include close vision, distance vision, color vision, peripheral vision, depth perception, and the ability to adjust focus.
  • WORK ENVIRONMENT: This is a fully remote position requiring reliable high-speed internet access and a dedicated workspace. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

United States
Job Closed
Full Time | Remote | Team 201-500 | Since 1979 | H1B No Sponsor

  • Lead the strategic development and growth of Circuit Check's most significant customer relationships in the AI/ML data center infrastructure market
  • Architect and execute multi-year account strategies that drive large-scale program wins, deepen executive-level partnerships, and position the company as the preferred technology partner for hyperscale and enterprise AI customers
  • Serve as the primary executive-level interface with strategic accounts, building trusted advisor relationships with VP and C-suite decision-makers at hyperscale cloud providers, AI chipmakers, and data center operators
  • Coordinate engineering, product management and operations around customer requirements
  • Partner with Product Line Managers to translate customer insights into product roadmap inputs aligned with AI/ML market direction
  • Identify patterns in AI/ML technology adoption, supply chain shifts, and customer investment priorities that create strategic openings for Circuit Check

All locations: California | Minnesota
$120K - $200K / year
Full Time | Remote | Team 1,001-5,000 | H1B No Sponsor

  • Design, develop, and implement Artificial Intelligence and Machine Learning solutions in corporate environments.
  • Build data pipelines for model training, validation, monitoring, and retraining.
  • Develop and operationalize predictive models, classifiers, recommendation systems, and Generative AI solutions.
  • Work with LLMs (Large Language Models), intelligent agents, RAG (Retrieval-Augmented Generation), and multi-agent architectures.
  • Develop APIs and services to serve models in production.
  • Implement MLOps and LLMOps practices to automate the model lifecycle.
  • Evaluate the performance, accuracy, bias, and governance of AI models.
  • Integrate AI solutions with corporate applications, ERPs, CRMs, and digital platforms.
  • Ensure the security, observability, scalability, and compliance of the implemented solutions.
  • Support business areas in identifying opportunities to apply AI.
  • Produce technical documentation and share knowledge with internal teams.

Brazil
Job Closed