AI & Analytics for today’s business challenges.
ML Ops Architect
Location
Texas
Posted
68 days ago
Seniority
Lead
Job Description
ML Ops Architect
Tiger Analytics
• Implement scalable and reliable systems leveraging cloud-based architectures, technologies and platforms to handle model inference at scale.
• Deploy and manage machine learning & data pipelines in production environments.
• Work on containerization and orchestration solutions for model deployment.
• Participate in fast iteration cycles, adapting to evolving project requirements.
• Collaborate as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art big data and ML applications.
• Leverage CI/CD best practices, including test automation and monitoring, to ensure successful deployment of ML models and application code.
• Ensure all code is well-managed to reduce vulnerabilities, models are well-governed from a risk perspective, and the ML follows best practices in Responsible and Explainable AI.
• Collaborate with data scientists, software engineers, data engineers, and other stakeholders to develop and implement best practices for MLOps, including CI/CD pipelines, version control, model versioning, monitoring, alerting and automated model deployment.
• Manage and monitor machine learning infrastructure, ensuring high availability and performance.
• Implement robust monitoring and logging solutions for tracking model performance and system health.
• Monitor real-time performance of deployed models, analyze performance data, and proactively identify and address performance issues to ensure optimal model performance.
• Troubleshoot and resolve production issues related to ML model deployment, performance, and scalability in a timely and efficient manner.
• Implement security best practices for machine learning systems and ensure compliance with data protection and privacy regulations.
• Collaborate with platform engineers to effectively manage cloud compute resources for ML model deployment, monitoring, and performance optimization.
• Develop and maintain documentation, standard operating procedures, and guidelines related to MLOps processes, tools, and best practices.
Job Requirements
- Master's or doctoral degree in computer science, electrical engineering, mathematics, or a similar field.
- Typically requires 7+ years of hands-on work experience developing and applying advanced analytics solutions in a corporate environment, including at least 4 years of programming in Python.
- At least 3 years of experience designing and building data-intensive solutions using distributed computing.
- At least 3 years of experience productionizing, monitoring, and maintaining models.
- Must have skills:
- Understanding of the Azure stack, including Azure Machine Learning, Azure Data Factory, Azure Databricks, Azure Kubernetes Service, and Azure Monitor.
- Demonstrated expertise in building and deploying AI/machine learning solutions at scale on cloud platforms such as AWS, Azure, or Google Cloud Platform.
- Experience in developing and maintaining APIs (e.g., REST).
- Experience specifying infrastructure and using infrastructure as code (e.g., Ansible, Terraform).
- Experience in designing, developing & scaling complex data & feature pipelines feeding ML models and evaluating their performance.
- Ability to work across the full stack and move fluidly between programming languages and MLOps technologies (e.g., Python, Spark, Databricks, GitHub, MLflow, Airflow).
- Expertise in Unix Shell scripting and dependency-driven job schedulers.
- Understanding of security and compliance requirements in ML infrastructure.
- Experience with visualization technologies (e.g., R Shiny, Streamlit, Dash, Tableau, Power BI).
- Familiarity with data privacy standards, methodologies, and best practices.
Benefits
- Significant career development opportunities exist as the company grows.
- The position offers a unique opportunity to be part of a small, fast-growing, challenging and entrepreneurial environment, with a high degree of individual responsibility.



