Job Closed
This listing is no longer active.
Artera is a Swiss ISP that produces premium hosting and cloud services.
Machine Learning Engineer – AI Platform Lead
Location
United States
Posted
136 days ago
Salary
$180K - $220K / year
Seniority
Senior
Job Description
Artera.net
- Develop the long-term vision and roadmap for Artera’s AI platform so the company can keep scaling both inference volume and development workloads.
- Own Artera’s ML compute infrastructure, including scaling up Foundation Model development by building distributed training infrastructure and developer libraries.
- Build and evolve the core libraries used by AI scientists to develop, launch, and monitor AI products.
- Work with model developers to optimize GPU and CPU efficiency and data throughput of large-scale foundation models and downstream model training runs.
- Optimize Artera’s ability to store and serve terabytes of digital pathology data efficiently for use in large-scale training regimes.
- Ensure that Artera’s observability infrastructure provides a clear picture of how to continue optimizing performance across our model landscape.
Job Requirements
- 8+ years of industry software engineering experience
- 4+ years of industry experience using ML orchestration frameworks such as Flyte, Ray, Kubeflow, Metaflow, MLflow, Dagster, Argo Workflows, or Prefect
- 4+ years of industry experience using one of PyTorch, TensorFlow, or JAX in Python
- 3+ years of industry experience building with AWS, Docker, and Kubernetes
- 1+ years of industry experience optimizing large-scale, high data-throughput, distributed machine learning training pipelines
- Experience using Terraform and SQLAlchemy
- Experience in multi-node and multi-GPU training
- Experience deploying and maintaining infrastructure for machine learning training and production inference
- Familiarity with TorchScript, ONNX Runtime, DeepSpeed, AWS Neuron, or similar approaches to inference optimization
- Work authorization requirement: Candidates must be authorized to work in the US or Canada without visa sponsorship
Benefits
- 401(k) matching
- Unlimited paid time off (PTO)
- Equity is a core component of our compensation
More AI Engineer Jobs
- AI Orchestration: Design and deploy autonomous agents to automate dev processes (e.g., automated PR summaries, synthetic test data generation, and self-healing test suites).
- Workflow Automation: Build and maintain complex product workflows leveraging LLMs and RAG (Retrieval-Augmented Generation) to enhance user experiences.
- Infrastructure as Code (IaC): Own the cloud architecture using Terraform or Pulumi to ensure GCP resources (Vertex AI, Cloud Run, Cloud SQL) are version-controlled and reproducible.
- Full-Stack Development: Maintain and evolve our dual-portal ecosystem. Backend: Node.js, Express, and TypeORM (PostgreSQL). Frontend: React (v17 & v18) using modern state management and routing (Zustand, TanStack).
- DevOps Mastery: Implement robust CI/CD pipelines (GitHub Actions/Google Cloud Build) that handle both application code and AI model versioning.
- Collaborate with engineering teams on the design and implementation of solutions using LLMs
- Develop, refine, and optimize prompts to ensure the highest quality and accuracy from AI models
- Collaborate with product and engineering teams to integrate AI/ML models and services into our core products
- Act as an AI generalist, empowering our internal teams to use AI tooling effectively
- Work in a highly collaborative environment with the product and data team to explore new AI technologies
- Utilize cloud platforms, particularly AWS AI services, to build, deploy, and manage AI-powered features
Lead AI Engineer
Tidal Financial Group
Tidal is an industry-leading ETF platform offering full-stack services to successfully launch, manage, and operate ETFs.
- Design, develop, and deploy LLM-powered applications
- Lead the definition, development, and delivery of machine learning models
- Lead the testing, deployment, and ongoing maintenance of AI systems
- Optimize AI systems for performance, scalability, latency, and cost
- Partner with data and engineering leads to architect AI-powered systems
- Establish evaluation frameworks, monitoring, and observability
Senior AWS GenAI Engineer
Gainwell Technologies
Gainwell Technologies is an award-winning digital health technology company that supports the administration of healthcare and human services programs. In past flexible hiring, the
- Architect and maintain AWS demo environments using services such as EC2, ECS/Fargate, Lambda, S3, RDS, DynamoDB, IAM, KMS, API Gateway, CloudWatch, and AWS Glue
- Develop Python-based solutions leveraging libraries like NumPy, Pandas, Scikit-learn, and others for data processing and automation
- Build and manage Terraform modules for automated provisioning, environment resets, and decommissioning
- Integrate AWS GenAI services (Bedrock, SageMaker) for real-time demo enrichment and scenario automation
- Collaborate with Demo COE and Product SMEs to design and implement custom demo flows
- Implement RBAC, feature toggles, and configuration management using AWS Systems Manager and AppConfig
- Ensure environmental health through automated monitoring, alerting, and load testing (CloudWatch, ALB, X-Ray)
- Support automated QA gates, metrics collection, and post-demo knowledge management
- Provide secondary support for data engineering tasks (AWS Glue, Redshift, Athena) for ingestion and transformation
- Contribute to hybrid cloud demos and integrations with Azure as needed




