Job Closed

This listing is no longer active.

Playlab

Build, remix and share AI-powered educational tools.

Staff ML Infrastructure Engineer

Infrastructure Engineer | Other | Remote | Lead | Team 1-10 | Since 2023 | H1B No Sponsor

Location

United States

Posted

113 days ago

Salary

$180K - $240K / year

Seniority

Lead

Bachelor Degree | 7 yrs exp | English | Airflow | AWS | Docker | ETL | Kubernetes | Python

Job Description

  • Build data pipelines that scrub PII, create research datasets, and power the research portal for educational AI studies
  • Architect the path toward self-hosted and on-device model deployments for privacy and global accessibility
  • Design and implement model orchestration systems that intelligently route requests across multiple AI providers (OpenAI, Anthropic, AWS Bedrock, open-source models)
  • Build cost optimization infrastructure: implement conversation compression, prompt caching, and smart model selection to keep AI accessible
  • Create comprehensive observability systems for ML operations: track costs, latency, quality, and usage patterns across thousands of applications
  • Design and implement infrastructure for fine-tuning and deploying custom models
  • Build monitoring and alerting systems that help us maintain reliability as AI interactions scale

Job Requirements

  • 7+ years building production ML/data systems, with experience in ML operations and infrastructure
  • Strong experience with model serving, orchestration, and optimization in production environments
  • Proficient in Python and data pipeline technologies (Airflow, ETL tools, etc.)
  • Experience with cloud infrastructure (AWS preferred) and containerization (Kubernetes, Docker)
  • Experience with cost optimization strategies for LLM-based systems
  • Ability to thrive in high-agency, high-collaboration cultures
  • Strong communication skills that make remote-first work effective

Benefits

  • AI tools and hands-on professional development
