Concentrate AI
Remote Jobs
2 Jobs
• Work closely with customers to understand LLM deployment needs and solve technical problems in production
• Debug issues end to end across application behavior, AI API integrations, infrastructure, and model and provider performance across OpenAI, Anthropic, Gemini, and open source models
• Build product features, internal tools, and platform improvements based on patterns you see in the field
• Improve multi-provider routing, LLM reliability, AI observability, latency, and token cost efficiency across multiple LLM providers
• Help customers reduce AI infrastructure costs, navigate rate limits, and architect for provider failover and redundancy
• Partner closely with founders on customer deployments, product direction, and technical strategy
• You will own production infrastructure, reliability, and security. Your job is to make the platform secure, scalable, and reliable as usage grows fast.
• Design and operate a secure, resilient, multi-cloud platform, making and defending architectural choices as the system scales
• Operate high-throughput distributed systems; define SLOs, capacity models, and cost controls
• Write and maintain production code, tooling, and automation with a strong focus on performance, reliability, and correctness
• Ensure safe releases, fast rollbacks, strong observability, and effective incident response
• Own the infrastructure side of SOC 2 and similar compliance regimes (access controls, isolation, encryption, auditability)
• Improve the local-to-prod experience to unlock engineering speed