AI Runtime & Inference Engineer – LLM Platforms
Location
Brazil
Posted
16 days ago
Salary
0
Seniority
Senior
Job Description
Compass
• Operate, optimize, and evolve the agent runtime and the LLM inference infrastructure in production;
• Define and implement the model endpoint architecture, with a focus on latency and availability SLOs;
• Design and maintain end-to-end observability pipelines: metrics, structured logs, distributed traces, and intelligent alerts;
• Drive advanced performance optimizations: dynamic batching, semantic caching, quantization, and context management;
• Lead incident response and root cause analysis for failures in the inference environment;
• Define resilience standards and failover strategies for production LLM workloads;
• Produce runbooks, playbooks, and reference operational documentation for the environment.
Job Requirements
- Expertise in operating language models in production, with a focus on performance and availability;
- Command of LLM serving frameworks at scale: vLLM, TGI (Text Generation Inference), Triton Inference Server, or equivalents;
- Advanced experience with Kubernetes and with managing inference workloads on accelerators;
- Expertise in observability for complex environments: Prometheus, Grafana, OpenTelemetry, and signal correlation;
- Deep knowledge of AWS and its ML services (SageMaker Endpoints, Bedrock, EKS);
- Experience with advanced model optimization: quantization (GPTQ, AWQ), distillation, and compilation for inference;
- Hands-on knowledge of GPUs and accelerators (NVIDIA A100/H100) in production settings;
- Experience with semantic caching and advanced context management strategies for LLMs;
- Track record in SRE or platform engineering in mission-critical environments;
- Experience with multi-region architectures and disaster recovery strategies for AI workloads.



