AI Engineer

AI Engineer | Machine Learning Engineer | Full Time | Remote | Senior | Team 11-50 | Since 2018 | H1B No Sponsor

Location

Georgia

Posted

65 days ago

Salary

Not disclosed

Seniority

Senior

Bachelor Degree | English | JavaScript | Next.js | Node.js

Job Description

AI Engineer

Ruby Labs

  • Advanced Prompt Engineering: Designing complex, dynamic prompt templates with conditional logic and efficiently reusing information and context within prompts to maximize generation quality and reasoning.
  • Structured Outputs & Schemas: Implementing various response schemes (JSON mode, function calling, Zod/JSON schemas) to ensure AI outputs are predictable and ready for seamless integration into application logic.
  • Prompt Engineering & Evaluations: Building robust evaluation pipelines and using Langfuse to collect feedback and score the quality of responses in real time.
  • Tracing & Debugging: Performing deep debugging of complex LLM chains using Langfuse traces to identify bottlenecks and optimize for cost, latency, and context window usage.
  • AI A/B Testing: Running systematic experiments across different models via OpenRouter (e.g., comparing Claude 3.5 Sonnet vs. GPT-4o) and analyzing results based on quantitative metrics.
  • Data-Driven Decisions: Making deployment decisions for new prompts or models strictly based on quantitative benchmarks and trace data, rather than intuition.
  • Output Scoring & Analysis: Developing scoring systems to analyze the “Problem → Solution” chain and identify root causes of hallucinations or logic errors using Langfuse analytics.
  • Model Performance & Fine-Tuning: Regularly re-evaluating model performance as new architectures emerge and performing fine-tuning when necessary to meet specific domain requirements.

Job Requirements

  • Node.js & Next.js: Deep knowledge of the stack to build reliable services and handle complex LLM-generated data.
  • Dynamic Prompting Skills: Proven experience in building prompts where content is highly dependent on input variables and context injection.
  • OpenRouter Experience: Experience working with unified APIs, managing rate limits, and selecting the most cost-effective models for specific tasks.
  • Langfuse (or similar): Understanding of LLM observability principles — setting up tracing, creating test datasets, and integrating scoring systems.
  • Evaluation Methodology: Experience with frameworks like RAGAS or building custom “LLM-as-a-judge” systems.
  • Analytical Mindset: Ability to transform raw generation logs into actionable business metrics and technical insights.
  • Iterative Mindset: Focus on continuous product improvement through constant feedback loops.

Benefits

  • Remote Work Environment: Embrace the freedom to work from anywhere, anytime, promoting a healthy work-life balance.
  • Unlimited PTO: Enjoy unlimited paid time off to recharge and prioritize your well-being, without counting days.
  • Paid National Holidays: Celebrate and relax on national holidays with paid time off to unwind and recharge.
  • Company-provided MacBook: Experience seamless productivity with top-notch Apple MacBooks provided to all employees who need them.
  • Flexible Independent Contractor Agreement: Unlock the benefits of flexibility, autonomy, and entrepreneurial opportunities. Benefit from tax advantages, networking opportunities, reduced employment obligations, and the freedom to work from anywhere.
