Agent Quality / Evals Engineer

Location

Worldwide

Posted

2 days ago

Salary

$2K / month

Seniority

Mid Level

Job Description

SOFTGIC

Role Description

This is a remote position. Owns the eval harness and quality gate from the beginning. This role replaces the old late-stage “Evals Specialist” model with a standing owner for measurable agent quality.

Key Responsibilities

- Build and maintain the MVP eval harness: golden tasks, exception tasks, scorecard metrics, and regression packs.
- Wire evals into CI so quality regressions fail builds and releases.
- Define and maintain release-gate thresholds with Product and the Tech Lead.
- Lay the path for later adversarial and drift-testing expansion without overbuilding MVP scope.

Qualifications

- Experience evaluating ML, LLM, or non-deterministic systems.
- Strong test and benchmark design capability.
- Comfort working with noisy metrics, thresholds, and probabilistic behavior.
- Good scripting and automation skills.

Requirements

- Uses AI to generate candidate eval cases and failure hypotheses, but never confuses generated tests with validated quality.
- Approaches AI quality as an operating system, not a QA afterthought.

What Success Looks Like in the First 90 Days

- The first reference agent has a published scorecard and gated eval path.
- Golden and exception tests run automatically.
- The team can explain what “good enough to ship” means in measurable terms.
