Making AI Work
AI QA Trainer – LLM Evaluation
Location: Worldwide
Posted: 78 days ago
Salary: $6 - $65 / hour
Seniority: Senior
Job Description
Invisible Technologies
- Converse with the model on real-world scenarios and evaluation prompts
- Verify factual accuracy and logical soundness
- Design and run test plans and regression suites
- Build clear rubrics and pass/fail criteria
- Capture reproducible error traces with root-cause hypotheses
- Suggest improvements to prompt engineering, guardrails, and evaluation metrics (e.g., precision/recall, faithfulness, toxicity, and latency SLOs)
- Partner on adversarial red-teaming, automation (Python/SQL), and dashboarding to track quality deltas over time
Job Requirements
- Bachelor’s, master’s, or PhD in computer science, data science, computational linguistics, statistics, or a related field is ideal
- Shipped QA for ML/AI systems
- Safety/red-team experience
- Test automation frameworks (e.g., PyTest)
- Hands-on work with LLM eval tooling (e.g., OpenAI Evals, RAG evaluators, W&B)
- Skills that stand out include: evaluation rubric design, adversarial testing/red-teaming, regression testing at scale, bias/fairness auditing, grounding verification, prompt and system-prompt engineering, test automation (Python/SQL), and high-signal bug reporting
- Clear, metacognitive communication—“showing your work”—is essential.
Benefits
- Company-sponsored benefits such as health insurance do not apply
- You’ll supply a secure computer and high-speed internet
More QA Engineer Jobs
Graphic Designer – Photoshop, Visual QA
Floowi Inc
We help marketing agencies hire top offshore talent in 15 days - Fully guaranteed
- Review images and videos to confirm accuracy and adherence to quality standards.
- Execute light Photoshop edits (resizing, background cleanup, minor adjustments, basic color corrections).
- Track project statuses, flag risks, and ensure on-time delivery of complete asset sets.
- Verify that all required media files are present and properly packaged for client delivery.
- Draft clear delivery emails and handoff notes for clients and internal teams.
- Maintain file hygiene (naming/versioning/exports) and support basic office tasks (data entry, product uploads).
Crypto & Web3 Beta Tester
Livit
International support ecosystem & Bali-based innovation hub for entrepreneurs and startup teams.
- Up to 60-minute interview
- Conversation around your:
  - Investing habits
  - Due diligence process
  - Analytics & tools you use
  - Pains, delights, and best practices in crypto investing
Quality Assurance Engineer
Carrot Institute
Learn to code. Join Carrot Institute & learn the most in-demand skills for full-stack web & mobile development.
- Collaborate with cross-functional teams to understand project requirements and user stories
- Design and develop test plans, test cases, and test scenarios based on project specifications
- Execute functional, regression, integration, and performance tests to ensure software reliability and adherence to quality standards
- Identify, document, and track software defects and inconsistencies using bug tracking systems
- Work closely with developers to investigate and troubleshoot issues, providing detailed bug reports and test results
- Participate in requirements and design reviews to provide input on testability and quality aspects of the software
- Stay up to date with industry best practices and emerging trends in software testing and quality assurance
- Contribute to the continuous improvement of QA processes, methodologies, and tools
- Collaborate with cross-functional teams to ensure timely and effective delivery of high-quality software solutions
- Provide accurate and timely progress reports and test metrics to project stakeholders
QA Process and System Specialist
Clinglobal
A science-based and quality-driven group providing solutions to support and accelerate innovation in Animal Health
- Primary lead for scheduling and coordinating third-party qualification audits.
- Monitor and oversee CAPA (Corrective and Preventive Actions) progression and site deviations, ensuring timely resolution and effectiveness.
- Facilitate receipt, distribution, and tracking of Sponsor and/or Regulatory audit reports and responses.
- Serve as the primary QA liaison for review of global Standard Operating Procedures (SOPs).
- Provide backup support for study-based audits across Clinglobal companies as required.
- Review of Computerized System validation and change management documents.
- Contribute to continuous improvement of QA processes and systems.



