Job Closed

This listing is no longer active.

Toloka Annotators

Be a key player in crafting the high-quality data essential for AI innovation. Perfect for aspiring freelancers

Freelance Annotator (English) - AI Trainer

LLM Engineer | Machine Learning Engineer | Other | Remote | Mid Level | Team 51-200

Location

Texas

Posted

70 days ago

Salary

0

Seniority

Mid Level

English

Job Description

Please submit your resume in English and indicate your level of English.

At Toloka, we connect smart, curious people from around the world with freelance online tasks that train and improve artificial intelligence.

What we do

Toloka Annotators connects individuals with Generative AI projects from leading tech innovators. Our mission is to unlock the full potential of AI by involving real people from around the world in the development process.

About the Role

Annotation is what helps AI make sense of the world. As an annotator, you may be invited, when projects are available, to take part in online projects such as rating AI-generated content, evaluating factual accuracy, or comparing responses.

Responsibilities

  • Carefully review provided data (text, images, or videos)
  • Label or classify content based on project guidelines
  • Identify and flag factually incorrect, sensitive, inappropriate, or unclear material

Important note

This is project-based work. Tasks are available only when projects are active. You may be invited to one or more projects depending on your profile and current opportunities. Each project has its own compensation level based on scope and expertise required. On this project, AI trainers earn up to $23 per hour equivalent.

Job Requirements

  • Bachelor’s degree in any discipline
  • Minimum 1 year of experience in any professional role
  • Advanced level of English (C1 or higher), both written and spoken
  • Logical thinking, fact-checking and reasoning abilities
  • Strong attention to detail and ability to understand and follow complex instructions
  • Strong communication skills, including the ability to ask clarifying questions when needed
  • Genuine interest in technology and artificial intelligence

Benefits

Why this freelance opportunity might be a great fit for you:

  • Take part in a part-time, remote, freelance project that fits around your primary professional or academic commitments.
  • Work on advanced AI projects and gain valuable experience that enhances your portfolio.
  • Influence how future AI models understand and communicate in your field of expertise.

Related Job Pages

More LLM Engineer Jobs

Japanese Senior Prompt Engineer: LLM Migration & Optimization

Welo Data

Join our team at Welo Data and embark on a journey of growth and innovation.

LLM Engineer | 70 days ago
Part Time | Remote | Team 1,001-5,000

Role Description

Are you an expert at navigating the complex architecture of Large Language Models? Welo Data is seeking a highly technical Senior Prompt Engineer based in Japan to lead the end-to-end migration of template workflows into high-performance LLM autoraters. This is a specialized role for a technical architect who understands that "perfecting a prompt" is a rigorous engineering discipline. You will leverage advanced APG/APO tools and manual refinement to ensure our automated systems meet, and exceed, human accuracy baselines in both German and English contexts.

The Mission: Automated Quality at Scale

  • Architectural Migration: Take full ownership of the end-to-end technical migration of templates to LLM autoraters.
  • Optimization Leadership: Utilize Automatic Prompt Generation (APG) and supervise Automated Prompt Optimization (APO) tools to push model performance past plateaus and logic deadlocks.
  • Metrics-Driven Excellence: Continuously measure quality against "gold data" baselines, tracking precision, recall, and F1 scores to justify launch readiness.
  • Edge-Case Engineering: Manually draft and refine complex prompts to overcome anti-patterns and architecture gaps that automated tools cannot solve.

Qualifications

  • Bachelor’s, Master’s, or PhD in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
  • 4+ years of experience tuning LLMs for strict, structured outputs, complex classification, and few-shot learning.
  • High proficiency in identifying error patterns and using SQL or data analytics tools to monitor performance.
  • Fast learner capable of mastering proprietary internal tools and "Goose API" style interfaces with minimal oversight.

Requirements

  • Native fluency in Japanese and professional fluency in English.
  • Part-Time (Set your own hours within project milestones).
  • 100% Remote (Must be currently based in Japan).
  • Freelance / Independent Contractor.

Preferred Technical Skills

  • Familiarity with shadowbot monitoring and disagreement tracking between human and LLM ratings.
  • Hands-on experience with Chain-of-Thought (CoT) prompting and APO systems.
  • Deep linguistic expertise, including a strong understanding of semantics and formal logic.
  • Proven ability to draft high-level Launch Certification Documentation.

Japan
Language Model Evaluator

Mercor

Cincinnatus is an enterprise staffing company that partners with leading technology companies to source and employ highly skilled professionals for full-time and long-term contingent roles. Cincinnatus serves as the employer of record for these engagements, providing W-2 employment, payroll, benefits, and compliance, while placing employees directly within client teams to work on high-impact initiatives. Roles hired through Cincinnatus are not project-based or freelance engagements. They are structured, role-based positions that typically involve full-time or fixed-term commitments, close collaboration with a client's internal teams, and integration into standard enterprise workflows. Cincinnatus is a legal entity separate from Mercor. While opportunities may be discovered through Mercor's platform, employment, onboarding, payroll, and benefits for these roles are administered by Cincinnatus. Equal Employment Opportunity Cincinnatus is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or any other legally protected characteristic. Cincinnatus is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans throughout the job application process.

LLM Engineer | 70 days ago
Other | Remote | H1B No Sponsor

Role Description

Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position: Language Model Evaluator
Type: Full-time or Part-time Contract Work
Compensation: $23/hour
Location: Geography restricted to Egypt, Saudi Arabia, UAE, USA

Role Responsibilities

  • Evaluate LLM-generated responses on their ability to effectively answer user queries.
  • Conduct fact-checking using trusted public sources and external tools.
  • Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies.
  • Assess reasoning quality, clarity, tone, and completeness of responses.
  • Ensure model responses align with expected conversational behavior and system guidelines.
  • Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines.

Qualifications

Must-Have:

  • Bachelor’s degree
  • Native speaker or ILR 5/primary fluency (C2 on the CEFR scale) in Arabic
  • Significant experience using large language models (LLMs)
  • Excellent writing skills
  • Strong attention to detail
  • Adaptable and comfortable moving across topics, domains, and customer requirements
  • Background or experience in domains requiring structured analytical thinking
  • Excellent college-level mathematics skills

Preferred:

  • Prior experience with RLHF, model evaluation, or data annotation work
  • Experience writing or editing high-quality written content
  • Experience comparing multiple outputs and making fine-grained qualitative judgments
  • Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

Application Process

  • Upload resume
  • AI interview based on your resume
  • Submit form

Resources & Support

  • For details about the interview process and platform information, please check: Interview Process Details
  • For any help or support, reach out to: support@mercor.com

PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.

All locations: United States | Egypt | United Arab Emirates | Saudi Arabia
$23 / hour
Job Closed
Prompt Engineer

Welo Data

Join our team at Welo Data and embark on a journey of growth and innovation.

LLM Engineer | 70 days ago
Temporary | Remote | Team 1,001-5,000

Role Description

Are you an expert at navigating the complex logic of Large Language Models? Welo Data is seeking a technical Prompt Engineer to lead the end-to-end migration of template workflows into high-performance LLM autoraters. This isn't just about writing prompts; it’s about engineering a technical bridge. You will use advanced APG/APO tools and manual refinement to ensure our automated systems meet (and exceed) human accuracy baselines.

The Mission: Architecting the Future of Rating

  • Technical Migration: Take ownership of the workflow for transitioning templates to LLM autoraters.
  • Optimization Leadership: Run and supervise Automated Prompt Optimization (APO) tools, identifying where logic plateaus and providing the manual "spark" to overcome deadlocks.
  • Metrics-Driven Accuracy: Continuously measure quality against gold data, calculating critical performance metrics like precision, recall, and F1 scores.
  • Edge-Case Engineering: Solve complex scenarios by designing manual prompts that handle anti-patterns and broken logic in legacy architectures.

Qualifications

  • 2+ years of experience as a Prompt Engineer.
  • BS, MS, or PhD in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
  • Fast learner capable of mastering proprietary internal tools and interfaces (like the Goose API) with minimal supervision.
  • Strong ability to identify error patterns and use SQL or data analytics tools to analyze model performance.

Requirements

  • Familiarity with shadowbot disagreement tracking between humans and LLMs.
  • Hands-on experience with Chain-of-Thought (CoT) and few-shot learning.
  • Proven ability to draft high-level Launch Certification Documentation.

Benefits

  • Part-Time (Flexible hours within project milestones)
  • 100% Remote (Must be based in the United States)
  • Freelance / Independent Contractor

United States
Senior Prompt Engineer

Welo Data

Join our team at Welo Data and embark on a journey of growth and innovation.

LLM Engineer | 70 days ago
Temporary | Remote | Team 1,001-5,000

Role Description

Are you an expert at navigating the complex architecture of Large Language Models? Welo Data is seeking a Senior Prompt Engineer to lead the technical migration of template workflows into high-performance LLM autoraters. This role is designed for a technical specialist who understands that "perfecting a prompt" is a rigorous engineering discipline. You will leverage advanced APG/APO tools and manual refinement to ensure our automated systems meet, and exceed, human crowd baselines in accuracy and nuance.

The Mission: Automated Quality at Scale

  • Architectural Migration: Take full ownership of the end-to-end technical migration of templates to LLM autoraters.
  • Optimization Leadership: Utilize Automatic Prompt Generation (APG) and supervise Automated Prompt Optimization (APO) tools to push model performance past plateaus and deadlocks.
  • Metrics-Driven Excellence: Continuously measure quality against "gold data" baselines, tracking precision, recall, and F1 scores to justify launch readiness.
  • Edge-Case Engineering: Manually draft and refine complex prompts to overcome anti-patterns and architecture gaps that automated tools can't solve.

Qualifications

  • Bachelor’s, Master’s, or PhD in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
  • 4+ years of experience tuning LLMs for strict, structured outputs, complex classification, and few-shot learning.
  • High proficiency in identifying error patterns and using SQL or data analytics tools to monitor performance.
  • Fast learner capable of mastering proprietary internal tools and "Goose API" style interfaces with minimal oversight.

Requirements

  • Familiarity with shadowbot monitoring and disagreement tracking.
  • Experience in AI model evaluation and software engineering.
  • Deep understanding of semantics, logic, and Chain-of-Thought (CoT) prompting.
  • Proven ability to draft high-level Launch Certification Documentation.

Recruitment & Onboarding

  • Technical Review: Submit your CV and portfolio of LLM optimization work.
  • Prompt Assessment: Demonstrate your ability to navigate complex template clusters and APO deadlocks.
  • Tooling Deep-Dive: Get access to our internal technical suite and migration workflows.
  • Launch: Begin your part-time engagement and lead the shift to LLM-driven autorating.

United States
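
The prompt engineering listings above mention measuring autorater quality against gold data using precision, recall, and F1. As a rough illustration only, here is a minimal Python sketch of those three metrics for a single label. The function name `precision_recall_f1`, the label values, and the sample data are hypothetical and are not taken from any of the listings or from any employer's tooling.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[str], predicted: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Score predicted labels against gold labels for one positive class.

    Hypothetical helper for illustration; not an API from any listing.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: both gold and prediction carry the positive label.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    # False positives: predicted positive, but gold says otherwise.
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    # False negatives: gold is positive, but the prediction missed it.
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up sample: human "gold" ratings versus autorater ratings.
gold_labels = ["good", "bad", "good", "good", "bad"]
autorater_labels = ["good", "good", "good", "bad", "bad"]
p, r, f1 = precision_recall_f1(gold_labels, autorater_labels, positive="good")
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this made-up sample the autorater agrees with the gold data on two of the three "good" items and wrongly marks one "bad" item as "good", so precision, recall, and F1 all come out to 2/3.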