Job Closed
This listing is no longer active.
We are a Y-Combinator-backed startup building your AI-powered Recruiter Agent
Generalist Evaluator Expert
Location
United States
Posted
98 days ago
Salary
$35-$40 per hour
Seniority
Mid Level
Job Description
Weekday (YC W21)
This role is for one of our clients.
Compensation: $35-$40 per hour
We are seeking detail-oriented writing professionals to contribute to a high-impact AI research initiative in collaboration with a leading research lab. In this role, you will develop high-quality prompt–golden answer pairs used to train and evaluate advanced language models. This is a short-term, flexible opportunity ideal for individuals with strong academic foundations and exceptional clarity in written communication. The role is well-suited for professionals who enjoy translating complex ideas into structured, precise, and easy-to-understand content.
Job Requirements
Key Responsibilities
- Design and Optimize Prompts: Develop detailed, constraint-rich prompts with clear instructions and multiple requirements
- Define Evaluation Standards: Establish expectations for high-quality responses in general consumer contexts and create comprehensive grading rubrics
- Model Testing and Assessment: Execute prompts using AI systems and evaluate outputs against defined standards
- Benchmarking & Quality Assurance: Collaborate in QA processes to ensure prompt tasks and rubrics meet high standards of rigor, clarity, and consistency before inclusion in benchmarking workflows
- Maintain structured documentation and adhere to project guidelines
Minimum Qualifications
- Bachelor’s degree (BS or BA) from a reputable institution (completed or in progress)
- Strong writing, analytical, and critical thinking skills
- Ability to work independently and meet structured deadlines
- Meaningful familiarity with ChatGPT or similar AI tools for personal, academic, or professional use
- Must be based in the United States or Canada
Preferred Qualifications
- Experience in teaching, curriculum design, academic research, or structured evaluation
- Experience developing grading rubrics or assessment frameworks
Project Details
- Start: Immediate
- Duration: Approximately 2 months
- Commitment: Minimum 20 hours per week
- Fully remote with flexible scheduling
- Structured project environment with defined goals, workflows, and tools
Application & Onboarding Process
- Complete a short AI-led interview (approximately 15 minutes)
- Complete a 45-minute written assessment focused on rubric development
- Selected candidates will receive project onboarding instructions
Contract & Payment Terms
- Engagement will be structured as an independent contractor agreement
- Work can be completed remotely on your own schedule
- Projects may be extended, shortened, or concluded early based on performance and evolving project needs
- Assignments will not require access to confidential or proprietary information from any employer, client, or institution
- Payments are processed weekly via Stripe or Wise based on services rendered
- Visa sponsorship is not available; H-1B and STEM OPT candidates cannot be supported at this time


