Job Closed
This listing is no longer active.
We are a Y-Combinator-backed startup building your AI-powered Recruiter Agent
Engineering Expert – AI Systems Evaluation
Location: United States
Posted: 102 days ago
Salary: $73 / hour
Seniority: Senior
Job Description
Weekday (YC W21)
• Develop and refine prompts to guide AI behavior in engineering-specific scenarios
• Evaluate model-generated responses for technical correctness, applied reasoning, completeness, and practical relevance
• Fact-check technical claims using authoritative public sources and domain expertise
• Annotate outputs by identifying conceptual gaps, flawed assumptions, and factual inaccuracies
• Assess clarity, structure, and appropriateness of explanations for various audiences
• Ensure responses align with expected conversational standards and system-level guidelines
• Apply structured evaluation frameworks, taxonomies, and benchmarking standards consistently
Job Requirements
- PhD in Engineering or a closely related field
- Deep expertise in one or more of the following domains:
  - Mechanical & Physical Systems Engineering
  - Electrical, Electronic & Computer Engineering
  - Chemical, Materials & Process Engineering
  - Civil, Environmental & Infrastructure Engineering
- Strong familiarity with large language models (LLMs) and their practical applications
- Excellent written communication skills with the ability to clearly explain complex technical concepts
- High attention to detail and ability to detect subtle technical inaccuracies
- Experience reviewing, editing, or critiquing technical or academic writing
- Experience in applied research, industry engineering workflows, or systems design (preferred)
- Experience with reinforcement learning from human feedback (RLHF), model evaluation, or structured data annotation (preferred)
- Experience teaching, mentoring, or explaining engineering concepts to non-expert audiences (preferred)
- Familiarity with structured evaluation rubrics, benchmarks, or quality assurance frameworks (preferred)



