We invest in people from Latam to bridge the talent gap in AI.
Computer Science PhD
Location
Finland
Posted
4 days ago
Salary
$150 / hour
Seniority
Mid Level
Job Description
Anyone AI
Role Description
We are looking for experienced Machine Learning and AI Experts to help train and evaluate advanced AI models. In this role, you will apply your domain expertise to assess AI-generated content, create high-quality technical prompts and answers, and evaluate model responses for accuracy, relevance, and clarity. This is a remote, flexible opportunity for experts who want to contribute to the development of better AI systems.
Compensation: US$150 per hour
Key Responsibilities
- Assessing the factual accuracy, technical quality, and relevance of AI-generated text related to Machine Learning and Artificial Intelligence.
- Crafting and answering advanced questions across Machine Learning, AI, and adjacent technical domains.
- Evaluating, comparing, and ranking AI-generated responses based on correctness, reasoning quality, completeness, and clarity.
- Identifying errors, inconsistencies, hallucinations, or weak reasoning in model outputs.
- Providing clear feedback that helps improve the performance of AI systems.
Qualifications
- A PhD in Machine Learning, Artificial Intelligence, or an adjacent field, such as Computer Science, Statistics, Applied Mathematics, Computational Neuroscience, Natural Language Processing, Computer Vision, Reinforcement Learning, or Data Science.
- Experience working as a Machine Learning Engineer, AI Researcher, Applied Scientist, Data Scientist, or in a comparable highly technical or analytical role.
- Strong understanding of core Machine Learning and AI concepts, including model training, evaluation, optimization, and real-world application.
- Ability to write clearly, precisely, and fluently in English about complex technical topics.
- Strong analytical judgment and attention to detail.
Work Arrangement
- Remote
- Flexible hours
More AI Research Scientist Jobs
Postdoctoral Scholar – AI Researcher, Critical Mineral Discovery
KoBold Metals
KoBold Metals discovers the battery minerals containing Ni, Cu, Co, and Li critical for the electric vehicle revolution.
• Develop and apply AI, muon tomography, seismic imaging, and geophysical inversion methods to discover and characterize copper, nickel, lithium, cobalt, and rare earth deposits.
• Develop stochastic and ensemble inversion frameworks that assimilate muon flux, seismic, magnetics, gravity, and EM data into 3D subsurface property models.
• Forward modeling, sensor placement optimization, and inversion of cosmic-ray muon attenuation data.
• Apply deep generative models, geostatistical priors, and physics-informed neural networks for resource estimation.
• Extend Mineral-X’s intelligent agent framework for data acquisition decisions.
AI Research Intern, Ethics and Readiness
NPR
An extra dose of NPR: the stories behind the stories plus corporate news and announcements.
• Collaborate with a cross-functional group of technologists, product managers, and archivists to shape the future of ethical AI in journalism.
• Use generative AI to surface archival content for contemporary use.
• Partner with RAD to design and prototype workflows that transform unstructured archival data into structured, discoverable assets.
• Establish standards for the ethical use of artificial intelligence with NPR's archival content.
• Prepare digital collections for machine learning and research approaches that pair AI with human editorial judgment and metadata to find new insights from the archive.
• Identify opportunities and build functional prototypes that enable reporters and producers to query historical audio for contemporary reporting.
• Assist in the development of automated workflows to transform unstructured legacy media into a searchable, AI-indexed library.
• Analyze historical data troves to surface evergreen content opportunities and extract narrative leads that support contemporary reporting.
• Evaluate AI outputs and discovery tools for accuracy and bias to ensure technical prototypes maintain archival integrity.
• Review and evaluate AI-generated outputs related to meditation, wellbeing, mental health, sleep, anxiety, stress, and behavior change.
• Assess whether AI outputs are clinically appropriate, evidence-informed, safe, clear, and useful for real-world users.
• Evaluate the strength and quality of evidence behind wellbeing, meditation, and psychotherapeutic techniques, including limitations and contraindications.
• Help define frameworks for assessing indications, contraindications, risk, evidence quality, internal and external validity, and real-world applicability.
• Work with engineers and product teams to improve prompts, AI workflows, evaluations, and user-facing wellbeing experiences.
• Translate clinical, psychological, and research concepts into clear product guidance that engineers and designers can act on.
• Help build internal evaluation systems for AI outputs, including rubrics, review processes, and quality standards.
• Contribute to the development of safe, high-quality AI experiences that are grounded in meditation, wellbeing, and mental health expertise.
• Support the creation of structured content and intervention pathways across areas such as sleep, anxiety, stress, mindfulness, and emotional wellbeing.
• Help identify where AI outputs are unsupported, overstated, unsafe, unclear, or misaligned with best available evidence.
• Collaborate with internal and external researchers on studies, evidence reviews, and potential peer-reviewed publications.
• Help shape Insight Timer’s broader research agenda around meditation, wellbeing, mental health, AI, and consumer health products.
• Design, build, and iterate on agentic AI systems for complex healthcare workflows, including documentation, coding, denial management, appeals, and revenue cycle automation.
• Develop long-horizon agent behavior across context construction, retrieval, tool use, memory, routing, verification, escalation, and human-in-the-loop review.
• Define what “good” looks like for clinical agents end-to-end, translating expert workflows into specifications, rubrics, gold standards, test cases, and clinically meaningful success criteria.
• Build rigorous evaluation and feedback loops using expert review, production logs, model outputs, and benchmarks to measure performance, regressions, edge cases, safety, reliability, provenance quality, and business impact.
• Prototype new AI capabilities from 0 → 1, then harden them into reliable, explainable, auditable production systems with clear contracts, monitoring, evidence, rationale, and performance gates.
• Partner with research and ML engineering teams on model selection, fine-tuning, reward modeling, distillation, synthetic data, post-training, and internal AI infrastructure, including instrumentation, experiment tracking, benchmarking, prompt/version management, and reproducible evaluation.




