Job Closed
This listing is no longer active.
AI Evaluation Engineer – Mathematics & Algorithms
Location
Pakistan
Posted
40 days ago
Salary
0
Seniority
Senior
Job Description
Gramian Consulting
• Design and build **multi-agent benchmark tasks** requiring multi-step mathematical reasoning and algorithmic problem-solving
• Create **complex, decomposable problems** across domains such as:
  - Competition mathematics
  - Numerical analysis
  - Combinatorial optimization
  - Statistical inference
• Develop **verification scripts** to validate:
  - Numerical outputs (with tolerance thresholds)
  - Proof correctness and logical steps
  - Algorithmic outputs and constraints
• Write **clear, structured problem statements** with precise notation and defined outputs
• Design **task decomposition strategies** for parallel or multi-agent execution
• Implement computational solutions and validation pipelines using Python
• Work with containerized environments (Docker) for reproducibility and evaluation
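For illustration, a verification script of the kind described above might check a submitted numerical answer against a reference value within a tolerance. This is only a minimal sketch, not the employer's actual tooling: the function name `verify_numeric` and the default tolerances are assumptions chosen for the example.

```python
import math


def verify_numeric(submitted, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if `submitted` matches `expected` within tolerance.

    Hypothetical helper: the name and default tolerances are illustrative.
    """
    # NaN and infinities never count as a valid match.
    if not (math.isfinite(submitted) and math.isfinite(expected)):
        return False
    # Relative tolerance handles large magnitudes; absolute tolerance
    # handles expected values at or near zero.
    return math.isclose(submitted, expected, rel_tol=rel_tol, abs_tol=abs_tol)


if __name__ == "__main__":
    print(verify_numeric(0.1 + 0.2, 0.3))
    print(verify_numeric(0.3001, 0.3))
```

Combining a relative and an absolute tolerance is a common choice because a purely relative check fails when the expected answer is zero.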
Job Requirements
- 5+ years in mathematics, quantitative research, or computational science, with a background in competition math or university-level mathematics.
- Python programming: NumPy, SciPy, or symbolic computation (SymPy).
- Experience writing mathematical proofs or formal derivations.
- Ability to create problems with precise, verifiable answers — not subjective or open-ended.
- Experience with AI coding benchmarks (SWE-bench, Terminal-bench)
- Comfortable with Docker — writing Dockerfiles, building images, and debugging container issues.
- Understanding of numerical methods — floating point tolerance, convergence criteria, error bounds.
Nice to Have
- Experience creating competition math problems (AMC, AIME, Putnam, IMO)
- Background in **theoretical computer science or advanced mathematics research**
- Exposure to **automated theorem proving or formal verification**
- Familiarity with AI reasoning benchmarks (GSM8K, MATH, AIME, GPQA, ARC-AGI)
- Experience in **large-scale numerical or scientific computing**




