24-MAG

This opportunity is available through a leading AI-driven work platform.

Applied Machine Learning Evaluation Consultant

Location

United States

Posted

3 days ago

Salary

$100 / hour

Seniority

Mid Level

Job Description

Role Description

We are sharing a specialised part-time consulting opportunity for experienced Machine Learning Engineers and Applied ML Researchers with expertise in end-to-end modeling, dataset analysis, feature engineering, validation strategy, model evaluation, reference solution development, and technical quality review.

This role supports current and upcoming remote consulting opportunities focused on complex machine learning challenge design, applied modeling workflows, reference solution development, technical evaluation, reproducible documentation, and high-quality project execution. Selected professionals will design, solve, and review challenging machine learning tasks that reflect real-world ML development across multiple domains and data modalities.

Key Responsibilities

- End-to-End Machine Learning Solution Development
  - Develop complete machine learning solutions for challenging prediction and modeling problems.
  - Analyze datasets and define appropriate modeling approaches, validation strategies, and evaluation metrics.
  - Perform exploratory data analysis, feature engineering, data preprocessing, model training, tuning, and evaluation.
  - Work across tabular, text, image, time-series, recommendation, ranking, or other applied ML problem types.
- Reference Solutions & Technical Documentation
  - Develop strong reference solutions using industry-standard machine learning techniques and best practices.
  - Document methodologies, assumptions, modeling choices, validation approaches, and evaluation results clearly.
  - Ensure solutions are accurate, reproducible, and technically well-structured.
  - Identify opportunities to improve model performance through systematic experimentation and iteration.
- ML Project Review & Evaluation
  - Review and validate the technical quality of machine learning projects and deliverables.
  - Evaluate modeling choices, data preparation decisions, performance metrics, and experimental design.
  - Identify weak assumptions, data leakage risks, flawed validation, underdeveloped features, or unsupported modeling conclusions.
  - Provide clear written technical feedback that improves correctness, rigor, and reproducibility.

Qualifications

- Master's degree, PhD, or equivalent advanced experience in Computer Science, Machine Learning, Statistics, Mathematics, Electrical Engineering, or a related field.
- 2+ years of hands-on experience developing, training, evaluating, and optimizing machine learning models in a professional or research setting.
- Strong proficiency in Python and modern machine learning frameworks such as scikit-learn, XGBoost, LightGBM, PyTorch, or TensorFlow.
- Demonstrated experience building end-to-end machine learning solutions, including data preparation, model development, validation, and evaluation.
- Strong understanding of model evaluation metrics, validation methodologies, and experimental design.
- Ability to work independently on open-ended machine learning problems and deliver high-quality technical outputs.

Requirements

Relevant experience may include:

- Tabular machine learning.
- Natural language processing.
- Computer vision.
- Recommendation systems.
- Ranking systems.
- Time-series forecasting.
- Applied modeling across structured or unstructured datasets.

Nice to Have

- PhD from a leading research university.
- Experience at leading technology companies, AI-focused teams, research institutions, or high-growth startups.
- Participation in competitive machine learning or data science competitions.
- Experience optimizing models against performance-based evaluation metrics.
- Familiarity with advanced techniques such as ensembling, hyperparameter optimization, transfer learning, foundation model fine-tuning, or reinforcement learning.
- Publications, patents, or significant open-source contributions in machine learning or AI.
- Experience reviewing, mentoring, or evaluating the work of other machine learning practitioners.

Why This Opportunity

- Apply machine learning engineering and applied research expertise to structured remote consulting work.
- Contribute to high-quality ML challenge design, reference solution development, and technical evaluation.
- Work on flexible assignments aligned with your modeling, Python, experimentation, and ML framework experience.
- Use your technical judgment to evaluate complex ML workflows and improve solution quality.
- Remote structure with competitive hourly compensation.

Contract Details

- Independent contractor role.
- Fully remote with flexible scheduling.
- Eligible professionals may be based in approved project locations depending on project needs.
- Project commitment may vary depending on availability and scope.
- Competitive rates up to $100 per hour depending on expertise and project scope.
- Weekly payments via Stripe or Wise.
- Projects may be extended, shortened, or adjusted depending on scope and performance.
- Work will not involve access to confidential or proprietary information from any employer, client, or institution.

About the Platform

This opportunity is available through 24-MAG LLC. We connect experienced professionals with remote consulting opportunities across technical, evaluation, and project-based workstreams.

By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy: https://www.24-mag.com/privacy-policy.

Related Job Pages

More Machine Learning Engineer Jobs

Staff Machine Learning Engineer, Underwriting and Credit

Cash App

Initially built to take the pain out of peer-to-peer payments, Cash App has gone from a simple product with a single purpose to a dynamic app, bringing a better way to send, spend, invest, borrow and save to our millions of monthly active users. Our mission is to redefine the world's relationship with money by making it more relatable, instantly available and universally accessible.

Full Time | Remote | Team 3,500 | Since 2013

Block builds technology to increase access to the global economy. Across our ecosystem, including Square and Cash App, we create tools that help businesses run and grow, and individuals move, manage, and grow their money with confidence. Square empowers sellers of all sizes with integrated, omnichannel tools to accept payments, manage operations, access financial services, and reach customers across online and in-person channels. Cash App complements this by providing a fast, accessible financial platform for millions of people to send, spend, save, invest, and borrow, helping redefine how individuals interact with money.

Operating at massive scale across both ecosystems means trust and safety are foundational. Our teams build systems that protect real people and businesses, safeguard financial activity, and ensure our products remain reliable, secure, and easy to use. Block is a global, distributed company with a culture rooted in ownership, creativity, and impact. Whether supporting sellers on Square or customers on Cash App, we're united by a shared mission: to make the global economy more accessible and inclusive.

The Role

Block has provided over $200 billion in credit to customers globally. Afterpay and Cash App Borrow are our two largest products in this space, expanding access to credit for consumers who are often underserved by traditional financial systems. Machine learning is the core of how these products work. Our models decide who gets credit, how much, and under what terms. They underwrite customers across a wide range of credit profiles, including many with thin or no traditional credit history. The modeling challenges are real: maintaining calibration across diverse borrower populations, designing features that generalize as the portfolio grows, and balancing approval rates against loss performance at every decision point. This requires strong fundamentals, disciplined experimentation, and continuous evaluation in production.

On the Credit Modeling team, you will be a senior individual contributor building and evolving the ML systems behind these products. You will work across the full modeling lifecycle: problem formulation, feature development, training, calibration, experimentation, deployment, monitoring, and iteration. You will operate across one of these lending products with different borrower populations, repayment structures, and regulatory surfaces.

We use agentic engineering and AI tooling to build reliable, high-velocity workflows that enable this work. That includes code generation, automated testing, documentation, and developer tooling. You will help define how these practices scale across the team in ways that are rigorous, auditable, and trusted.

This is a team that values high output and rigor. We move fast, we test carefully, and we hold our work to a high standard because the models we build determine real credit outcomes for real people.

This role is fully remote for candidates based in the US or Canada.

You Will

- Build, evaluate, and maintain underwriting and decisioning models across Cash App Borrow and Afterpay.
- Design and evolve credit decision frameworks, including the modeling, automation, and policy logic that manage credit exposure over time.
- Design and run experiments to evaluate model performance, measure impact on approval rates and loss, and inform credit policy decisions.
- Develop deep understanding of borrower behavior, repayment dynamics, and portfolio structure across both products, and use that to inform model design and decision logic.
- Contribute analysis and perspective that inform portfolio-level decisions, including explaining model behavior, tradeoffs, and uncertainty to senior technical and business leaders.
- Work across the full modeling lifecycle: problem formulation, feature engineering, training, calibration, deployment, monitoring, and iteration in production.
- Build agentic engineering workflows that accelerate development, testing, and documentation.
- Collaborate with Product, Engineering, Legal, Compliance, and Operations to ensure credit systems reflect business goals and regulatory expectations.
- Share modeling context and approaches across teams, helping align how credit risk is measured, interpreted, and discussed.
- Shape how AI developer tooling is adopted across the team, defining review practices, quality standards, and governance patterns.

You Have

- A Bachelor's degree in a quantitative field (e.g., Mathematics, Statistics, Physics, Computer Science). Advanced degrees welcome.
- 10+ years applying AI, machine learning, or statistical modeling in decisioning contexts such as credit, risk, fraud, recommendations, or similar domains.
- Experience with probabilistic models and decision systems, including calibration, score transformations, and interpretation of model outputs.
- Strong experimentation skills: you know how to design holdouts, measure lift, and evaluate models beyond aggregate metrics.
- Experience with model monitoring, degradation detection, and retraining strategies in production systems.
- Proficiency with AI-native development workflows. You use LLMs, agentic coding tools, and AI-assisted automation as a regular part of how you build and ship.
- Experience explaining modeling concepts, results, and limitations to senior stakeholders and cross-functional partners.
- Experience working across disciplines in environments with meaningful constraints.

Technologies We Use and Teach

- Python (NumPy, Pandas, scikit-learn, PyTorch, XGBoost, LightGBM)
- AI development tools as core infrastructure: Claude Code, Cursor, Copilot
- MLflow for experiment tracking and model registry
- Internal feature store and model hosting platform
- Prefect and Airflow for orchestration
- SQL / Snowflake
- GitHub
- GCP / AWS

We're working to build a more inclusive economy where our customers have equal access to opportunity, and we strive to live by these same values in building our workplace. Block is an equal opportunity employer evaluating all employees and job applicants without regard to identity or any legally protected class. We will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and "fair chance" ordinances.

We believe in being fair, and are committed to an inclusive interview experience, including providing reasonable accommodations to disabled applicants throughout the recruitment process. We encourage applicants to share any needed accommodations with their recruiter, who will treat these requests as confidentially as possible. Want to learn more about what we're doing to build a workplace that is fair and square? Check out our I+D page.

Block takes a market-based approach to pay, and pay may vary depending on your location. U.S. locations are categorized into one of four zones based on a cost of labor index for that geographic area. The successful candidate's starting pay will be determined based on job-related skills, experience, qualifications, work location, and market conditions. These ranges may be modified in the future. To find a location's zone designation, please refer to this resource. If a location of interest is not listed, please speak with a recruiter for additional information.

- Zone A: $276,800 - $415,200 USD
- Zone B: $276,800 - $415,200 USD
- Zone C: $276,800 - $415,200 USD
- Zone D: $276,800 - $415,200 USD

Application Guidelines

Candidates may submit up to 9 active applications within a 60-day period. Reapplications to the same role are accepted 90 days after a previous application has been reviewed.

Use of AI in Our Hiring Process

We may use automated AI tools to evaluate job applications for efficiency and consistency. These tools comply with local regulations, including bias audits, and we handle all personal data in accordance with state and local privacy laws. Contact us here with hiring practice or data usage questions.

Every benefit we offer is designed with one goal: empowering you to do the best work of your career while building the life you want. Remote work, medical insurance, flexible time off, retirement savings plans, and modern family planning are just some of our offerings. Check out our other benefits at Block.

Block, Inc. (NYSE: XYZ) builds technology to increase access to the global economy. Each of our brands unlocks different aspects of the economy for more people. Square makes commerce and financial services accessible to sellers. Cash App is the easy way to spend, send, and store money. Afterpay is transforming the way customers manage their spending over time. TIDAL is a music platform that empowers artists to thrive as entrepreneurs. Bitkey is a simple self-custody wallet built for bitcoin. Proto is a suite of bitcoin mining products and services. Together, we're helping build a financial system that is open to everyone.

All locations: California | Canada

Senior Machine Learning Engineer, CX Intelligence

Coinbase

We're building an open financial system for the world.

Full Time | Remote | Team 1,001-5,000 | Since 2012 | H1B Sponsor

• Architect and deploy the orchestration layer that manages state transitions, context sharing, and intent routing across vendor and internal LLM frameworks in a distributed conversational environment.
• Build production-grade Python services that bridge advanced ML/AI research with reliable, measurable customer-facing products.
• Lead end-to-end project execution for complex ML initiatives, managing priorities, technical trade-offs, and cross-functional dependencies from design through delivery.
• Establish best practices for system design, coding standards, and AI/ML development workflows across the team.
• Mentor engineers on architectural integrity and modern AI/ML patterns, raising the technical bar for the broader team.
• Conduct design reviews to ensure every feature meets Coinbase's standards for security, scalability, and performance.

Brazil
R$455.5K / year
Full Time | Remote | Team 201-500 | Since 2014 | H1B Sponsor

• Design, implement, and evaluate RL algorithms for robotic control, motion planning, and adaptive behaviors in dynamic, unstructured environments.
• Develop and integrate RL policies with robot control systems, ensuring compatibility with hardware constraints and real-time requirements.
• Collaborate with perception teams to fuse RL with vision, depth, and sensor data for robust decision-making.
• Build and maintain sim-to-real pipelines, including domain randomization and transfer learning techniques.
• Conduct experiments on physical robots, including designing safety protocols and monitoring for unexpected behaviors.
• Leverage simulation environments (Isaac Gym, Gazebo, MuJoCo, PyBullet) for large-scale training before real-world validation.
• Continuously improve model efficiency to operate within compute and latency constraints on embedded robotic systems.

Ohio
Job Closed
Full Time | Remote | Team 51-200 | H1B No Sponsor

• Own technical design and delivery of subsystems in a high-throughput, low-latency inference platform capable of handling multi-tenant, enterprise-grade inference workloads.
• Develop robust API layers (gRPC, WebSockets, REST, etc.) and developer SDKs that abstract complex distributed inference orchestration into seamless, reliable token streams.
• Build and harden a multi-tenant control plane to enable accurate metering, rate limiting, quotas, tenant isolation and noisy-neighbor fairness across the platform.
• Optimize inference performance across the entire system stack, including the model engine layer.
• Build observability and SLOs to gain insights into system economics, cache-hit rates, GPU utilization and cost accounting per model and per tenant.
• Partner with product and infrastructure teams on model onboarding, capacity planning, external API contracts and customer adoption.
• Decompose ambiguous work, drive issues to closure, and raise the engineering bar through code quality, reviews, testing, and mentoring.

Pennsylvania