Mercor

Cincinnatus is an enterprise staffing company that partners with leading technology companies to source and employ highly skilled professionals for full-time and long-term contingent roles. Cincinnatus serves as the employer of record for these engagements, providing W-2 employment, payroll, benefits, and compliance, while placing employees directly within client teams to work on high-impact initiatives.

Roles hired through Cincinnatus are not project-based or freelance engagements. They are structured, role-based positions that typically involve full-time or fixed-term commitments, close collaboration with a client's internal teams, and integration into standard enterprise workflows.

Cincinnatus is a legal entity separate from Mercor. While opportunities may be discovered through Mercor's platform, employment, onboarding, payroll, and benefits for these roles are administered by Cincinnatus.

Equal Employment Opportunity

Cincinnatus is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or any other legally protected characteristic. Cincinnatus is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans throughout the job application process.

DevOps Engineer - AI Model Evaluator

DevOps Engineer | Part Time | Remote | Mid Level | H1B No Sponsor

Location

Poland

Posted

1 day ago

Salary

$85 / hour

Seniority

Mid Level

Job Description

Role Description

Mercor connects elite creative and technical talent with leading AI research labs. Mercor is headquartered in San Francisco, and its investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey.

Position: DevOps / SRE / Cloud Engineer (Coding Agent Experience)
Type: Contract
Compensation: $85/hour
Location: Remote

Role Responsibilities

- Use frontier AI coding agents to complete and evaluate complex infrastructure engineering tasks.
- Review model-generated implementations involving cloud platforms, Kubernetes, CI/CD systems, and infrastructure automation.
- Identify bugs, edge cases, reliability issues, and failure modes in model outputs.
- Compare outputs from multiple frontier models to assess strengths and weaknesses.
- Apply professional engineering judgment to realistic infrastructure engineering scenarios.

Qualifications

- Must-Have: 2+ years of professional DevOps, SRE, or Cloud Engineering experience.
- Experience with AWS, Azure, GCP, Kubernetes, Terraform, CI/CD pipelines, or observability tooling.
- Regular use of AI coding agents like Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or similar tools.
- Ability to evaluate model-generated infrastructure and reliability engineering solutions.
- Preferred: Experience supporting production-scale systems.

Requirements

- $400 per accepted task. Compensation is tied to accepted work.

Application Process

- Upload resume
- AI interview based on your resume
- Submit form

Resources & Support

- For details about the interview process and platform information, please check: Interview Process
- For any help or support, reach out to: support@mercor.com
- PS: Our team reviews applications daily. Please complete your AI interview and application steps to be considered for this opportunity.

More DevOps Engineer Jobs

Junior DevSecOps Engineer

Multiplica Talent

We connect extraordinary talent with forward thinking companies.

Full Time | Remote | Team 201-500 | Since 2003 | H1B No Sponsor

• Design, implement, and optimize continuous integration and continuous deployment (CI/CD) processes, cloud infrastructure, and security practices.
• Promote a DevSecOps culture within development teams.

Mexico

Senior Site Reliability Engineer

Autodesk

How the world gets designed and made. #MakeAnything

Full Time | Remote | Team 10,001+ | Since 1982 | H1B No Sponsor

• Serve as a primary owner for the reliability, availability, performance, operability, and capacity of one or more production services
• Deploy, operate, maintain, and continuously improve production services running in Autodesk GovCloud environments
• Partner with engineering teams to ensure services are designed with reliability, scalability, security, and operability in mind
• Define and operate reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
• Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
• Design, develop, and maintain software, automation, and tooling that improve the reliability, scalability, and efficiency of production systems
• Implement and improve monitoring, alerting, logging, tracing, and observability capabilities across supported services
• Lead and participate in incident response, troubleshooting, and post-incident reviews focused on learning and continuous improvement
• Develop and maintain operational documentation, runbooks, and recovery procedures
• Scale and enhance resilience testing and Gameday practices to validate system behavior, recovery capabilities, and operational readiness
• Continuously identify and eliminate operational toil through software engineering, automation, and process improvement
• Ensure supported services remain compliant with Autodesk security, privacy, and regulatory requirements, including FedRAMP and related controls where applicable
• Participate in a 24x7 on-call rotation for production services

All locations: Idaho | Texas
$117K - $209.3K / year

Senior Site Reliability Engineer

Autodesk

How the world gets designed and made. #MakeAnything

Full Time | Remote | Team 10,001+ | Since 1982 | H1B No Sponsor

• Serve as a primary owner for the reliability, availability, performance, operability, and capacity of one or more production services
• Deploy, operate, maintain, and continuously improve production services running in Autodesk GovCloud environments
• Partner with engineering teams to ensure services are designed with reliability, scalability, security, and operability in mind
• Define and operate reliability practices such as SLOs/SLIs, error budgets, production readiness reviews, service reviews, and operational health reviews
• Build automation to improve deployment safety, operational efficiency, incident response, and service recovery
• Design, develop, and maintain software, automation, and tooling that improve the reliability, scalability, and efficiency of production systems
• Implement and improve monitoring, alerting, logging, tracing, and observability capabilities across supported services
• Lead and participate in incident response, troubleshooting, and post-incident reviews focused on learning and continuous improvement
• Develop and maintain operational documentation, runbooks, and recovery procedures
• Scale and enhance resilience testing and Gameday practices to validate system behavior, recovery capabilities, and operational readiness
• Continuously identify and eliminate operational toil through software engineering, automation, and process improvement
• Ensure supported services remain compliant with Autodesk security, privacy, and regulatory requirements, including FedRAMP and related controls where applicable
• Participate in a 24x7 on-call rotation for production services
• Function effectively in a fast-paced environment while helping establish and mature operational excellence practices for Autodesk GovCloud

Idaho
$117K - $209.3K / year

AI Deployment Engineer

Gladly

Radically personal customer service software.

Full Time | Remote | Team 51-200 | H1B Sponsor

• Partner with customers to decompose ambiguous goals into concrete, buildable AI use cases, uncovering hidden complexity and edge cases along the way.
• Determine whether the data a use case needs is available, identify the right APIs or MCP sources, and secure access.
• Use Gladly’s CLI to register APIs on the App Platform, making customer data accessible to Gladly AI and agents.
• Write app actions in JavaScript to condense large API payloads down to the fields the AI actually needs.
• Build the workflows and guides that tell Gladly’s AI how to use that information and respond to the customer.
• Own use cases end to end after launch: monitor performance, optimize, and build new use cases that lift assist and resolution rates.
• Give proactive status updates to customers and the internal team, and partner with SAMs and Implementation Managers to keep goals and timelines aligned.
• Participate in QBRs and EBRs to show progress and ensure customers are getting measurable value.
• Partner with Solutions Engineering on pre-sales demos, and pull in Professional Services Engineering for the most complex custom work.

Colombia
$40K - $54K / year