Agent Evals Specialist (Knowledge Graph Review)

Remote Full-time
A big part of Prox is AI agents that process complex technical documents into structured knowledge. The agents are right most of the time. When they're wrong, we need you to catch it.
You'll work inside a review platform we built. Each task shows you the source material, what the agent produced, and the steps it took to get there. You compare them and grade the agent's work.
The position is full-time (40+ hrs a week). Compensation is $5-10/hr.

What you do per task
1. Read the source and the agent's output side by side. Verify the content was captured accurately.
2. Review what the agent did: what it created, changed, or left out.
3. Score a short rubric covering accuracy, coverage, organization, and rule adherence. Full rubric provided at onboarding.
4. Write detailed feedback about any mistakes you find. This is the most important thing you produce, since we use it to improve the agent.
5. Submit. Move to the next task.

Conditions
• Subject matter shifts over time. You don't need prior knowledge of the subjects. You need to be able to compare two documents carefully and spot where they disagree.
• Rate is fixed for the engagement. If it changes, it goes up, and we tell you before your next task.
• Work product owned by Prox (work-for-hire).
• Standard NDA at offer stage.

Requirements
• Strong written English
• Can read dense technical content for hours without losing focus
• Consistent — your scoring on Monday matches your scoring on Friday
• Clear, specific feedback: "section 4 dropped the key requirement from page 17", not "this is confusing"
• Reliable on committed hours

Preferred
• Prior work as an AI trainer, tutor, or evaluator (Outlier, DataAnnotation, xAI, Surge, Mercor, Invisible, Toloka, etc.)
• Technical writing, editing, QA, translation, paralegal, or research-assistant background
• Markdown familiarity

The challenge below is the interview.

We don't do resume screens or vibe calls. Everyone who applies takes the same ~30 min prescreen challenge. You will receive your prescreen challenge link after submitting this job application. You will review a real agent output, score it, and write up your feedback. We read every submission.

If your submission is sharp, you start on paid tasks the same week after a short interview.

Good luck!
