Work Location: USA (remote)
Engagement Model: Freelancer / Independent Contractor
Weekly Workload: Up to 25 hours per week
DataForce is seeking skilled Software Engineers to join our team as Coding Annotators to support the development and evaluation of advanced AI models. This role focuses on creating high-quality coding prompts and answers, benchmarking model performance, and identifying failure cases across internal and competitor models. Candidates will contribute to building realistic evaluation environments and supporting reinforcement learning workflows.
Role Summary:
The Coding Annotator will be responsible for creating programming prompts and reference solutions aligned with industry benchmarks such as SWE-Bench and Terminal-Bench, and for testing model outputs to identify failure cases.
The annotator will also support reinforcement learning workflows by building and maintaining coding environments and executing coding-specific validation checks. This role does not involve quality-checking Annotator++ outputs; instead, it focuses on domain-specific evaluation, benchmarking, and technical analysis to surface model limitations and performance insights.
Key Responsibilities: