We're looking for a Code Reviewer with deep Python expertise to review evaluations completed by data annotators assessing AI-generated Python code responses. Your role is to ensure that annotators follow strict quality guidelines related to instruction-following, factual correctness, and code functionality.
• Weekly Hours: Minimum of 20 hours per person
• Assignment Duration: Approximately 2 to 3 weeks (pilot phase)
• Start Date: Within one week
About the Company
SME is a platform that connects subject-matter experts with AI projects, enabling them to contribute their knowledge to improve AI models. It offers flexible opportunities to work on tasks like data labeling, quality assurance, and domain-specific problem-solving while earning competitive pay.
Responsibilities
• Review and audit annotator evaluations of AI-generated Python code.
• Assess whether the Python code follows the prompt instructions, is functionally correct, and is secure.
• Validate code snippets using proof-of-work methodology.
• Identify inaccuracies in annotator ratings or explanations.
• Provide constructive feedback to maintain high annotation standards.
• Work within Project Atlas guidelines for evaluation integrity and consistency.
Required Qualifications
• 5–7+ years of experience in Python development, QA, or code review.
• Strong knowledge of Python syntax, debugging, edge cases, and testing.
• Comfortable using code execution environments and testing tools.
• Excellent written communication and documentation skills.
• Experience working with structured QA or annotation workflows.
• English proficiency at CEFR level B2 or higher (B2, C1, C2, or native).
Preferred Qualifications
• Experience in AI training, LLM evaluation, or model alignment.
• Familiarity with annotation platforms.
• Exposure to RLHF (Reinforcement Learning from Human Feedback) pipelines.
Why Join Us?
Join a high-impact team working at the intersection of AI and software development. Your Python expertise will directly influence the accuracy, safety, and clarity of AI-generated code. This role offers remote flexibility, milestone-based delivery, and competitive compensation.