Annotation Data Scientist, Evaluation Integrity (Siri)

Join the Evaluation Integrity team to help build the trusted quality signal behind every Siri release. Design and run human-in-the-loop annotation projects to evaluate the quality and authenticity of agentic user personae, the validity of agent-to-agent conversations, and the reliability of LLM-as-judge and rule-based evaluators against Siri’s product specifications.

Requirements

  • Bachelor’s or Master’s degree in a quantitative or related field
  • 3+ years of hands-on experience working with human-annotated datasets or human-in-the-loop evaluation methodologies
  • 3+ years of experience using Python for data processing, analysis, and prototyping
  • Experience designing, implementing, and communicating annotation schemas, rubrics, or ontologies
  • Experience managing multiple concurrent dataset curation efforts
  • Experience specifying or designing custom annotation tooling in collaboration with software engineers

To apply for this job, please visit jobs.apple.com.

