A company is looking for a Staff Applied Scientist - AI Evaluation & Trust.

Key Responsibilities
- Lead the development of specialized judge models and evaluation frameworks
- Design and execute rigorous scoring pipelines and calibrations for agentic systems
- Own the full lifecycle of evaluation data and collaborate cross-functionally to translate statistical uncertainty into product signals

Required Qualifications
- 10+ years of Machine Learning experience, focusing on Deep Neural Networks and model evaluation
- 1-2+ years of experience in post-training activities
- 1+ year experience creating benchmarks for evaluating LLMs
- Deep expertise in LLM-as-judge architectures and statistical rigor in experimental design
- Proven ability to manage the path from data collection to production deployment