In this role, you will:
- Design Evaluation Frameworks: Architect statistical methodologies for safety-critical AI systems to form objective, rigorous conclusions about their performance and reliability.
- Conduct Robust Analysis: Deliver validation evidence to support increasingly complex operations and identify potential edge-case failures.
- Inform Strategy: Deliver clear, data-driven insights to development teams to guide system improvement, and to executive leadership to inform milestone-level go/no-go decisions.
- Define Metrics: Drive alignment across engineering teams on performance metrics and data extraction strategies.
- Lead the Lifecycle: Manage all phases of evaluation, including prototyping, requirements capture, design, implementation, and validation.
- Scale Pipelines: Partner with engineers to build and maintain scalable data processing and simulation pipelines, applying distributed computing to analyze petabytes of driving data.
Qualifications:
- MS or PhD in Statistics, Computer Science, Machine Learning, Applied Mathematics, or a related quantitative field
- Proficiency in Python and SQL, with experience writing production-quality code
- Demonstrated expertise in statistical methodologies, including hypothesis testing, power analysis, spatiotemporal modeling, Bayesian inference, and multivariate analysis
- Experience with large-scale data analysis and statistical modeling
- Proficiency with Git, unit testing, and collaborative development practices
Bonus Qualifications:
- Hands-on experience with production machine learning pipelines: dataset creation, training frameworks, and metrics pipelines
- Experience with modern data processing technologies such as Apache Spark, Spark SQL, and Databricks
- Experience designing metrics and delivering actionable insights that drive business decisions