An Analysis of On-road and Virtual Driving Data in Ohio


Project Summary

                This summer, I worked with Professor Adi Wyner from the statistics department on analyzing a set of virtual driving simulation data collected as part of the state of Ohio's driver licensing workflow. First-time license applicants took the driving simulation before their on-road driving test, and the results of their on-road tests have been linked to the simulation data. The goal was to use various statistical methods to gain a deeper understanding of the data, as well as to build models that predict a driver's on-road performance based on their virtual driving test data.

                Through this project, I greatly expanded my understanding of how statistical models are used to make predictions. I learned to use backward elimination with logistic regression models for variable selection, machine learning methods for building models, ROC curves and AUC for evaluating model performance, and more. In addition, since all of my work was conducted in R, the project gave me an opportunity to sharpen my programming skills.

                Because the project is a collaboration between several institutions, I was able to exchange knowledge with CHOP and the simulation startup company through bi-weekly meetings. These small meetings gave me insight into what our research should focus on, and how our results could be used to improve the current driver licensing system in order to save more lives on the road.

                The project taught me to apply theoretical knowledge to develop a systematic data analysis tool that can be used on future incoming data. I am looking forward to continuing to work with Professor Wyner during the school year, and to possibly pursuing a major in statistics.