Over the summer, I worked with Dr. Masino and his group, researching current machine learning explainability methods. Dr. Masino’s team has several projects that use machine learning in the medical domain. However, before physicians can begin to trust the predictions these models generate, the models must be more easily interpretable to the user. For this reason, I spent several weeks researching explainability methods and also applied one to a sepsis prediction project.
I spent the bulk of my time on a project meant to predict sepsis in infants. My focus was testing a machine learning explainability method called LIME (Local Interpretable Model-agnostic Explanations) on various prediction models that I built beforehand.
The datasets provided to me consisted of a variety of lab test results and medical history features. I first learned how to clean and visualize these datasets and how to create prediction models using various machine learning algorithms, such as logistic regression, random forests, and support vector machines. I used Python and scikit-learn to create these prediction models. I then began to research LIME more closely and learn how to implement it. LIME explains individual predictions made by any machine learning model. It works by generating a new dataset of perturbed samples around the instance being explained and recording the original model’s prediction for each sample. LIME then weights these samples by their proximity to the instance and trains a simple interpretable model, such as a linear model, on them to approximate the original model locally. The explanation it outputs consists of the contributions of the top features to that prediction, as estimated by this local model.
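The LIME procedure described above can be sketched in a few lines of Python. This is a minimal illustration using only NumPy, not the code from the sepsis project and not the implementation in the lime package, which handles sampling, discretization, and feature selection differently. The black_box function, the three unnamed features, and the values for the sampling scale and kernel width are all assumptions made for the example; in practice the black box would be a trained model’s probability output.

```python
import numpy as np


def black_box(X):
    """Hypothetical stand-in for a trained classifier's probability output.

    Depends nonlinearly on features 0 and 1 and ignores feature 2.
    """
    z = 2.0 * X[:, 0] - X[:, 1] ** 2
    return 1.0 / (1.0 + np.exp(-z))


def explain_instance(predict_fn, x, n_samples=5000, scale=0.5,
                     kernel_width=0.75, seed=0):
    """LIME-style local explanation of predict_fn at the single instance x.

    Returns the intercept and per-feature weights of a weighted linear
    surrogate fitted around x. All default values are illustrative.
    """
    rng = np.random.default_rng(seed)

    # 1. Generate perturbed samples around the instance being explained.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))

    # 2. Record the black-box prediction for each perturbed sample.
    y = predict_fn(Z)

    # 3. Weight samples by proximity to x with an exponential kernel.
    distances = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 4. Fit a weighted linear model (the interpretable surrogate) by
    #    weighted least squares on the offsets from x.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1:]


x = np.array([0.0, 1.0, 0.3])
intercept, feature_weights = explain_instance(black_box, x)

# Rank features by the magnitude of their local contribution.
ranking = np.argsort(-np.abs(feature_weights))
for i in ranking:
    print(f"feature_{i}: {feature_weights[i]:+.3f}")
```

The sign of each weight indicates whether increasing that feature pushes the prediction up or down near the chosen instance, and a weight close to zero suggests the feature has little local influence. Because the surrogate is fitted only around one instance, the weights can differ substantially from one prediction to the next.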
Working on this research project allowed me to learn how several machine learning algorithms function and how to use them to create prediction models. It was also interesting to learn more about why explainability matters in machine learning and what researchers have studied on this topic so far.