Deep Learning for Healthcare Final Project
For Deep Learning for Healthcare, the objective of the final project was to reproduce a published deep learning model. I found this one of the most interesting projects of the course and learned a great deal about building and testing deep learning models. The project is based on the research of Batuhan Bardak and Mehmet Tan in the paper "Improving Clinical Outcome Predictions Using Convolution over Medical Entities with Multimodal Learning", which focuses on improving predictions of mortality and length of stay for patients using Electronic Health Records. Because these records tend to be sparse and high-dimensional, the authors created a model that combines Named Entity Recognition (NER) features extracted from clinical notes with time-series features to improve prediction accuracy.
- Overview
The intent of the paper was to improve predictions for two important clinical outcomes: in-hospital mortality and length of stay. By extracting medical entities from clinical notes with Named Entity Recognition (NER) and applying a CNN over those entities alongside the time-series data, the authors were able to improve these predictions. The goal of this project was to reproduce the experiments from the paper and then perform ablation studies to determine how sensitive the model is to changes in its individual components.
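To make the idea concrete, here is a minimal sketch of the architecture as I understand it: a 1D convolution over sequences of medical-entity embeddings, fused with a recurrent encoder over the time-series features. The layer sizes, input shapes, and names here are my assumptions for illustration, not the authors' exact configuration.

```python
from tensorflow.keras import layers, Model

# Assumed shapes: 24 hourly steps x 104 vital/lab features (time series),
# and up to 500 extracted entities, each as a 100-d embedding.
ts_in = layers.Input(shape=(24, 104), name="time_series")
ner_in = layers.Input(shape=(500, 100), name="entity_embeddings")

ts_enc = layers.GRU(128)(ts_in)                        # time-series encoder
conv = layers.Conv1D(64, kernel_size=3, activation="relu")(ner_in)
ner_enc = layers.GlobalMaxPooling1D()(conv)            # convolution over entities

merged = layers.concatenate([ts_enc, ner_enc])         # multimodal fusion
out = layers.Dense(1, activation="sigmoid", name="mortality")(merged)

model = Model([ts_in, ner_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```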
- Scope of Reproducibility
My goal was to run the model as published to establish a baseline, then implement ablations that changed the learning rate, recurrent cell type, number of layers, and loss function to see how performance was affected.
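A sketch of how I organized the ablation runs, where each run overrides one factor against the baseline; the names and values below are illustrative rather than the exact settings from my runs.

```python
# Each config varies one factor relative to the baseline.
ablations = [
    {"name": "baseline",   "lr": 1e-3, "rnn": "GRU",  "layers": 1, "loss": "binary_crossentropy"},
    {"name": "low_lr",     "lr": 1e-4, "rnn": "GRU",  "layers": 1, "loss": "binary_crossentropy"},
    {"name": "lstm",       "lr": 1e-3, "rnn": "LSTM", "layers": 1, "loss": "binary_crossentropy"},
    {"name": "deeper",     "lr": 1e-3, "rnn": "GRU",  "layers": 2, "loss": "binary_crossentropy"},
    {"name": "focal_loss", "lr": 1e-3, "rnn": "GRU",  "layers": 1, "loss": "focal"},
]

for cfg in ablations:
    print(f"running {cfg['name']}: {cfg}")
    # each config was then fed to the training code, e.g.
    # train(cfg)  # hypothetical wrapper around the original training loop
```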
- Methodology
While I was able to reuse the original authors' code, I did have to update some libraries and adjust some function calls in order for the code to run smoothly on current versions. I also had to correct some variables in the code, as might be expected for code written even just a couple of years ago. Overall, the effort to get the code running was minimal, though I had to learn libraries I had not worked with before and understand how the code flowed. Once it was running, adding the ablations was straightforward.
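As illustrative examples of the kind of updates involved (not an exhaustive list of the actual edits): standalone Keras imports moved under `tensorflow.keras`, and some deprecated aliases and methods were removed in newer NumPy and pandas releases.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"hr": [80, 82]})  # toy frame for illustration
n = 10

# Before (older code):
#   from keras.layers import GRU, Dense
#   values = df.as_matrix()
#   mask = np.zeros(n, dtype=np.bool)

# After (current libraries):
from tensorflow.keras.layers import GRU, Dense   # Keras now lives under tensorflow
values = df.to_numpy()          # DataFrame.as_matrix() was removed in pandas 1.0
mask = np.zeros(n, dtype=bool)  # the np.bool alias was removed in NumPy 1.24
```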
- Implementation Details
The tools used to run the project were Jupyter notebooks, Keras, scikit-learn, TensorFlow, and Python. The dataset was the MIMIC-III database made available through the university. I was able to use a powerful computer at home with an EVGA 3080 Ti FTW GPU, a 24-core Ryzen Threadripper 3960X, and 32 GB of RAM; each full run of the models took about a day of continuous computation to complete. The models included in the research were a time-series baseline, a multimodal baseline, and the proposed CNN architecture.
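For reference, a minimal sketch of what the time-series baseline looks like: a recurrent encoder over the vitals/labs followed by a sigmoid output for mortality. The shapes and sizes are my assumptions, not the exact baseline from the repository.

```python
from tensorflow.keras import layers, Sequential

baseline = Sequential([
    layers.Input(shape=(24, 104)),   # assumed: 24 time steps x 104 features
    layers.GRU(256),                 # recurrent encoder over the time series
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),
])
baseline.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```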
- Conclusion
In conclusion, many of my results were similar in performance to the paper's, with the ablations making the model perform somewhat worse, as expected. I experimented with GRU and LSTM recurrent cells and with focal loss as an alternative loss function; a sketch of the focal loss follows below.
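The binary focal loss (Lin et al., 2017) down-weights easy examples so training focuses on hard, often minority-class, cases. A sketch of the version I swapped in for binary cross-entropy; the alpha and gamma values shown are the common defaults, not necessarily the ones used in my runs.

```python
import tensorflow as tf

def focal_loss(gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t)."""
    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        eps = tf.keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        # p_t is the model's predicted probability of the true class
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        # (1 - p_t)^gamma down-weights well-classified examples
        return -tf.reduce_mean(alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
    return loss

# usage: model.compile(optimizer="adam", loss=focal_loss(), metrics=["AUC"])
```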