Covid Prediction
Predicting COVID-19 Fatality using Machine Learning Classification Models
Objective
The objective of this study is to predict COVID-19 fatality in order to identify patients in greater need of medical treatment.
Material and Methods
Open-source data collated from Toronto Public Health and Toronto Open Data were used, covering cases over a 2-year period from January 2020 to December 2021. The predictor variables were gender, age group, whether the case was associated with an outbreak, the date on which the person contracted COVID-19, neighbourhood income, neighbourhood population density, whether the patient was hospitalised, whether the patient required ICU care, whether the patient required intubation, and the features obtained by one-hot encoding the source of infection. Supervised learning with logistic regression, random forest and gradient boosting models was used to predict fatality.
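The modelling setup described above can be sketched with scikit-learn. This is a minimal illustration and not the study's actual code: the data are synthetic, the column names (`age_group`, `source_of_infection`, `icu` and so on) are hypothetical, only a subset of the predictors is included, and the hyperparameters are arbitrary defaults.

```python
import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; column names are hypothetical, not the study's schema.
rng = np.random.default_rng(0)
n = 2000
cases = pd.DataFrame({
    "age_group": rng.choice(["<40", "40-69", "70-89", "90+"], size=n),
    "source_of_infection": rng.choice(
        ["community", "household", "outbreak", "travel"], size=n
    ),
    "neighbourhood_income": rng.normal(60000, 15000, size=n),
    "population_density": rng.normal(5000, 1500, size=n),
    "hospitalised": rng.integers(0, 2, size=n),
    "icu": rng.integers(0, 2, size=n),
})
# Synthetic label, loosely tied to ICU care and the oldest age group.
risk = 0.02 + 0.15 * cases["icu"] + 0.20 * (cases["age_group"] == "90+")
cases["fatal"] = (rng.random(n) < risk).astype(int)

categorical = ["age_group", "source_of_infection"]
numeric = ["neighbourhood_income", "population_density"]
binary = ["hospitalised", "icu"]

# One-hot encode categorical columns, scale numeric ones, pass binary flags through.
preprocess = ColumnTransformer([
    ("onehot", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("scale", StandardScaler(), numeric),
    ("keep", "passthrough", binary),
])

# The three model families named in the text, with arbitrary settings.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

X = cases.drop(columns="fatal")
y = cases["fatal"]
# Stratify so the rare fatal class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

fitted = {}
for name, model in models.items():
    # Each pipeline gets its own copy of the preprocessing step.
    pipeline = Pipeline([("preprocess", clone(preprocess)), ("model", model)])
    fitted[name] = pipeline.fit(X_train, y_train)
    print(name, "test accuracy:", round(fitted[name].score(X_test, y_test), 3))
```

Wrapping the preprocessing and the classifier in one pipeline means the encoder and scaler are fitted on the training split only, which avoids leaking information from the test split.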
Results
Data from 238 854 COVID-19 cases were analysed. The fatality rate for COVID-19 in Toronto was 1.5%. The fatality rate increased with age and exceeded 25% for patients older than 90 years. Patients who required admission to ICU had more than 10 times the risk of dying from COVID-19 compared with patients who did not require admission to ICU. Patients who lived in a lower-income neighbourhood were more likely to die from COVID-19 than patients who lived in a wealthier neighbourhood.
Scatter plot of neighbourhood income against population density: fatality rates of COVID-19 cases were higher in lower-income, population-dense neighbourhoods.
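Comparisons such as the ICU versus non-ICU fatality risk are relative risks: the fatality rate in one group divided by the rate in the other. The sketch below shows the arithmetic only. The counts are hypothetical placeholders, not figures from the Toronto data, and the function names are illustrative.

```python
def fatality_rate(deaths: int, cases: int) -> float:
    """Fraction of cases in a group that ended in death."""
    return deaths / cases


def relative_risk(deaths_a: int, cases_a: int, deaths_b: int, cases_b: int) -> float:
    """Fatality rate of group A divided by the fatality rate of group B."""
    return fatality_rate(deaths_a, cases_a) / fatality_rate(deaths_b, cases_b)


# Hypothetical counts for illustration only; not taken from the study.
icu = {"cases": 1000, "deaths": 200}
non_icu = {"cases": 99000, "deaths": 990}

rr = relative_risk(
    icu["deaths"], icu["cases"], non_icu["deaths"], non_icu["cases"]
)
print(f"ICU fatality rate:     {fatality_rate(icu['deaths'], icu['cases']):.1%}")
print(f"Non-ICU fatality rate: {fatality_rate(non_icu['deaths'], non_icu['cases']):.1%}")
print(f"Relative risk:         {rr:.1f}")
```

The same calculation applies to any pair of groups, for example the oldest age group against all other cases.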
Conclusion
Machine learning classification models (logistic regression, random forest and gradient boosting) can be used to predict fatality in patients with COVID-19. The models were highly accurate for patients who recovered from COVID-19. However, they performed less well for patients who eventually died from COVID-19.
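The gap between the two patient groups is easiest to see when performance is reported per class rather than as a single accuracy figure. The sketch below uses invented labels and predictions, sized so that 1.5% of cases are fatal to mirror the reported fatality rate. The predictions are not the study's results. One plausible reading, which the study does not state, is that with so few fatal cases a model can score high overall accuracy while still missing many deaths.

```python
def per_class_recall(y_true, y_pred, label):
    """Share of cases with the given true label that were predicted correctly."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


# Invented example: 1000 cases, 15 fatal (1 = died, 0 = recovered).
y_true = [1] * 15 + [0] * 985
# Invented predictions: 5 of the 15 deaths caught, 3 false alarms among recoveries.
y_pred = [1] * 5 + [0] * 10 + [1] * 3 + [0] * 982

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall_recovered = per_class_recall(y_true, y_pred, label=0)
recall_fatal = per_class_recall(y_true, y_pred, label=1)

print(f"Overall accuracy:       {accuracy:.3f}")
print(f"Recall, recovered (0):  {recall_recovered:.3f}")
print(f"Recall, fatal (1):      {recall_fatal:.3f}")
```

In this invented case the overall accuracy stays close to the recall for recovered patients, while the recall for fatal cases is far lower, which is why per-class metrics are the more informative summary for a rare outcome.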