Machine learning classifiers for detection of abnormal clinical electroencephalography (EEG) signals have advanced significantly in recent years, largely supported by the carefully curated Temple University Hospital Abnormal EEG Corpus (TUAB). Further progress towards clinically useful tools is likely to require larger volumes of data. In this study, we explore the viability and benefits of fully automated labelling of clinical EEG recordings based on the text of the clinical report, in order to exploit larger existing databases efficiently. We apply a machine learning classifier to the text reports in the Temple University Hospital EEG Corpus (TUEG) in order to label individual recordings. We show that training a deep convolutional neural network on the resulting dataset yields advantages in classification performance, namely increased area under the receiver operating characteristic curve and state-of-the-art specificity, albeit with a notable reduction in sensitivity. By demonstrating the viability of automatic report-based labelling, this paper opens the prospect of efficiently utilising the huge amount of historical EEG data in global medical archives to enhance the training of machine learning classifiers, either for improved general performance or for bespoke training and evaluation on local populations.
Western, D., Weber, T., Kandasamy, R., May, F., Taylor, S., Zhu, Y., & Canham, L. (2022). Automatic report-based labelling of clinical EEGs for classifier training. In 2021 IEEE Signal Processing in Medicine and Biology Symposium (SPMB). https://doi.org/10.1109/SPMB52430.2021.9672295