A novel abnormality annotation database for COVID-19 affected frontal lung X-rays



Purpose:

To advance the use of chest X-rays (CXRs) as a viable solution for efficient COVID-19 diagnostics by providing large-scale annotations of the abnormalities in frontal CXRs in the BIMCV-COVID19+ database, and to provide a robust evaluation mechanism to facilitate its use.

Materials and Methods:

We annotate abnormalities in frontal CXRs from the existing BIMCV-COVID19+ database by drawing bounding boxes around them. We also define four protocols for robust evaluation of semantic segmentation and classification algorithms. Finally, we benchmark these protocols using popular deep learning models and report the results.
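As an illustration of how bounding-box annotations can serve as targets for semantic segmentation, the sketch below rasterizes a list of boxes into a binary mask. The box layout `(x_min, y_min, x_max, y_max)` in pixel coordinates and the function name `boxes_to_mask` are assumptions made for this example; they are not taken from the database's annotation schema.

```python
import numpy as np


def boxes_to_mask(boxes, height, width):
    """Rasterize bounding boxes into a binary mask of shape (height, width).

    Assumed (hypothetical) box layout: (x_min, y_min, x_max, y_max) in pixels,
    with x_max and y_max exclusive.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for x_min, y_min, x_max, y_max in boxes:
        # Clip each box to the image so out-of-range annotations do not fail.
        x0, x1 = max(0, int(x_min)), min(width, int(x_max))
        y0, y1 = max(0, int(y_min)), min(height, int(y_max))
        if x1 > x0 and y1 > y0:
            mask[y0:y1, x0:x1] = 1  # overlapping boxes simply merge
    return mask
```

Pixels covered by several boxes are marked once, so the mask represents the union of all annotated regions.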


Results:

For semantic segmentation, Mask-RCNN performs best among all the models, with a Dice score of 0.43 ± 0.01. For classification, MobileNetV2 yields the best results for both 2-class and 3-class classification. We also observe that deep models perform worse on the other classes than on the COVID class.
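For readers unfamiliar with the metric, the Dice score between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch for binary masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions of this example, not details taken from the paper.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)
```

A score of 1 means the two masks coincide exactly, and a score of 0 means they do not overlap at all.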


Conclusion:

By making the annotated data and protocols available to the scientific community, we aim to advance the use of CXRs as a viable solution for efficient COVID-19 diagnostics. This large-scale data can be used to train ML algorithms to learn the radiological patterns observed in COVID-19 patients. Further, the protocols will help ML practitioners evaluate their algorithms in a unified, large-scale manner.

For the full paper: https://www.medrxiv.org/content/10.1101/2021.01.07.21249323v2