Abstract
The coronavirus disease 2019 (COVID-19) outbreak has resulted in countless infections and deaths worldwide, posing increasing challenges for the health care system. Artificial intelligence–assisted diagnosis not only achieves high accuracy but also saves time and effort during a sudden outbreak, when doctors and medical equipment are scarce. This study proposed a weakly supervised COVID-19 classification network (W-COVNet). This network was divided into three main modules: a weakly supervised feature selection module (W-FS), a deep learning bilinear feature fusion module (DBFF) and a Grad-CAM++ based network visualization module (Grad-Ⅴ). The first module, W-FS, mainly removed redundant background features from computed tomography (CT) images, performed feature selection and retained the core feature regions. The second module, DBFF, mainly used two symmetric networks to extract different features and thus obtain rich complementary features. The third module, Grad-Ⅴ, allowed the visualization of lesions in unlabeled images. A fivefold cross-validation experiment showed an average classification accuracy of 85.3%, and a comparison with seven advanced classification models showed that our proposed network performed better.
1. Introduction
Respiratory infections are a major cause of disease and death worldwide. Pneumonia affects millions of people every year, posing a significant risk to children, adults aged 65 years and older, and individuals with health problems such as diabetes, obesity and hypertension. Pneumonia has more than 30 different causes, but its main etiological factors are viruses and bacteria. Coronavirus disease 2019 (COVID-19) spread rapidly, resulting in a large number of deaths among those infected and affecting economies and health systems worldwide [1]. Although a large number of people have been vaccinated, 6,833,388 deaths had been reported by February 10, 2023 [2]. Computed tomography (CT) images of the chest are usually used to determine the exact area and size of a lesion. At present, CT diagnosis relies heavily on physicians' skills. Even after long clinical training and professional guidance, radiologists can make mistakes because of the complex pathology and subtle textural changes in images of different lung lesions. Therefore, developing intelligent classification methods for CT images to support disease identification is extremely important. These classification methods are usually of three types: 1) attention-guided methods; 2) methods using convolutional neural networks; and 3) label correlation methods. Attention-guided methods usually direct attention to the primary lesion region but neglect suspicious regions. Methods using convolutional neural networks extract different features mainly by changing the network structure, but they cannot reject background information, and the acquired features are redundant. Label correlation methods can capture the connections between multiple labels and their interdependencies, but the images often contain noisy information that impairs multi-label learning. Therefore, this study examined images after removing the background, effectively avoiding interference from background information.
In recent years, artificial intelligence and swarm intelligence optimization algorithms have been increasingly used to solve biological problems [3,4], perform disease diagnosis [5,6,7] and address combinatorial optimization problems [8]. The rapid development of deep learning has enabled its extensive use in medical image processing for lesion segmentation or detection [9,10,11,12], disease classification [13], noise reduction [14], image annotation [15], registration, regression and so forth. Continuous improvements in computer hardware and software and the emergence of more powerful neural network models have made deep learning models increasingly capable of extracting image features for better image classification. In this study, we applied deep learning to the COVID-19 CT classification task.
Deep learning has developed rapidly in recent years in the field of artificial intelligence and has played an important role in diagnosing COVID-19, with most researchers using it for intelligent diagnosis [16,17,18]. Wang et al. [19] proposed a model based on the exponential linear unit (ELU) called ELUCNN and developed a mobile application based on it. The backbone of this model was a 10-layer CNN, which could help diagnose COVID-19. The experimental results showed that ELUCNN performed better than 14 other advanced methods. Zhang et al. [20] proposed a deep learning model, SNELM, using SqueezeNet as the backbone network combined with an extreme learning machine as the classifier; ten cross-validation runs on two datasets showed that the model outperformed seven state-of-the-art classification networks. Zhang et al. [21] proposed a new deep learning network called ANC, which could visualize Gradient-weighted Class Activation Mapping (Grad-CAM) heat maps for deep learning networks and expanded the training set using 18 data augmentation methods. Validation on two COVID-19 classification datasets showed that this model performed better than nine currently popular classification networks. Lai et al. [22] used the segmentation network U-Net, which performs well in segmentation but has seen little application in classification, for disease diagnosis and achieved 94.1% accuracy in diagnostic experiments on 1000 COVID-19 images. Irmak et al. [23] proposed two new CNNs as backbone networks: one performed COVID-19 versus non-COVID-19 binary classification, and the other performed triple classification. Experiments on COVID-19 chest x-ray images demonstrated their effectiveness. All the aforementioned studies used the whole image as the object of study, and most used a single network to extract features, without removing redundant data and extracting only a single set of features. In lung disease classification, the lesion region is usually located in the central region of the image, and the noisy background area at the edges is extremely large, reducing the efficiency and accuracy of feature extraction. The CT images of patients with COVID-19 usually show ground glass–like changes, and the size of the lesion region varies with the disease, which requires removing the background and focusing on the lesion region. Hence, we performed weakly supervised preprocessing of the data. Given the strong feature extraction capability of deep learning, we also used two identical networks to extract complementary features and then combined them bilinearly to obtain better classification results. The main contributions of the method proposed in this study are as follows:
① A weakly supervised COVID-19 classification network (W-COVNet) was proposed, which was mainly divided into three modules: weakly supervised feature selection module (W-FS), deep learning bilinear feature fusion module (DBFF) and Grad-CAM++ based network visualization module (Grad-Ⅴ).
② W-FS mainly learned lung area features by training on Data2 to obtain a segmentation model and then performed feature selection on Data1 using that model. The new dataset Data1-Seg was obtained through repeated segmentation trials, selecting the best result. Five multiple-way data augmentation (MDA) methods were used to expand the training set.
③ DBFF mainly used transfer learning with the lightweight VGG network as the backbone. It used two identical networks focusing on different regions to extract different features for bilinear fusion, improving the feature extraction capability. Four DBFF schemes were also proposed.
④ Grad-Ⅴ focused on the visual display of the deep learning network using Grad-CAM++. With this module, it was possible to show whether the network accurately identified focal areas, providing a favorable interpretation of the classification results.
⑤ The proposed network, W-COVNet, was compared with the seven most popular current classification networks using a fivefold cross-validation method on the same dataset, and the results showed that W-COVNet outperformed the other networks.
This manuscript is structured as follows. Section 1 describes diagnostic methods and deep learning–related research on COVID-19. Section 2 discusses recent studies on COVID-19 diagnosis. Section 3 introduces the weakly supervised dataset preprocessing method and proposes the classification network and the Grad-CAM++ visualization method. Section 4 explains the specific experimental steps, methods and results and compares them with other methods. Section 5 discusses and analyzes the results obtained from the model, and conclusions are presented at the end.
2. Related works
Research on the intelligent diagnosis of diseases can help doctors accurately diagnose various conditions. Xie et al. [24] found that aberrations in certain genes affect adjacent genes, leading to the development of cancer; since the small size of cancer sample sets makes cancer-causing genes difficult to detect, they proposed a concept learning model to explore gene characteristics effectively via a dimensionality reduction process. Huang et al. [25] proposed an algorithm based on BI-RADS and CART that used a decision tree to classify breast ultrasound data and distinguish malignant from benign tumors, solving the problems of CAD systems being unable to handle data from different sources and offering limited transparency to physicians. Because AI models are often not interpretable, AI-based drug recommendations are less credible; Xi et al. [7] therefore introduced a traceability-rate evaluation metric in an explainable AI drug recommendation model, improving its performance. Dong et al. [26] proposed MorbidGCN, a GCN-based multimorbidity prediction network that accounts for the differences between patients with single and multiple diseases and the relationships among multiple diseases. The network integrated population phenotype information with disease network information and performed well on two large datasets.
Usually, image classification and recognition are performed using global features. For example, Li and Liu [27] combined wavelet packet Tsallis entropy, a feedforward neural network and real-coded biogeography-based optimization (RCBBO) for pathological brain examination and found optimal weights and biases. Fulton et al. [28] built a machine learning model using ResNet-50 to classify Alzheimer's disease, with a prediction accuracy of 99% for triple classification on the validation set and 99.34% on the training set.
However, the size of the diseased area differs across diseases and patients, and its location is difficult to predict, so a large number of noisy regions exist when global images are used for feature extraction. If the lesion area can be located accurately, its features can be extracted more easily, improving classification accuracy. Two approaches can therefore be considered: first, methods such as attention mechanisms can guide the network to focus on the lesion area; second, useless background areas can be removed so that the network narrows its focus and acquires valid features more effectively. Guan et al. [29] proposed a three-branch attention-guided convolutional neural network (AG-CNN) with ResNet50 as the backbone, which first acquired a mask of the focal area under the guidance of an attention heat map, cropped it out as the branch feature extraction region and fused the global and local features; the final result on the ChestX-ray14 dataset was an average area under the curve (AUC) of 0.841. Ypsilantis et al. [30] mimicked the human visual attention mechanism using recurrent attention. Pesce et al. [31] proposed two network architectures, one consisting of a recurrent attention model and the other using weak labels and manually drawn bounding boxes, which could accurately localize lung nodules. Toğaçar et al. [32] fused features from SqueezeNet and MobileNetV2 and used a new feature selection and combination method, social mimic optimization; the classification accuracy for COVID-19 reached 100% using a support vector machine for classification, and the model could be deployed on mobile devices. Cohen et al. [33] proposed a COVID-19 severity network (CSSNet): seven large non-COVID-19 chest x-ray datasets were first used for pretraining to obtain correlated features, and the severity of infection was then evaluated on a small number of COVID-19 datasets. Ni et al. [34] used their proposed NiNet to segment and detect lesion regions on CT images of a few patients with COVID-19. Ko et al. [35] used transfer learning to find the best backbone among four commonly used deep learning networks; the proposed deep learning framework FCONet tri-classified the enhanced images with an accuracy of 96.97%. Wang et al. [36] first obtained a lung segmentation model in an unsupervised manner and then proposed a new lightweight method for the weakly supervised detection of COVID-19. Khan et al. [37] fused two different datasets, pretrained them using transfer learning, and then performed COVID-19 classification and detection with 98.2% accuracy for triple classification and 95% accuracy for quadruple classification; they named the proposed network CoroNet. Hussain et al. [38] used fivefold cross-validation for binary classification of COVID-19 and normal CXR images, with accuracy, sensitivity and specificity all reaching 100%.
The aforementioned studies on lung segmentation mainly focused on x-ray images, with fewer studies on the segmentation of CT images. Most new methods for COVID-19 classification extracted global features, suffering from feature redundancy. In this study, a weakly supervised method was used to obtain the lung fields of COVID-19 CT images and perform feature selection. We improved the VGG deep learning network and proposed a new bilinear method to obtain effective features.
3. Methodology
3.1. Dataset and preprocessing
Two main datasets were used in this study, both obtained from open-source websites. The first dataset (Data1) contained 397 non-COVID-19 CT images and 349 COVID-19 CT images and was validated by radiologists at Tongji Hospital in Wuhan, China [39]. The second dataset (Data2), COVID-19 CT segmentation [40], had 100 axial CT images from more than 40 patients. This dataset had labeled COVID-19 lung CT masks and was mainly used for effective semi-supervised lung region segmentation. All its CT images were collected by the Italian Society of Medical and Interventional Radiology.
In this study, we introduced the Inf-Net [41] segmentation model. The backbone of this model was Res2Net, which could be trained on a small number of labeled images via a reverse attention mechanism with explicit edge attention. It could effectively perform lesion segmentation and showed the best performance compared with five currently popular segmentation networks; six metrics were used for performance evaluation on the COVID-19 lesion segmentation dataset, namely, Dice, Specificity (Spec.), Sensitivity (Sen.), Structure Measure (S_α), Enhanced-alignment Measure (E_φ^mean) and Mean Absolute Error (MAE) (Table 1). As lung segmentation was required in this study, the first step was to obtain edge images from the ground truth (GT) maps of the lung regions in Data2, which carries lung region annotations. The second step trained Inf-Net on the GT maps together with the edge maps to obtain a lung segmentation model. The third step applied the semi-supervised segmentation network Semi-Inf-Net [41], using the learned lung segmentation parameters, to segment the lungs in Data1. In the fourth step, because some bias existed in the training process and some images had poor lung segmentation, as shown in Figure 1, the weakly supervised feature selection module (W-FS) was proposed, and only features within the lung region were retained. Lung region segmentation was run on Data1 three times, and the image with the best result was selected for data optimization. Finally, a new dataset of complete lung region images, Data1-Seg, was obtained.
Table 1. Performance comparison of Inf-Net with five popular segmentation networks [41].
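To make the feature selection step concrete, the sketch below shows one way the lung mask predicted by the segmentation network could be applied to a CT slice so that only the lung region is retained. The file paths and the 0.5 binarization threshold are illustrative assumptions, not values from the original pipeline.

```python
import numpy as np
from PIL import Image

def apply_lung_mask(ct_path: str, mask_path: str, threshold: float = 0.5) -> Image.Image:
    """Keep only the lung region of a CT slice using a predicted segmentation mask.

    `mask_path` is assumed to hold the probability map produced by the
    segmentation network (stored as grayscale values in [0, 255]);
    `threshold` is an illustrative binarization cutoff, not a value from the paper.
    """
    ct = np.asarray(Image.open(ct_path).convert("L"), dtype=np.float32)
    mask = np.asarray(Image.open(mask_path).convert("L"), dtype=np.float32) / 255.0
    binary = (mask >= threshold).astype(np.float32)   # 1 inside the lungs, 0 elsewhere
    masked = ct * binary                              # zero out the background
    return Image.fromarray(masked.astype(np.uint8))

# Example: build one image of the Data1-Seg dataset (paths are hypothetical).
# lung_only = apply_lung_mask("data1/covid_001.png", "preds/covid_001_mask.png")
```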
Data1-Seg contained two types of CT images: COVID-19 and non-COVID. We resized the image set D1 of Data1-Seg to a consistent size of 224 × 224 to obtain the new image set D2, as shown in Eq (1).

$$D_2 = \mathrm{Resize}(D_1, [224, 224]) = \{d_1^1, d_1^2, d_1^3, \ldots, d_1^n\} \tag{1}$$
The dataset was randomly divided into two subsets by the random hold-out (RHO) method: the testing set (X: 20%) and the training set (Y: 80%). The data partitioning is shown in Table 2, and the sizes of these subsets satisfy Eq (2). To address the issue of limited data, the MDA [42] technique was applied to expand the data and reduce overfitting: five data augmentation methods were applied to the training set, expanding each original image into six images.
$$|D_2| \xrightarrow{\mathrm{RHO}} |X| + |Y| = |\{x_1, \ldots, x_i\}| + |\{y_1, \ldots, y_j\}| \tag{2}$$

where i is the number of images in X (the testing set), j is the number of images in Y (the training set), and |·| denotes the cardinality of a set.

Table 2. Distribution of the Data1-Seg dataset in the model.
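As a minimal illustration of Eqs (1) and (2), the following sketch resizes the images to 224 × 224 and applies the 80/20 random hold-out split; the directory layout, the label rule and the use of scikit-learn's train_test_split for the RHO step are our assumptions.

```python
from pathlib import Path
from PIL import Image
from sklearn.model_selection import train_test_split

# Eq (1): resize every image of Data1-Seg to a consistent 224 x 224.
paths = sorted(Path("Data1-Seg").glob("*.png"))                  # hypothetical layout
labels = [1 if "covid" in p.name.lower() else 0 for p in paths]  # 1: COVID-19, 0: non-COVID
d2 = [Image.open(p).convert("RGB").resize((224, 224)) for p in paths]

# Eq (2): random hold-out (RHO) split, |Y| (train) : |X| (test) = 80 : 20.
y_train, x_test, lab_train, lab_test = train_test_split(
    d2, labels, test_size=0.20, shuffle=True, random_state=42)
print(len(y_train), len(x_test))  # |Y| + |X| = |D2|
```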
It was assumed that K_MDA MDA techniques K = {K1, K2, K3, K4, K5} were used; in this study, these comprised gamma correction, rotation, noise injection, brightening and darkening. Each MDA technique generated N_tra training set images, so K_MDA × N_tra images were generated in total. The following five MDA techniques were mainly used in this study:
① Gamma correction (Ga_Co)
The gamma correction factor rG_C = 1.5 was used to produce new images.
The original and augmented images were then concatenated to form the final training set, as shown in Eq (9).

$$y^{(i)} \xrightarrow{\mathrm{MDA}} \mathrm{concat}\left[y^{(i)}, \vec{y}_K^{(i)}\right] \tag{9}$$

where $\vec{y}_K^{(i)}$ denotes the concatenation of the five MDA results, so that the resulting training set consists of the original and augmented images.
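The five MDA operations can be sketched as follows. The gamma factor of 1.5 comes from the text; the rotation angle, noise level and brightness factors are illustrative assumptions.

```python
import numpy as np
from PIL import Image, ImageEnhance

def gamma_correction(img: Image.Image, gamma: float = 1.5) -> Image.Image:
    """Ga_Co with r_G_C = 1.5, as stated in the text."""
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return Image.fromarray((255.0 * arr ** gamma).astype(np.uint8))

def rotate(img, angle=10):                      # rotation angle is an assumption
    return img.rotate(angle)

def noise_injection(img, sigma=10.0):           # Gaussian noise; sigma is assumed
    arr = np.asarray(img, dtype=np.float32)
    noisy = np.clip(arr + np.random.normal(0.0, sigma, arr.shape), 0, 255)
    return Image.fromarray(noisy.astype(np.uint8))

def brighter(img, factor=1.2):                  # brightness factor is assumed
    return ImageEnhance.Brightness(img).enhance(factor)

def darker(img, factor=0.8):                    # darkness factor is assumed
    return ImageEnhance.Brightness(img).enhance(factor)

MDA = [gamma_correction, rotate, noise_injection, brighter, darker]

def augment(img: Image.Image) -> list:
    """Eq (9): one training image becomes six (original + five MDA results)."""
    return [img] + [k(img) for k in MDA]
```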
3.3. Bilinear feature extraction
Transfer learning uses old knowledge to learn new knowledge; its main goal is to transfer learned knowledge to a new domain quickly. In this study, we transferred a network pretrained on ImageNet, which reduced the number of network parameters and allowed fast convergence. We used VGG, a simple and robust network with four main variants, VGG11, VGG13, VGG16 and VGG19, for transfer learning and conducted experiments on each of them. Figure 2 shows the structure of VGG16.
The symmetric complementary bilinear classification network had two feature extractors, and the two sets of extracted features were locally complementary for fine-grained images with scattered lesion areas, such as COVID-19. Bilinear multiplication was then performed to combine the features efficiently and obtain richer features. Figure 3 shows the symmetric complementary bilinear module using two VGG networks.
Figure 3. Flow chart of the symmetric complementary bilinear module.
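A minimal PyTorch sketch of the symmetric bilinear idea is shown below: two ImageNet-pretrained VGG16 feature extractors produce feature maps whose channel-wise outer product is pooled over spatial locations and fed to a classifier. The signed square root and L2 normalization follow common bilinear-CNN practice and are assumptions here; the exact fusion details of DBFF may differ.

```python
import torch
import torch.nn as nn
from torchvision import models

class BilinearVGG(nn.Module):
    """Symmetric bilinear fusion of two VGG16 streams (a sketch, not the exact DBFF)."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Two identical, ImageNet-pretrained extractors; they diverge during fine-tuning.
        self.stream_a = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
        self.stream_b = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features
        self.fc = nn.Linear(512 * 512, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        fa = self.stream_a(x)                     # (N, 512, 7, 7) for 224 x 224 input
        fb = self.stream_b(x)
        n, c, h, w = fa.shape
        fa = fa.reshape(n, c, h * w)
        fb = fb.reshape(n, c, h * w)
        # Outer product of the two feature sets, pooled over spatial positions.
        bilinear = torch.bmm(fa, fb.transpose(1, 2)) / (h * w)   # (N, 512, 512)
        bilinear = bilinear.reshape(n, -1)
        # Signed square root and L2 normalization (standard bilinear-CNN practice).
        bilinear = torch.sign(bilinear) * torch.sqrt(bilinear.abs() + 1e-10)
        bilinear = nn.functional.normalize(bilinear)
        return self.fc(bilinear)

# model = BilinearVGG(num_classes=2)
# logits = model(torch.randn(2, 3, 224, 224))
```

Because the two streams start identical but are updated independently during fine-tuning, they come to attend to different regions, which is what yields the complementary features described above.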
Deep learning models are not interpretable and are usually understood as black boxes. Grad-CAM (Gradient-weighted Class Activation Mapping) [43] is a heat map technique that can be loaded with the network parameters and model structure to visualize deep learning networks. Usually, the network feature maps are processed using global average pooling (Figure 4), and each feature map contains information about different classes.

In Grad-CAM, the weights are calculated by globally averaging the gradients of the feature maps. Features become more accurate in the deeper layers of the network, so Grad-CAM usually visualizes the features of the last convolutional layer, as shown in Eq (10).
$$\omega_k^c = \frac{1}{A} \sum_i \sum_j \frac{\partial g^c}{\partial B_{ij}^k} \tag{10}$$
where $\omega_k^c$ denotes the weight of the kth feature map for category c, obtained from the gradients; A denotes the number of pixels in the feature map; $g^c$ denotes the score of category c; and $B_{ij}^k$ denotes the pixel value at coordinate (i, j) in the kth feature map.
The heat map is obtained by first computing the category weights of all the feature maps and then performing a weighted summation, as shown in Eq (11).
$$L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\left(\sum_k \omega_k^c B^k\right) \tag{11}$$
where $B^k$ denotes the kth feature map. Grad-CAM++ [44] is more effective than Grad-CAM, mainly because its weights are computed in a more elaborate and accurate way, giving more precise localization.
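Eqs (10) and (11) can be implemented directly with PyTorch hooks. The sketch below computes a Grad-CAM heat map from the last convolutional layer of a VGG16; Grad-CAM++ differs only in how the weights are computed.

```python
import torch
import torch.nn.functional as F
from torchvision import models

def grad_cam(model, x, class_idx, target_layer):
    """Compute the Grad-CAM map of Eqs (10)-(11) for one input image x (1, 3, H, W)."""
    feats, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))

    score = model(x)[0, class_idx]      # g^c: score of category c
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()

    B = feats[0].squeeze(0)             # feature maps B^k, shape (K, H', W')
    dg = grads[0].squeeze(0)            # gradients dg^c / dB^k_ij
    w = dg.mean(dim=(1, 2))             # Eq (10): global average of the gradients
    cam = F.relu((w[:, None, None] * B).sum(dim=0))   # Eq (11)
    cam = cam / (cam.max() + 1e-10)     # normalize to [0, 1] for display
    return cam.detach()

# model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
# cam = grad_cam(model, torch.randn(1, 3, 224, 224), class_idx=0,
#                target_layer=model.features[28])   # last conv layer of VGG16
```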
3.5. Proposed approach
In this study, we proposed a weakly supervised COVID-19 classification network (W-COVNet), which consisted of three main parts: Part 1, the weakly supervised feature selection module, which was mainly used for image lung segmentation; Part 2, the deep learning bilinear feature fusion module; and Part 3, the Grad-CAM++ based network visualization module. W-COVNet could perform feature selection and focus on effective regions, thus improving classification accuracy while visualizing focal areas. Figure 5 shows the overall framework. To elaborate further, pseudo code for the W-COVNet algorithm is given in Algorithm 1.
The experiments were run in a Linux environment on an NVIDIA DGX Station deep learning workstation equipped with a 32 GB Tesla V100 graphics card. The complete code, including data preprocessing and the algorithm implementation, was written in Python using libraries such as NumPy and the deep learning framework PyTorch. The tuned hyper-parameters of the model were the number of epochs, learning rate (LR), batch size (BS) and dropout rate (DR). The best experimental results were obtained with epochs = 50, LR = 0.003, BS = 8 and DR = 0.4.
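Under the reported hyper-parameters, the training setup could look like the sketch below; the Adam optimizer and cross-entropy loss are assumptions, as the text fixes only the epochs, LR, BS and DR, and BilinearVGG refers to the earlier sketch.

```python
import torch
import torch.nn as nn

model = BilinearVGG(num_classes=2)                        # from the earlier sketch
model.fc = nn.Sequential(nn.Dropout(p=0.4), model.fc)     # DR = 0.4

optimizer = torch.optim.Adam(model.parameters(), lr=0.003)  # LR = 0.003; optimizer assumed
criterion = nn.CrossEntropyLoss()                           # loss function assumed
EPOCHS, BATCH_SIZE = 50, 8                                  # reported best values

# for epoch in range(EPOCHS):
#     for images, targets in train_loader:   # DataLoader(..., batch_size=BATCH_SIZE)
#         optimizer.zero_grad()
#         loss = criterion(model(images), targets)
#         loss.backward()
#         optimizer.step()
```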
4.2. Classification performance
We used various metrics on the testing set, namely, accuracy (Acc), precision (Pre), sensitivity (Sen), specificity (Spe), recall (Rec) and F1-score (F1-sc), to evaluate the performance of the proposed W-COVNet method. The corresponding equations are as follows.

$$\mathrm{Acc} = \frac{TP(Y) + TN(Y)}{TP(Y) + TN(Y) + FP(Y) + FN(Y)} \tag{12}$$

$$\mathrm{Pre} = \frac{TP(Y)}{TP(Y) + FP(Y)} \tag{13}$$

$$\mathrm{Sen} = \frac{TP(Y)}{TP(Y) + FN(Y)} \tag{14}$$

$$\mathrm{Spe} = \frac{TN(Y)}{TN(Y) + FP(Y)} \tag{15}$$

$$\mathrm{Rec} = \frac{TP(Y)}{TP(Y) + FN(Y)} \tag{16}$$

$$F_{1\text{-}sc} = \frac{2 \cdot \mathrm{Pre}(Y) \cdot \mathrm{Rec}(Y)}{\mathrm{Pre}(Y) + \mathrm{Rec}(Y)} \tag{17}$$
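Given the confusion-matrix counts on the testing set Y, Eqs (12)–(17) reduce to a few lines (the example counts in the comment are hypothetical):

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Eqs (12)-(17); note that sensitivity and recall coincide for binary labels."""
    acc = (tp + tn) / (tp + tn + fp + fn)       # Eq (12)
    pre = tp / (tp + fp)                        # Eq (13)
    sen = tp / (tp + fn)                        # Eq (14)
    spe = tn / (tn + fp)                        # Eq (15)
    rec = sen                                   # Eq (16)
    f1 = 2 * pre * rec / (pre + rec)            # Eq (17)
    return {"Acc": acc, "Pre": pre, "Sen": sen, "Spe": spe, "Rec": rec, "F1": f1}

# Example with hypothetical counts:
# print(classification_metrics(tp=60, tn=70, fp=9, fn=10))
```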
For the four variants of VGG, we proposed four network schemes with VGG11, VGG13, VGG16 and VGG19 as the backbone networks, as shown in Table 3; a histogram of the results is shown in Figure 6. Accuracy is the ratio of correctly predicted samples to all predicted samples. Precision is the proportion of samples predicted to be positive that actually are positive. A larger sensitivity means more sick cases are judged as sick and fewer are missed (FN); a larger specificity means more healthy cases are judged as healthy and fewer are falsely detected (FP). The F1-score is the harmonic mean of precision and recall, which leans toward the smaller of the two and reflects how many truly positive samples are predicted correctly. Table 3 shows that VGG16 gave balanced results, performed best in overall prediction ability, correctly predicted positive samples and had the smallest false detection rate. The best results were therefore obtained with VGG16 as the backbone network. In conclusion, performance in terms of accuracy, sensitivity, recall and F1-score on the testing set was good, so the W-COVNet proposed in this study was effective.
Table 3. Classification results of the W-COVNet networks on two kinds of testing set (%).
Figure 7 depicts the W-COVNet training process. Scheme 3, which gave the best results, was selected to plot the training loss and testing accuracy for each epoch; the horizontal axis is the epoch, the left vertical axis the test accuracy, and the right vertical axis the training loss. Figure 7 shows that the training loss decreased with successive epochs until it settled at 0.1, while the test accuracy increased until it settled at 0.86, indicating that the network converged around the tenth epoch and performed well with high test accuracy.
Figure 7. Test accuracy and training loss of W-COVNet.
We used the confusion matrix [45] to display the test results of the experimental data. For each class C = 1, 2 (1: COVID-19, 2: non-COVID), we set the COVID-19 class as "positive" and the other class as "negative". The true negatives, true positives, false negatives and false positives were used to assess the model's diagnosis of the CT images. Figure 8 shows the confusion matrices of the W-COVNet proposed in this study with VGG11, VGG13, VGG16 and VGG19 as the backbone.
Figure 8. Confusion matrices of W-COVNet with VGG11 (a), VGG13 (b), VGG16 (c) and VGG19 (d) as the backbone.
Because the amount of training data was small, and to prevent overfitting, we used fivefold cross-validation. The data were divided into five equal parts; in each experiment, one part was used for testing and the remaining parts for training, and the experiment was repeated five times to compute the average. The first experiment used the 20% of the data in the red area as the test set and the rest as the training set; the second experiment used the next 20% as the test set, and so on, making the experimental results more reliable. This fivefold partitioning allowed the full dataset to be used, and the averaged results accurately represented model performance and verified the generalization ability of the network, as sketched below.
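The fivefold protocol described above corresponds to a standard KFold split, sketched here with scikit-learn; the shuffling, random seed, dataset size and the train_and_evaluate helper are assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold

n_images = 746                       # illustrative dataset size, not the exact Data1-Seg count
accuracies = []
for fold, (train_idx, test_idx) in enumerate(
        KFold(n_splits=5, shuffle=True, random_state=42).split(np.arange(n_images))):
    # Train on four parts (80%), test on the held-out part (20%).
    # acc = train_and_evaluate(train_idx, test_idx)   # hypothetical helper
    # accuracies.append(acc)
    print(f"fold {fold + 1}: {len(train_idx)} train / {len(test_idx)} test")

# print("average accuracy:", np.mean(accuracies))   # the reported average was 85.3%
```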
We experimentally determined that Scheme 3 gave the best result. Therefore, W-COVNet selected VGG16 as the backbone and performed fivefold cross-validation on the new dataset Data1-Seg. The results are shown in Table 4.
Table 4. Fivefold cross-validation results of W-COVNet (%).
We conducted ablation experiments with VGG16 as the backbone network to verify the effectiveness of the proposed W-COVNet method. Experiment No. 1 used the backbone network VGG16 to classify the Data1-Seg data after feature selection. Experiment No. 2 gave the result after training on Data1-Seg with feature fusion. Experiment No. 3 gave the result after preprocessing the images, with the training set enhanced by data augmentation. Experiment No. 4 gave the result after performing feature selection, data augmentation and feature fusion together. As shown in Table 5, experiments No. 1 to No. 4 validated the effectiveness of the individual improvements: experiments No. 2 and No. 3 validated the effectiveness of DBFF feature fusion and MDA data augmentation, respectively, and experiment No. 4 showed the performance of our proposed W-COVNet, which was the best. The method was therefore effective.
4.6. Grad-CAM++ based network visualization results
The objective of this study was to use only image-level labels, without pixel-level labeling of the lesion area; nevertheless, the deep learning network could locate the lesions through learning, allowing accurate disease diagnosis. Therefore, the Grad-CAM++ technique was used for visualization. Three images were selected for each category, COVID-19 and non-COVID. Each image is shown in three forms: the first is the original CT image; the second is the lung area image after removing the background and keeping only the feature region; and the third is the heat map visualized using Grad-CAM++. As shown in Figure 9, the red areas are feature areas and the blue areas are non-feature areas; where the features are denser, the corresponding feature values increase, leading to a brighter color. The results showed that Grad-CAM++ could not only provide an effective interpretation of the deep learning network but also help doctors find lesion areas.
Figure 9. Using Grad-CAM++ to visualize the feature areas.
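For completeness, overlaying a Grad-CAM++ style heat map on a masked CT slice can be done as below; the jet colormap, which renders strong activations red and weak ones blue as in Figure 9, and the blending factor are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from PIL import Image

def overlay_heatmap(ct: Image.Image, cam: np.ndarray, alpha: float = 0.4) -> np.ndarray:
    """Blend a [0, 1] class-activation map over a grayscale CT slice."""
    cam_img = Image.fromarray(np.uint8(cam * 255)).resize(ct.size)
    heat = cm.jet(np.asarray(cam_img) / 255.0)[..., :3]   # red = strong feature area
    gray = np.asarray(ct.convert("L"), dtype=np.float32)[..., None] / 255.0
    return (1 - alpha) * np.repeat(gray, 3, axis=2) + alpha * heat

# Usage with the earlier sketches (lung_only image and grad_cam output):
# plt.imshow(overlay_heatmap(lung_only, cam.numpy())); plt.axis("off"); plt.show()
```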
We compared the W-COVNet method with seven well-performing classification methods, NCA-ResNet [46], Fused-DenseNet-Tiny [47], BCNN_SVM [48], ECOVNet-EfficientNetB3 base [49], COVNet [35], Xception [50] and DTL-V19 [51], to demonstrate its effectiveness. All experiments used the dataset Data1-Seg with the same data preprocessing and performed fivefold cross-validation. The experimental results are shown in Table 6. Figures 10 and 11 show the test accuracy and training loss trends of our method and the seven comparison methods: all converge within 20 epochs, and our proposed W-COVNet has the highest accuracy after convergence, performing best with an accuracy of 85.3%. Using images with the background removed made the extracted features more accurate, and using two symmetric networks simultaneously allowed focusing on different features and improved feature richness; the effectiveness of the network was verified by the experiments. MDA data augmentation prevented overfitting of the model, further improving its performance. In addition, the backbone networks of BCNN_SVM were VGG16 and VGG19, and the backbone of DTL-V19 was VGG19; in this study, we also used VGG as the backbone network. Comparing the experimental results shows that VGG performed better than the other networks on this dataset.
Table 6. Performance comparison of the proposed W-COVNet with other studies (%).
Lung diseases are a major cause of death, and the emergence of COVID-19 over the last three years has taken a huge toll on humanity. The sudden outbreak overwhelmed medical systems because of the high contagiousness of the disease, and doctors and equipment were insufficient for rapid diagnosis and treatment, making artificial intelligence–assisted diagnosis especially important. In this study, we proposed a weakly supervised COVID-19 classification network (W-COVNet). Given the large amount of redundant information in CT data, W-COVNet could perform feature selection to eliminate unnecessary information and retain only the core feature regions, avoiding interference during deep learning training. Meanwhile, bilinear fusion provided a stronger feature representation than a single linear model. Because deep learning offers poor visualization, W-COVNet also used the currently popular Grad-CAM++ technique to visualize the deep learning process. We conducted extensive experiments to verify the performance of the network and achieved an average classification accuracy of 85.3% on the dataset with the background removed, better than that of other COVID-19 classification networks and classical classification networks. W-COVNet could extract effective features to help physicians diagnose quickly and discover lesion areas.
This study had certain limitations. The limited amount of data led to insufficient training. Moreover, the proposed W-COVNet had two disadvantages: 1) since the segmentation model was trained on the lung regions of CT images, the network could not perform weakly supervised segmentation of chest x-ray (CXR) images; 2) it could not judge the severity of the disease.
This study was supported by the Jilin Provincial Key Laboratory of Intelligent Computing in Medical Image (No. YDZJ202102CXJD026), the Innovation Team Projects of Universities in Guangdong (No. 2022KCXTD057), the Higher Education Natural Science Foundation of Anhui Province (No. 2022AH051099), the Key Project of Science Research in Universities of Anhui Province of China (KJ2021A1066) and the open Foundation of Anhui Engineering Research Center of Intelligent Perception and Elderly Care, Chuzhou University (No. 2022OPA03).
Conflict of interest
The authors declare no conflicts of interest.
References
[1]
O. A. Ataguba, J. E. Ataguba, Social determinants of health: the role of effective communication in the COVID-19 pandemic in developing countries, Global Health Action, 1 (2020), 1788263. https://doi.org/10.1080/16549716.2020.1788263 doi: 10.1080/16549716.2020.1788263
[3]
Q. Huang, X. Huang, Z. Kong, X. Li, D. Tao, Bi-phase evolutionary searching for biclusters in gene expression data, IEEE Trans. Evol. Comput., 5 (2018), 803–814. https://doi.org/10.1109/TEVC.2018.2884521 doi: 10.1109/TEVC.2018.2884521
[4]
Q. Huang, J. Yao, J. Li, M. Li, M. R. Pickering, X. Li, Measurement of quasi-static 3-D knee joint movement based on the registration from CT to US, IEEE Trans. Ultrason. Free, 6 (2020), 1141–50. https://doi.org/10.1109/TUFFC.2020.2965149 doi: 10.1109/TUFFC.2020.2965149
[5]
J. Xi, Z. Miao, L. Liu, X. Yang, W. Zhang, Q. Huang, et al., Knowledge tensor embedding framework with association enhancement for breast ultrasound diagnosis of limited labeled samples, Neurocomputing, 468 (2022), 60–70. https://doi.org/10.1016/j.neucom.2021.10.013 doi: 10.1016/j.neucom.2021.10.013
[6]
Q. Huang, F. Pan, W. Li, F. Yuan, H. Hu, J. Huang, et al., Differential diagnosis of atypical hepatocellular carcinoma in contrast-enhanced ultrasound using spatio-temporal diagnostic semantics, IEEE J. Biomed. Health, 10 (2020), 2860–2869. https://doi.org/10.1109/JBHI.2020.2977937 doi: 10.1109/JBHI.2020.2977937
[7]
J. Xi, D. Wang, X. Yang, W. Zhang, Q. Huang, Cancer omic data based explainable AI drug recommendation inference: A traceability perspective for explainability, Biomed. Signal Process., 79 (2023), 104144. https://doi.org/10.1016/j.bspc.2022.104144 doi: 10.1016/j.bspc.2022.104144
[8]
W. Shi, W. N. Chen, S. Kwong, J. Zhang, H. Wang, T. Gu, et al., A coevolutionary estimation of distribution algorithm for group insurance portfolio, IEEE Trans. Syst. Man CY-S, 11 (2021), 6714–28. https://doi.org/10.1109/TSMC.2021.3096013 doi: 10.1109/TSMC.2021.3096013
[9]
Y. Yuan, M. Chao, Y. C. Lo, Automatic skin lesion segmentation using deep fully convolutional networks with jaccard distance, IEEE Trans. Med. Imaging, 9 (2017), 1876–1886. https://doi.org/10.1109/TMI.2017.2695227 doi: 10.1109/TMI.2017.2695227
[10]
P. Liskowski, K. Krawiec, Segmenting retinal blood vessels with deep neural networks, IEEE Trans. Med. Imaging, 11 (2016), 2369–2380. https://doi.org/10.1109/TMI.2016.2546227 doi: 10.1109/TMI.2016.2546227
[11]
H. Fu, J. Cheng, Y. Xu, D. W. K. Wong, J. Liu, X. Cao, Joint optic disc and cup segmentation based on multi-label deep network and polar transformation, IEEE Trans. Med. Imaging, 7 (2018), 1597–1605. https://doi.org/10.1109/TMI.2018.2791488 doi: 10.1109/TMI.2018.2791488
[12]
H. Fu, Y. Xu, S. Lin, X. Zhang, D. W. K. Wong, J. Liu, et al., Segmentation and quantification for angle-closure glaucoma assessment in anterior segment OCT, IEEE Trans. Med. Imaging, 9 (2017), 1930–1938. https://doi.org/10.1109/TMI.2017.2703147 doi: 10.1109/TMI.2017.2703147
[13]
M. Anthimopoulos, S. Christodoulidis, L. Ebner, A. Christe, S. Mougiakakou, Lung pattern classification for interstitial lung diseases using a deep convolutional neural network, IEEE Trans. Med. Imaging, 5 (2016), 1207–1216. https://doi.org/10.1109/TMI.2016.2535865 doi: 10.1109/TMI.2016.2535865
[14]
J. M. Wolterink, T. Leiner, M. A. Viergever, I. Išgum, Generative adversarial networks for noise reduction in low-dose CT, IEEE Trans. Med. Imaging, 12 (2017), 2536–2345. https://doi.org/10.1109/TMI.2017.2708987 doi: 10.1109/TMI.2017.2708987
[15]
S. Albarqouni, C. Baur, F. Achilles, V. Belagiannis, S. Demirci, N. Navab, Aggnet: Deep learning from crowds for mitosis detection in breast cancer histology images, IEEE Trans. Med. Imaging, 5 (2016), 1313–1321. https://doi.org/10.1109/TMI.2016.2528120 doi: 10.1109/TMI.2016.2528120
[16]
X. Zhang, G. Wang, S. G. Zhao, CapsNet-COVID19: Lung CT image classification method based on CapsNet model, Math. Biosci. Eng., 19 (2022), 5055–5074. https://doi.org/10.3934/mbe.2022236 doi: 10.3934/mbe.2022236
[17]
M. J. Horry, S. Chakraborty, B. Pradhan, M. Fallahpoor, H. Chegeni, M. Paul, Factors determining generalization in deep learning models for scoring COVID-CT images, Math. Biosci. Eng., 18 (2021), 9264–9293. https://doi.org/10.3934/mbe.2021456 doi: 10.3934/mbe.2021456
[18]
A. Singh, K. K. Singh, M. Greguš, I. Izonin, CNGOD-An improved convolution neural network with grasshopper optimization for detection of COVID-19, Math. Biosci. Eng., 19 (2022), 12518–12531. https://doi.org/10.3934/mbe.2022584 doi: 10.3934/mbe.2022584
[19]
S. H. Wang, S. C. Satapathy, M. X. Xie, Y. D. Zhang, ELUCNN for explainable COVID-19 diagnosis, Soft Comput., 2023 (2023), 1–17. https://doi.org/10.1007/s00500-023-07813-w doi: 10.1007/s00500-023-07813-w
[20]
Y. Zhang, M. A. Khan, Z. Zhu, S. Wang, SNELM: SqueezeNet-guided ELM for COVID-19 recognition, Comput. Syst. Sci. Eng., 1 (2023), 13–26. https://doi.org/10.32604/csse.2023.034172 doi: 10.32604/csse.2023.034172
[21]
Y. Zhang, X. Zhang, W. Zhu, ANC: Attention network for COVID-19 explainable diagnosis based on convolutional block attention module, CMES COMP Model. Eng., 3 (2021), 1037–1058. https://doi.org/10.32604/cmes.2021.015807 doi: 10.32604/cmes.2021.015807
[22]
C. C. Lai, T. P. Shih, W. C. Ko, H. J. Tang, P. R. Hsueh, Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges, Int. J. Antimicrob. Ag., 3 (2020), 105924. https://doi.org/10.1016/j.ijantimicag.2020.105924 doi: 10.1016/j.ijantimicag.2020.105924
[23]
E. Irmak, Implementation of convolutional neural network approach for COVID-19 disease detection, Physiol. Genomics, 12 (2020), 590–601. https://doi.org/10.1152/physiolgenomics.00084.2020 doi: 10.1152/physiolgenomics.00084.2020
[24]
F. Xie, J. Xi, Q. Duan, Driver attribute filling for genes in interaction network via modularity subspace-based concept learning from small samples, Complexity, 2020 (2020), 1–12. https://doi.org/10.1155/2020/6643551 doi: 10.1155/2020/6643551
[25]
Q. Huang, F. Zhang, X. Li, A new breast tumor ultrasonography CAD system based on decision tree and BI-RADS features, World Wide Web, 21 (2018), 1491–1504. https://doi.org/10.1007/s11280-017-0522-5 doi: 10.1007/s11280-017-0522-5
[26]
G. Dong, Z. C. Zhang, J. Feng, X. M. Zhao, MorbidGCN: Prediction of multimorbidity with a graph convolutional network based on integration of population phenotypes and disease network, Brief Bioinform., 4 (2022). https://doi.org/10.1093/bib/bbac255 doi: 10.1093/bib/bbac255
[27]
S. Wang, P. Li, P. Chen, P. Phillips, G. Liu, S. Du, et al., Pathological brain detection via wavelet packet tsallis entropy and real-coded biogeography-based optimization, Fund. Inform., 4 (2017), 275–291. https://doi.org/10.3233/FI-2017-1492 doi: 10.3233/FI-2017-1492
[28]
L. V. Fulton, D. Dolezel, J. Harrop, Y. Yan, C. P. Fulton, Classification of Alzheimer's disease with and without imagery using gradient boosted machines and ResNet-50, Brain Sci., 9 (2019), 212. https://doi.org/10.3390/brainsci9090212 doi: 10.3390/brainsci9090212
[29]
Q. Guan, Y. Huang, Z. Zhong, Z. Zheng, L. Zheng, Y. Yang, Diagnose like a radiologist: Attention guided convolutional neural network for thorax disease classification, preprint, arXiv: 1801.09927.
[30]
P. P. Ypsilantis, G. Montana, Learning what to look in chest X-rays with a recurrent visual attention model, preprint, arXiv: 1701.06452.
[31]
E. Pesce, S. J. Withey, P. P. Ypsilantis, R. Bakewell, V. Goh, G. Montana, Learning to detect chest radiographs containing pulmonary lesions using visual attention networks, Med. Image Anal., 53 (2019), 26–38. https://doi.org/10.1016/j.media.2018.12.007 doi: 10.1016/j.media.2018.12.007
[32]
M. Toğaçar, B. Ergen, Z. Cömert, COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches, Comput. Biol. Med., 121 (2020), 103805. https://doi.org/10.1016/j.compbiomed.2020.103805 doi: 10.1016/j.compbiomed.2020.103805
[33]
J. P. Cohen, L. Dao, K. Roth, P. Morrison, Y. Bengio, A. F. Abbasi, et al., Predicting covid-19 pneumonia severity on chest x-ray with deep learning, Cureus J. Med. Sci., 12 (2020), e9448. https://doi.org/10.7759/cureus.9448 doi: 10.7759/cureus.9448
[34]
Q. Ni, Z. Y. Sun, L. Qi, W. Chen, Y. Yang, L. Wang, et al., A deep learning approach to characterize 2019 coronavirus disease (COVID-19) pneumonia in chest CT images, Eur. Radiol., 30 (2020), 6517–6527. https://doi.org/10.1007/s00330-020-07044-9 doi: 10.1007/s00330-020-07044-9
[35]
H. Ko, H. Chung, W. S. Kang, K. W. Kim, Y. Shin, S. J. Kang, et al., COVID-19 pneumonia diagnosis using a simple 2D deep learning framework with a single chest CT image: model development and validation, J. Med. Int. Res., 6 (2020), e19569. https://doi.org/10.2196/19569 doi: 10.2196/19569
[36]
X. Wang, X. Deng, Q. Fu, Q. Zhou, J. Feng, H. Ma, et al., A weakly-supervised framework for COVID-19 classification and lesion localization from chest CT, IEEE Trans. Med. Imaging, 8 (2020), 2615–2625. https://doi.org/10.1109/TMI.2020.2995965 doi: 10.1109/TMI.2020.2995965
[37]
A. I. Khan, J. L. Shah, M. M. Bhat, CoroNet: A deep neural network for detection and diagnosis of COVID-19 from chest x-ray images, Comput. Method Prog. Bio., 196 (2020), 105581. https://doi.org/10.1016/j.cmpb.2020.105581 doi: 10.1016/j.cmpb.2020.105581
[38]
L. Hussain, T. Nguyen, H. Li, A. A. Abbasi, K. J. Lone, Z. Zhao, et al., Machine-learning classification of texture features of portable chest X-ray accurately classifies COVID-19 lung infection, Biomed. Eng. Online, 19 (2020), 1–18. https://doi.org/10.1186/s12938-020-00831-x doi: 10.1186/s12938-020-00831-x
[39]
J. Zhao, Y. Zhang, X. He, P. Xie, COVID-CT-Dataset: A CT scan dataset about COVID-19, preprint, arXiv: 2003.13865.
[41]
D. P. Fan, T. Zhou, G. P. Ji, Y. Zhou, G. Chen, H. Fu, et al., Inf-Net: Automatic COVID-19 lung infection segmentation from CT images, IEEE Trans. Med. Imaging, 8 (2020), 2626–2637. https://doi.org/10.1109/TMI.2020.2996645 doi: 10.1109/TMI.2020.2996645
[42]
S. H. Wang, V. V. Govindaraj, J. M. Górriz, X. Zhang, Y. D. Zhang, Covid-19 classification by FGCNet with deep feature fusion from graph convolutional network and convolutional neural network, Inform. Fusion, 67 (2021), 208–229. https://doi.org/10.1016/j.inffus.2020.10.004 doi: 10.1016/j.inffus.2020.10.004
[43]
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, D. Batra, et al., Grad-CAM: Visual explanations from deep networks via gradient-based localization, Int. J. Comput. Vision, 2 (2020), 336–359. https://doi.org/10.1007/s11263-019-01228-7 doi: 10.1007/s11263-019-01228-7
[44]
A. Chattopadhay, A. Sarkar, P. Howlader, V. N. Balasubramanian, Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks, in 2018 IEEE winter conference on applications of computer vision (WACV), (2018), 839–847. https://doi.org/10.1109/WACV.2018.00097
[45]
D. Chicco, N. Tötsch, G. Jurman, The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation, Biodata Min., 1 (2021), 1–22. https://doi.org/10.1186/s13040-021-00244-z doi: 10.1186/s13040-021-00244-z
[46]
Q. Huang, Y. Lei, W. Xing, C. He, G. Wei, Z. Miao, et al., Evaluation of pulmonary edema using ultrasound imaging in patients with COVID-19 pneumonia based on a non-local Channel attention ResNet, Ultrasound Med. Biol., 5 (2022), 945–953. https://doi.org/10.1016/j.ultrasmedbio.2022.01.023 doi: 10.1016/j.ultrasmedbio.2022.01.023
[47]
F. J. P. Montalbo, Diagnosing Covid-19 chest x-rays with a lightweight truncated DenseNet with partial layer freezing and feature fusion, Biomed. Signal Process., 68 (2021), 102583. https://doi.org/10.1016/j.bspc.2021.102583 doi: 10.1016/j.bspc.2021.102583
[48]
R. Mastouri, N. Khlifa, H. Neji, S. Hantous-Zannad, A bilinear convolutional neural network for lung nodules classification on CT images, Int. J. Comput. Assist. Radiol. Surg., 16 (2021), 91–101. https://doi.org/10.1007/s11548-020-02283-z doi: 10.1007/s11548-020-02283-z
[49]
A. Garg, S. Salehi, M. La Rocca, R. Garner, D. Duncan, Efficient and visualizable convolutional neural networks for COVID-19 classification using Chest CT, Expert Syst. Appl., 195 (2022), 116540. https://doi.org/10.1016/j.eswa.2022.116540 doi: 10.1016/j.eswa.2022.116540
[50]
F. Chollet, Xception: Deep learning with depthwise separable convolutions, in Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), (2017), 1251–1258. https://doi.org/10.48550/arXiv.1610.02357
[51]
H. Panwar, P. Gupta, M. K. Siddiqui, R. Morales-Menendez, P. Bhardwaj, V. Singh, A deep learning and grad-CAM based color visualization approach for fast detection of COVID-19 cases using chest X-ray and CT-Scan images, Chaos Soliton Fract., 140 (2020), 110190. https://doi.org/10.1016/j.chaos.2020.110190 doi: 10.1016/j.chaos.2020.110190