
Lung cancer is a deadly disease. An early diagnosis can significantly improve patient survival and quality of life. One potential solution is using deep learning (DL) algorithms to automate the diagnosis using patient computed tomography (CT) scans. However, the limited availability of training data and the computational complexity of existing algorithms, as well as their reliance on high-performance systems, limit the potential of DL algorithms. To improve early lung cancer diagnoses, this study proposes a low-cost convolutional neural network (CNN) that uses a Mavage pooling technique to diagnose lung cancers.
The DL-based model uses five convolution layers with two residual connections and Mavage pooling layers. We trained the CNN using two publicly available datasets: the IQ-OTH/NCCD dataset and the chest CT scan dataset. Additionally, we integrated Mavage pooling into the AlexNet, ResNet-50, and GoogLeNet architectures to analyze the datasets. We evaluated the performance of the models based on accuracy and the area under the receiver operating characteristic curve (AUROC).
The CNN model achieved a 99.70% accuracy and a 99.66% AUROC when the scans were classified as either cancerous or non-cancerous. It achieved a 90.24% accuracy and a 94.63% AUROC when the scans were classified as containing either normal, benign, or malignant nodules. It achieved a 95.56% accuracy and a 99.37% AUROC when lung cancers were classified. Additionally, the results indicated that the diagnostic abilities of AlexNet, ResNet-50, and GoogLeNet were improved with the introduction of the Mavage pooling technique.
This study shows that a low-cost CNN can effectively diagnose lung cancers from patient CT scans. Utilizing the Mavage pooling technique significantly improves the CNN's diagnostic capabilities.
Citation: Ayomide Abe, Mpumelelo Nyathi, Akintunde Okunade. Lung cancer diagnosis from computed tomography scans using convolutional neural network architecture with Mavage pooling technique[J]. AIMS Medical Science, 2025, 12(1): 13-27. doi: 10.3934/medsci.2025002
Lung cancer is the leading cause of cancer-related deaths across all genders [1],[2]. The statistics on lung cancer incidence reveal that the disease has a low average 5-year survival rate [3]. An early diagnosis is critical for a favorable prognosis and an improved quality of life for patients [4]. Medical images such as chest computed tomography (CT) scans provide doctors with vital tools for early lung cancer diagnoses; however, the onset of the disease may present patterns that are not visible to the human eye [5],[6]. This could result in crucial information being missed and may lead to misdiagnoses [7],[8]. Moreover, the manual analysis of medical images is time-consuming, requires years of clinical experience, and can overburden radiologists, thus resulting in prolonged patient hospital waiting times [9]–[12].
Recent advancements in improving the early diagnosis of lung cancer have focused on automatic detection systems that use computer algorithms to detect malignancies in patient CT scans [13],[14]. Deep learning (DL) is a branch of machine learning (ML) that uses artificial neural networks to automatically extract higher-level features from input data in a hierarchical manner; it is currently at the forefront of research efforts and has shown potential in the early diagnosis of lung cancer [15]. DL methods have been deployed for the automated detection, classification, and segmentation of pulmonary nodules from CT scans [16]–[18]. Moreover, several studies have demonstrated that DL models can outperform the visual interpretation of CT images by radiologists [17],[19].
Despite these successes, there remains a gap between research and clinical applications [20]. The computational complexities of existing algorithms and their reliance on high-performance systems, as well as the limited availability of training data, are major drawbacks that hinder the effectiveness of DL algorithms in lung cancer diagnoses [21]–[23]. Researchers have employed various interventions to improve the performance of DL algorithms for lung cancer diagnoses. These include the use of state-of-the-art convolutional neural network (CNN) models with transfer learning, data augmentation, regularization techniques, and the use of customized CNN models [24],[25]. The customized models have demonstrated superior performances compared to the state-of-the-art CNN models in areas such as pulmonary nodule detection and the classification of lung cancer across various datasets [26]. These findings indicate that the implementation of customized CNN models capable of effectively modeling CT datasets may lead to improvements in lung cancer diagnoses.
This study aims to improve the early diagnosis of lung cancer by introducing a low-cost CNN that utilizes the Mavage pooling technique. Additionally, the study evaluates the effectiveness of the Mavage-pooling technique when incorporated into state-of-the-art CNN models for diagnosing lung cancer.
In 2012, computer vision technologies significantly advanced when Alex Krizhevsky trained a deep learning model using a CNN to categorize approximately 1.2 million images in the ImageNet LSVRC-2010 contest into 1000 classes [27]. This success sparked interest among researchers to extend its application to the medical field, particularly in cancer diagnoses. In 2015, Hua et al. [28] successfully used a CNN trained on the Lung Image Database Consortium dataset to classify pulmonary nodules, thus achieving a 73.3% sensitivity and a 78.7% specificity. In the same year, Ronneberger et al. [29] proposed the U-Net DL model, which surpassed previous approaches for biomedical image segmentation. In 2016, Gulshan et al. [30] demonstrated that DL algorithms can achieve over 90% sensitivity and specificity when detecting diabetic retinopathy and macular edema in retinal fundus photographs. In 2017, the U.S. Food and Drug Administration approved Arterys, a cloud-based DL application. Arterys's CardioAI was launched and analyzed cardiac Magnetic Resonance Imaging (MRI) images within seconds [31]. In the same year, Esteva et al. [32] classified skin lesions using a CNN that was trained on a dataset of 129,450 clinical images consisting of 2032 different diseases and reported a dermatologist-level accuracy. Saouli et al. [33] and Lorenzo et al. [34] performed automated tumor segmentation from MRI images in 2018 and 2019, respectively. In 2019, Moitra and Mandal performed the staging of non-small cell lung cancers using a CNN [35].
Since 2020, there have been significant improvements in DL algorithms for cancer diagnoses. Several studies have been conducted in which DL algorithms were used to diagnose various types of cancers that affected the skeletal, digestive, urinary, muscular, endocrine, lymphatic, respiratory, integumentary, cardiovascular, nervous, and reproductive systems [31],[36]–[40]. Currently, DL algorithms can be used to effectively detect lesions in medical images, segment tumor regions, generate diagnosis reports, propose treatment options, predict disease progression, and forecast patient survival outcomes.
Pooling is a non-linear process commonly used in CNNs to reduce the spatial dimension of a feature map while retaining important information [41]. The application of pooling to a feature map can potentially improve the CNN's performance [42]. Additionally, pooling reduces the memory consumption by decreasing the resolution of the input, thus resulting in faster learning [43]. Two commonly used pooling methods in CNNs are local pooling and global pooling [44]. Local pooling involves pooling data from small local regions, which is achieved by using a pooling window that divides the overall data into multiple local regions [45]. On the other hand, global pooling is used to create a single value representation of every activation present in the feature map [46]. The Max Pooling and Average Pooling techniques are commonly used in CNNs to achieve pooling.
Max Pooling is used to select the most prominent feature of the data within a defined pooling window [27],[47]. Eliminating non-maximal elements within the data can potentially help a model to generalize faster and prevent vanishing gradients in subsequent layers after the pooling layer [48]. On the other hand, Average Pooling computes the mean of the values within a defined pooling window, thus providing a generalized feature representation of the input [49]. Average Pooling helps a model by providing a smoother representation of the feature map, which can prevent outliers and reduce the risk of overfitting [50]. The approach is particularly useful when working with limited data or noisy inputs that could negatively impact the performance [45].
While the Max Pooling and Average Pooling techniques are both commonly used in CNNs, they do have some potential drawbacks [46]. For instance, using Max Pooling on feature maps with a wide range might confuse a model and limit its performance [44]. Moreover, Max Pooling may be less effective when there is a lot of noise in the data [51]. Using Average Pooling for features with values closer to zero could potentially lead to the vanishing gradient problem within the network [52]. Additionally, Average Pooling may result in the model learning noise in the data because all the elements within the pooling window are considered, thus potentially resulting in an unstable network [44]. To tackle these challenges, pooling techniques that incorporate both Max Pooling and Average Pooling may present a potential solution [53],[54].
Kareem et al. [55] developed an ML model using the IQ-OTH/NCCD dataset [56] to classify CT images into three classes: normal, benign, and malignant scans. They applied Gaussian filters, erosion, and an outlining algorithm to enhance the images and create borders around the lungs. Otsu's thresholding technique was used to segment the potential lung nodules from the lung parenchyma. A feature extraction module that consisted of a Gabor filter and a Gray Level Co-occurrence Matrix (GLCM) was employed to analyze the contrast, homogeneity, entropy, energy, and correlation. A support vector machine (SVM) classification module with three kernels was used to classify the CT images. The authors found that combining the Gabor filter and the GLCM with a polynomial kernel resulted in the best classification accuracy. Additionally, they conducted a train/test split of 7:3 and achieved an overall best accuracy of 89.8876%, with a 97.143% sensitivity and a 97.500% specificity.
AL-Huseiny et al. [24] utilized transfer learning with a modified GoogLeNet to detect lung malignancies by retraining it with the IQ-OTH/NCCD dataset. The dataset consisted of two classes of images: malignant and non-malignant. The images were preprocessed by centering, normalizing, and segmenting using techniques such as thresholding with a Gabor filter, image dilation, and erosion. Bounding boxes were drawn around the segmented lungs to create a rectangular region of interest, which was then passed to the CNN model. The pre-trained GoogLeNet's later layers were fine-tuned to learn deep features from the dataset, and the final layer was modified to perform a binary classification task to predict if an image contained a malignant nodule. After training for 12 epochs, the modified GoogLeNet achieved a 94.38% classification accuracy, a 93.7% specificity, and a 95.08% sensitivity on a total of 249 test images.
In a study by Mamun et al. [26], a custom CNN with 3 convolution layers and Max Pooling was used to classify lung cancers using the Chest CT-Scan images dataset [57]. The authors applied image preprocessing techniques, including resizing the images to a 64 × 64-pixel size, image denoising, image segmentation, and edge smoothing. They used a batch size of 13, employed the Adam optimization function, and trained the CNN for 50 epochs. The authors reported that the CNN achieved a 92.00% accuracy and a 98.21% AUC. Additionally, the authors analyzed the Chest CT-Scan images dataset using the ResNet-50, Xception, and Inception_V3 CNN models and reported accuracies of 84.13%, 82.10%, and 84.13%, respectively.
Publicly available lung cancer data from the Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) was used. It consists of 1097 CT images from 110 cases during 2019 [56]. The dataset includes JPG CT images of healthy individuals and those of lung cancer patients with various demographic characteristics. The oncologists and radiologists at the IQ-OTH/NCCD categorized the images into benign, malignant, and normal cases: 15 cases were classified as benign and contained a total of 120 images; 40 cases were classified as malignant and contained 416 images; and 55 cases were classified as normal and contained 561 images.
The Chest CT scan dataset is a publicly available dataset on Kaggle [57]. It consists of 1000 lung CT scans of patients with three different types of lung cancers alongside scans of healthy individuals, all in the JPG format. The identified lung cancer types were adenocarcinoma, squamous cell carcinoma, and large cell carcinoma. The images were divided into training, testing, and validation sets for each category.
The Mavage Pooling technique is derived from a combination of the Max Pooling and Average Pooling techniques. While it shares similarities with the two techniques, it differs from both in how the output is formed. For any input feature, the Mavage Pooling technique first makes a copy of the input feature to attain two independent features “A” and “B”. Thereafter, a pooling window of size “n × m” and stride “s” is defined. To perform downsampling, the pooling window is first employed to select the maximum activation of the first input feature “A” using the stride. Second, the pooling window computes the average of the activations of the second input feature “B” using the stride. Finally, the two downsampled feature maps are summed to attain a single feature map that is passed on to the next layer of the network. Figure 1 describes our approach and how it differs from both Max Pooling and Average Pooling. Similar to Max and Average Pooling, Mavage Pooling also allows for padding of the input feature.
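The procedure described above can be sketched in plain Python for a single-channel 2D feature map. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `mavage_pool2d` is hypothetical, the same stride is used in both directions, and no padding is applied.

```python
from typing import List

Matrix = List[List[float]]


def mavage_pool2d(feature: Matrix, n: int, m: int, s: int) -> Matrix:
    """Sum of max pooling and average pooling over the same n x m windows.

    Hypothetical helper: single channel, no padding, stride s in both directions.
    """
    rows, cols = len(feature), len(feature[0])
    out: Matrix = []
    for i in range(0, rows - n + 1, s):
        out_row = []
        for j in range(0, cols - m + 1, s):
            window = [feature[i + di][j + dj] for di in range(n) for dj in range(m)]
            max_part = max(window)                # copy "A": max pooling
            avg_part = sum(window) / len(window)  # copy "B": average pooling
            out_row.append(max_part + avg_part)   # sum of the two downsampled maps
        out.append(out_row)
    return out


feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
pooled = mavage_pool2d(feature_map, n=2, m=2, s=2)
print(pooled)  # [[9.5, 13.5], [25.5, 29.5]]
```

For the top-left window (1, 2, 5, 6), the maximum is 6 and the mean is 3.5, so the output is 9.5. In a PyTorch model, the same idea could plausibly be expressed by summing the outputs of `torch.nn.functional.max_pool2d` and `torch.nn.functional.avg_pool2d` called with identical kernel size, stride, and padding.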
We pre-processed the images in the datasets by normalizing their values with a mean and standard deviation of 0.5, and then resized them to 512 × 512 pixels. To prevent overfitting of the model due to the limited data size, we performed data augmentation by randomly erasing pixels within a patient CT scan and replacing them with random values. Additionally, we randomly split and concatenated the images along various dimensions to create multiple variations of the original image. We addressed the class imbalance in the benign class within the IQ-OTH/NCCD dataset by oversampling the images.
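The normalization and random-erasing steps can be illustrated with a short plain-Python sketch. It rests on assumptions the text does not confirm: pixel values are taken to lie in [0, 1], normalization is read as (x − 0.5) / 0.5, a single rectangular patch is erased, and the replacement values are drawn uniformly from [−1, 1]. The helper names `normalize` and `random_erase` are hypothetical.

```python
import random
from typing import List

Image = List[List[float]]


def normalize(image: Image, mean: float = 0.5, std: float = 0.5) -> Image:
    """Shift and scale every pixel: (x - mean) / std."""
    return [[(px - mean) / std for px in row] for row in image]


def random_erase(image: Image, height: int, width: int, rng: random.Random) -> Image:
    """Overwrite one randomly placed height x width patch with random values."""
    rows, cols = len(image), len(image[0])
    top = rng.randint(0, rows - height)    # randint is inclusive at both ends
    left = rng.randint(0, cols - width)
    erased = [row[:] for row in image]     # copy so the input scan is untouched
    for i in range(top, top + height):
        for j in range(left, left + width):
            erased[i][j] = rng.uniform(-1.0, 1.0)  # assumed value range
    return erased


rng = random.Random(0)
scan = [[0.0, 0.25, 0.5, 0.75, 1.0] for _ in range(5)]  # toy 5 x 5 "scan"
normalized = normalize(scan)
augmented = random_erase(normalized, height=2, width=2, rng=rng)
```

With these assumptions, pixel values in [0, 1] are mapped to [−1, 1], and each augmented copy differs from the original in at most one 2 × 2 patch.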
The CNN architecture proposed, named AutoLungCADetector, consists of a five-layered CNN, as illustrated in Figure 2. All convolution layers use a 3 × 3 kernel size with a stride and padding of 1. The Batch Normalization and ReLU activation functions are applied after each convolution layer. Mavage pooling with a 4 × 4 kernel size and stride of 1 is used at each layer, except at the fifth layer, where Mavage pooling with a 2 × 2 kernel size and stride of 1 is applied. Furthermore, residual connections [58] are introduced between the second and third layers and the fourth and fifth layers to enhance the network's robustness. Then, the output of the fifth layer is flattened and passed on to two fully connected layers.
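As a rough check on how these settings affect the spatial resolution, the standard output-size formula for convolution and pooling layers, floor((size + 2·padding − kernel) / stride) + 1, can be applied layer by layer. The sketch below assumes that pooling follows every convolution and uses no padding, neither of which is stated in the text; `out_size` is a hypothetical helper.

```python
def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 512  # input resolution stated in the pre-processing step
trace = [size]
for layer in range(1, 6):
    # 3 x 3 convolution with stride 1 and padding 1 keeps the resolution
    size = out_size(size, kernel=3, stride=1, padding=1)
    # Mavage pooling: 4 x 4 in layers 1-4, 2 x 2 in layer 5 (assumed: no padding)
    pool_kernel = 2 if layer == 5 else 4
    size = out_size(size, kernel=pool_kernel, stride=1, padding=0)
    trace.append(size)

print(trace)  # [512, 509, 506, 503, 500, 499]
```

Under these assumptions each 4 × 4 pooling layer with a stride of 1 trims only three pixels per side length, so most of the downsampling would have to come from elsewhere if the real network uses padding or a different layout.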
We examined the impact of the Mavage pooling technique using AutoLungCADetector. Our study involved training a CNN model on the IQ-OTH/NCCD and Chest CT-Scan image datasets. The IQ-OTH/NCCD dataset was used to detect pulmonary nodules in CT scans, while the Chest CT-Scan data was utilized to identify different types of lung cancers. The Chest CT-Scan images were initially divided into training, validation, and test datasets. We trained the model using the training set and then tested it on the test set. On the other hand, the IQ-OTH/NCCD dataset was initially sorted into benign, malignant, and normal CT scans. We serially grouped the images instead of randomizing to ensure that the same patient's scans were not included in both the training and test datasets, using a train/test split of 7:3. We also conducted experiments using both Max and Average Pooling in the AutoLungCADetector model. To validate our approach, we integrated Mavage Pooling into three cutting-edge CNN models (AlexNet [27], ResNet-50 [58], and GoogLeNet [59]) and compared the results with the traditional pooling for each selected network. We evaluated the models' performance based on the accuracy and the area under the receiver operating characteristic curve (AUROC).
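The idea of a serial, patient-level 7:3 split can be sketched as follows. This is an illustration only: the text does not say how images were mapped to cases or whether the ratio applies to cases or to images, so the dictionary layout, the file names, and the `split_by_case` helper are all assumptions.

```python
from typing import Dict, List, Tuple


def split_by_case(
    images_per_case: Dict[str, List[str]], train_fraction: float = 0.7
) -> Tuple[List[str], List[str]]:
    """Assign whole cases, in their stored order, to the train or test set."""
    case_ids = list(images_per_case)  # serial order, no shuffling
    n_train = round(len(case_ids) * train_fraction)
    train = [img for cid in case_ids[:n_train] for img in images_per_case[cid]]
    test = [img for cid in case_ids[n_train:] for img in images_per_case[cid]]
    return train, test


# Hypothetical case IDs and file names, for illustration only.
cases = {
    f"case_{i:02d}": [f"case_{i:02d}_slice_{k}.jpg" for k in range(3)]
    for i in range(10)
}
train_images, test_images = split_by_case(cases)
```

Because whole cases are assigned to one side of the split, no patient contributes images to both sets, which avoids inflating test accuracy through near-duplicate slices.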
Our experiments were conducted using Python, version 3.12.2, and PyTorch, version 2.2.2+cu118, within a Jupyter Notebook, version 7.0.8, running on an NVIDIA QUADRO RTX-3000 GPU. We employed the Adam optimization function, the CrossEntropyLoss function, a batch size of 16, and a learning rate of 1 × 10⁻⁴. We used the StepLR scheduler with a step size of 10 [60]. The networks were trained for 50 epochs, beyond which overfitting negatively impacted the model's performance.
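To make the schedule concrete, the sketch below computes the learning rate a step scheduler would produce over the 50 epochs. The decay factor is not reported in the text; the value 0.1 used here is the default `gamma` of PyTorch's `StepLR` and is an assumption, as is the helper name `step_lr`.

```python
def step_lr(base_lr: float, epoch: int, step_size: int = 10, gamma: float = 0.1) -> float:
    """Learning rate under a step schedule: multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


# Base learning rate 1e-4 and step size 10 are taken from the text; gamma is assumed.
schedule = [step_lr(1e-4, epoch) for epoch in range(50)]

for epoch in (0, 10, 20, 30, 40):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.1e}")
```

With the assumed decay factor, the learning rate drops by an order of magnitude every 10 epochs, so the final 10 epochs would run at roughly 1 × 10⁻⁸.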
To identify pulmonary nodules in patient CT scans, we used AutoLungCADetector trained on the IQ-OTH/NCCD dataset. AutoLungCADetector with Mavage Pooling achieved a 99.70% accuracy and a 99.53% AUROC when the patients' CT scans were classified as either normal or cancerous. It achieved an accuracy of 88.11% and an AUROC of 92.54% when the scans were classified as normal or as containing benign or malignant nodules. Furthermore, we substituted the Mavage Pooling layers with Max Pooling and Average Pooling layers. AutoLungCADetector with Max Pooling matched the accuracy of Mavage Pooling and recorded an AUROC that was 0.13 percentage points higher when the scans were classified as either normal or cancerous. It also achieved an accuracy that was 2.13 percentage points higher and an AUROC that was 2.09 percentage points higher in the three-class task. However, Mavage Pooling surpassed Average Pooling by 0.31 percentage points in accuracy in the two-class task, and by 1.22 percentage points in accuracy and 2.49 percentage points in AUROC in the three-class task. The detailed results are provided in Table 1.
Number of classes | Pooling technique | Accuracy (%) | AUROC (%) | Number of parameters |
2 | Average pool | 99.39 | 99.64 | 7 million |
2 | Max pool | 99.70 | 99.66 | 7 million |
2 | Mavage pool | 99.70 | 99.53 | 7 million |
3 | Average pool | 86.89 | 90.05 | 7 million |
3 | Max pool | 90.24 | 94.63 | 7 million |
3 | Mavage pool | 88.11 | 92.54 | 7 million |
We implemented AutoLungCADetector to identify lung cancer in patient CT scans using the chest CT scans dataset. Our results showed that AutoLungCADetector with Mavage Pooling achieved the highest classification accuracy of 95.56% and an AUROC of 99.37%, surpassing AutoLungCADetector with Max Pooling by margins of a 3.50% accuracy and a 0.70% AUROC. Additionally, it outperformed AutoLungCADetector with Average Pooling by margins of a 9.21% accuracy and a 1.39% AUROC. The results are presented in Table 2.
Pooling technique | Accuracy (%) | AUROC (%) | Number of parameters |
Average pool | 86.35 | 97.98 | 7 million |
Max pool | 92.06 | 98.67 | 7 million |
Mavage pool | 95.56 | 99.37 | 7 million |
We replaced the local pooling layers in three state-of-the-art CNN models, namely AlexNet, ResNet-50, and GoogLeNet, with Mavage Pooling. We trained the networks on both the IQ-OTH/NCCD and the chest CT scan datasets. The results showed that Mavage Pooling improved the performance of all CNN models. It improved the accuracy and AUROC of AlexNet by 0.62% and 0.17%, ResNet-50 by 0.31% and 1.20%, and GoogLeNet by 0.61% and 0.74%, respectively, when using the IQ-OTH/NCCD dataset. Furthermore, Mavage Pooling improved the accuracy and AUROC of AlexNet by 4.45% and 2.81%, ResNet-50 by 6.03% and 2.45%, and GoogLeNet by 1.90% and 0.67%, respectively, when using the chest CT scan dataset. GoogLeNet with Mavage Pooling achieved the highest classification accuracy among the three models, with an 89.94% accuracy and a 95.29% AUROC when detecting pulmonary nodules in CT scans, and a 93.65% accuracy and a 98.99% AUROC when classifying lung cancers. The results are presented in Table 3.
Dataset | Technique | Accuracy (%) | AUROC (%) | Number of parameters |
IQ-OTH/NCCD | AlexNet | 86.28 | 91.47 | 57 million |
IQ-OTH/NCCD | AlexNet + Mavage pool | 86.90 | 91.64 | 57 million |
IQ-OTH/NCCD | ResNet-50 | 87.80 | 92.80 | 23 million |
IQ-OTH/NCCD | ResNet-50 + Mavage pool | 88.11 | 94.00 | 23 million |
IQ-OTH/NCCD | GoogLeNet | 89.33 | 94.55 | 6 million |
IQ-OTH/NCCD | GoogLeNet + Mavage pool | 89.94 | 95.29 | 6 million |
Chest CT scan | AlexNet | 67.30 | 86.59 | 57 million |
Chest CT scan | AlexNet + Mavage pool | 71.75 | 89.40 | 57 million |
Chest CT scan | ResNet-50 | 82.86 | 95.84 | 23 million |
Chest CT scan | ResNet-50 + Mavage pool | 88.89 | 98.29 | 23 million |
Chest CT scan | GoogLeNet | 91.75 | 98.32 | 6 million |
Chest CT scan | GoogLeNet + Mavage pool | 93.65 | 98.99 | 6 million |
We conducted a performance comparison of AutoLungCADetector with previous studies that analyzed the IQ-OTH/NCCD and Chest CT scan datasets. The results demonstrated that AutoLungCADetector outperformed the previous approaches. It surpassed the approach proposed by AL-Huseiny et al. [24] by 5.32% and the approach proposed by Kareem et al. [55] by 0.35% on the IQ-OTH/NCCD dataset; moreover, it surpassed the approach by Mamun et al. [26] by 3.56% on the chest CT scan dataset. The detailed comparison is presented in Table 4.
Author | Dataset | Accuracy (%) | Number of classes |
AL-Huseiny et al. [24] | IQ-OTH/NCCD | 94.38 | 2 |
Ours | IQ-OTH/NCCD | 99.70 | 2 |
Kareem et al. [55] | IQ-OTH/NCCD | 89.89 | 3 |
Ours | IQ-OTH/NCCD | 90.24 | 3 |
Mamun et al. [26] | Chest CT scan | 92.00 | 4 |
Ours | Chest CT scan | 95.56 | 4 |
Lung cancer is a deadly disease that often goes undiagnosed until later stages, which limits survival outcomes [2]. An early diagnosis can greatly improve the prognosis and quality of life for affected individuals [4]. Artificial intelligence, particularly DL using CNNs, shows promise in automating the early diagnosis of lung cancer through the analysis of patient CT scans [13]–[19]. This study utilized a customized CNN (AutoLungCADetector) with a Mavage pooling layer to analyze publicly available lung CT datasets, including the IQ-OTH/NCCD and Chest CT Scan datasets. The approach was validated using three state-of-the-art CNN models.
AutoLungCADetector is comprised of about 7 million parameters. It possesses significantly fewer parameters and demonstrates an improved performance compared to state-of-the-art CNN models like AlexNet [27] and ResNet-50 [58]. The reduced number of parameters led to shorter training times and lower computational costs. The study's findings indicate that AutoLungCADetector is effective in diagnosing lung cancer in patient CT scans. These results corroborate prior studies that utilized customized CNNs to analyze patient chest CT data and reported a superior performance compared to available state-of-the-art CNN models [15],[16],[26]. These results may be due to the original design of state-of-the-art CNNs, which were intended to model large-scale non-medical datasets with varying resolutions. Moreover, chest CT data are limited, and training on large models with deeper layers may lead to underfitting, which can negatively impact the performance [58],[59].
Pooling is a technique used in CNNs to reduce the resolution of an image while preserving important features [44]. AutoLungCADetector incorporated various pooling techniques, including Max Pooling, Average Pooling, and Mavage Pooling. The study results demonstrated that the AutoLungCADetector that utilized Mavage Pooling achieved an improved performance compared to Max Pooling and Average Pooling. Additionally, Mavage Pooling was found to enhance the performance of three state-of-the-art CNNs: AlexNet, ResNet-50, and GoogLeNet, thus indicating its effectiveness as a downsampling technique in providing a more accurate representation of the convolved feature map at each layer of the network compared to Max Pooling and Average Pooling. Furthermore, the results of the study showed that the Mavage Pooling technique is beneficial for models with deeper architectures (Table 3) and may help overcome issues such as vanishing gradients.
CNNs are effective tools for the early diagnosis of lung cancers. A low-cost CNN has the potential to diagnose lung cancer effectively from patient CT scans. Utilizing the Mavage Pooling technique as a feature extraction and downsampling technique can significantly improve the CNN's diagnostic capabilities. In the future, we shall investigate the effectiveness of Mavage Pooling when used to diagnose other types of cancers and expand our approach to utilize alternate imaging techniques such as positron emission tomography and magnetic resonance imaging scans.
Ayomide Abe: conceptualization, software, validation, formal analysis, data curation, writing—original draft preparation, visualization, funding acquisition; all authors: methodology, writing—review and editing; Mpumelelo Nyathi and Akintunde Okunade: supervision, project administration. All authors have read and agreed to the published version of the manuscript.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
[1] |
Cruz CSD, Tanoue LT, Matthay RA (2011) Lung cancer: epidemiology, etiology, and prevention. Clin Chest Med 32: 605-644. https://doi.org/10.1016/j.ccm.2011.09.001 ![]() |
[2] |
Bray F, Ferlay J, Soerjomataram I, et al. (2018) Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 68: 394-424. https://doi.org/10.3322/caac.21492 ![]() |
[3] | Schabath MB, Cote ML (2019) Cancer progress and priorities: lung cancer. Cancer Epidemiol Biomarkers Prev 28: 1563-1579. https://doi.org/10.1158/1055-9965.EPI-19-0221 |
[4] | Yoon SM, Shaikh T, Hallman M (2017) Therapeutic management options for stage III non-small cell lung cancer. World J Clin Oncol 8: 1-20. https://doi.org/10.5306/wjco.v8.i1.1 |
[5] | Hoffman RM, Atallah RP, Struble RD, et al. (2020) Lung cancer screening with low-dose CT: a meta-analysis. J Gen Intern Med 35: 3015-3025. https://doi.org/10.1007/s11606-020-05951-7 |
[6] | Silvestri GA, Goldman L, Tanner NT, et al. (2023) Outcomes from more than 1 million people screened for lung cancer with low-dose CT imaging. Chest 164: 241-251. https://doi.org/10.1016/j.chest.2023.02.003 |
[7] | Shin HJ, Kim MS, Kho BG, et al. (2020) Delayed diagnosis of lung cancer due to misdiagnosis as worsening of sarcoidosis: a case report. BMC Pulm Med 20: 1-4. https://doi.org/10.1186/s12890-020-1105-2 |
[8] | Del Ciello A, Franchi P, Contegiacomo A, et al. (2017) Missed lung cancer: when, where, and why? Diagn Interv Radiol 23: 118-126. https://doi.org/10.5152/dir.2016.16187 |
[9] | Friedland B (2009) Medicolegal issues related to cone beam CT. Semin Orthod 15: 77-84. https://doi.org/10.1053/j.sodo.2008.09.010 |
[10] | Krupinski EA, Berbaum KS, Caldwell RT, et al. (2010) Long radiology workdays reduce detection and accommodation accuracy. J Am Coll Radiol 7: 698-704. https://doi.org/10.1016/j.jacr.2010.03.004 |
[11] | Abujudeh HH, Boland GW, Kaewlai R, et al. (2010) Abdominal and pelvic computed tomography (CT) interpretation: discrepancy rates among experienced radiologists. Eur Radiol 20: 1952-1957. https://doi.org/10.1007/s00330-010-1763-1 |
[12] | Jacobsen MM, Silverstein SC, Quinn M, et al. (2017) Timeliness of access to lung cancer diagnosis and treatment: a scoping literature review. Lung Cancer 112: 156-164. https://doi.org/10.1016/j.lungcan.2017.08.011 |
[13] | Sathyakumar K, Munoz M, Singh J, et al. (2020) Automated lung cancer detection using artificial intelligence (AI) deep convolutional neural networks: a narrative literature review. Cureus 12: e10017. https://doi.org/10.7759/cureus.10017 |
[14] | Huang S, Yang J, Shen N, et al. (2023) Artificial intelligence in lung cancer diagnosis and prognosis: current application and future perspective. Semin Cancer Biol 89: 30-37. https://doi.org/10.1016/j.semcancer.2023.01.006 |
[15] | Ardila D, Kiraly AP, Bharadwaj S, et al. (2019) End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat Med 25: 954-961. https://doi.org/10.1038/s41591-019-0447-x |
[16] | Alrahhal MS, Alqhtani E (2021) Deep learning-based system for detection of lung cancer using fusion of features. Int J Comput Sci Mobile Comput 10: 57-67. https://doi.org/10.47760/ijcsmc.2021.v10i02.009 |
[17] | Huang X, Shan J, Vaidya V (2017) Lung nodule detection in CT using 3D convolutional neural networks. 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017): 379-383. https://doi.org/10.1109/ISBI.2017.7950542 |
[18] | Guo Y, Song Q, Jiang M, et al. (2021) Histological subtypes classification of lung cancers on CT images using 3D deep learning and radiomics. Acad Radiol 28: e258-e266. https://doi.org/10.1016/j.acra.2020.06.010 |
[19] | Cui X, Zheng S, Heuvelmans MA, et al. (2022) Performance of a deep learning-based lung nodule detection system as an alternative reader in a Chinese lung cancer screening program. Eur J Radiol 146: 110068. https://doi.org/10.1016/j.ejrad.2021.110068 |
[20] | Dunn B, Pierobon M, Wei Q (2023) Automated classification of lung cancer subtypes using deep learning and CT-scan based radiomic analysis. Bioengineering 10: 690. https://doi.org/10.3390/bioengineering10060690 |
[21] | de Margerie-Mellon C, Chassagnon G (2023) Artificial intelligence: a critical review of applications for lung nodule and lung cancer. Diagn Interv Imaging 104: 11-17. https://doi.org/10.1016/j.diii.2022.11.007 |
[22] | Ahmed SF, Alam MSB, Hassan M, et al. (2023) Deep learning modelling techniques: current progress, applications, advantages, and challenges. Artif Intell Rev 56: 13521-13617. https://doi.org/10.1007/s10462-023-10466-8 |
[23] | Zhang J, Xia Y, Zeng H, et al. (2018) NODULe: combining constrained multi-scale LoG filters with densely dilated 3D deep convolutional neural network for pulmonary nodule detection. Neurocomputing 317: 159-167. https://doi.org/10.1016/j.neucom.2018.08.022 |
[24] | AL-Huseiny MS, Sajit AS (2021) Transfer learning with GoogLeNet for detection of lung cancer. Indones J Electr Eng Comput Sci 22: 1078-1086. https://doi.org/10.11591/ijeecs.v22.i2.pp1078-1086 |
[25] | Sakshiwala, Singh MP (2023) A new framework for multi-scale CNN-based malignancy classification of pulmonary lung nodules. J Ambient Intell Human Comput 14: 4675-4683. https://doi.org/10.1007/s12652-022-04368-w |
[26] | Mamun M, Mahmud MI, Meherin M, et al. (2023) LCDctCNN: lung cancer diagnosis of CT scan images using CNN based model. 2023 10th International Conference on Signal Processing and Integrated Networks (SPIN): 205-212. https://doi.org/10.1109/SPIN57001.2023.10116075 |
[27] | Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25. |
[28] | Hua KL, Hsu CH, Hidayati SC, et al. (2015) Computer-aided classification of lung nodules on computed tomography images via deep learning technique. Onco Targets Ther 8: 2015-2022. https://doi.org/10.2147/OTT.S80733 |
[29] | Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. Eds. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015. Springer International Publishing, 234-241. https://doi.org/10.1007/978-3-319-24574-4_28 |
[30] | Gulshan V, Peng L, Coram M, et al. (2016) Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316: 2402-2410. https://doi.org/10.1001/jama.2016.17216 |
[31] | Kaul V, Enslin S, Gross SA (2020) History of artificial intelligence in medicine. Gastrointest Endosc 92: 807-812. https://doi.org/10.1016/j.gie.2020.06.040 |
[32] | Esteva A, Kuprel B, Novoa RA, et al. (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542: 115-118. https://doi.org/10.1038/nature21056 |
[33] | Saouli R, Akil M, Kachouri R (2018) Fully automatic brain tumor segmentation using end-to-end incremental deep neural networks in MRI images. Comput Methods Programs Biomed 166: 39-49. https://doi.org/10.1016/j.cmpb.2018.09.007 |
[34] | Lorenzo PR, Nalepa J, Bobek-Billewicz B, et al. (2019) Segmenting brain tumors from FLAIR MRI using fully convolutional neural networks. Comput Methods Programs Biomed 176: 135-148. https://doi.org/10.1016/j.cmpb.2019.05.006 |
[35] | Moitra D, Mandal RK (2019) Automated AJCC staging of non-small cell lung cancer (NSCLC) using deep convolutional neural network (CNN) and recurrent neural network (RNN). Health Inf Sci Syst 7: 1-12. https://doi.org/10.1007/s13755-019-0077-1 |
[36] | Chaunzwa TL, Hosny A, Xu Y, et al. (2021) Deep learning classification of lung cancer histology using CT images. Sci Rep 11: 1-12. https://doi.org/10.1038/s41598-021-84630-x |
[37] | Anari S, Tataei Sarshar N, Mahjoori N, et al. (2022) Review of deep learning approaches for thyroid cancer diagnosis. Math Probl Eng 2022: 5052435. https://doi.org/10.1155/2022/5052435 |
[38] | Painuli D, Bhardwaj S, Köse U (2022) Recent advancement in cancer diagnosis using machine learning and deep learning techniques: a comprehensive review. Comput Biol Med 146: 105580. https://doi.org/10.1016/j.compbiomed.2022.105580 |
[39] | Hosseini SH, Monsefi R, Shadroo S (2024) Deep learning applications for lung cancer diagnosis: a systematic review. Multimed Tools Appl 83: 14305-14335. https://doi.org/10.1007/s11042-023-16046-w |
[40] | Aamir M, Rahman Z, Abro WA, et al. (2023) Brain tumor classification utilizing deep features derived from high-quality regions in MRI images. Biomed Signal Proces 85: 104988. https://doi.org/10.1016/j.bspc.2023.104988 |
[41] | Sun M, Song Z, Jiang X, et al. (2017) Learning pooling for convolutional neural network. Neurocomputing 224: 96-104. https://doi.org/10.1016/j.neucom.2016.10.049 |
[42] | Kumar RL, Kakarla J, Isunuri BV, et al. (2021) Multi-class brain tumor classification using residual network and global average pooling. Multimed Tools Appl 80: 13429-13438. https://doi.org/10.1007/s11042-020-10335-4 |
[43] | Zafar A, Aamir M, Mohd Nawi N, et al. (2022) A comparison of pooling methods for convolutional neural networks. Appl Sci 12: 8643. https://doi.org/10.3390/app12178643 |
[44] | Nirthika R, Manivannan S, Ramanan A, et al. (2022) Pooling in convolutional neural networks for medical image analysis: a survey and an empirical study. Neural Comput Appl 34: 5321-5347. https://doi.org/10.1007/s00521-022-06953-8 |
[45] | Xiong W, Zhang L, Du B, et al. (2017) Combining local and global: rich and robust feature pooling for visual recognition. Pattern Recogn 62: 225-235. https://doi.org/10.1016/j.patcog.2016.08.006 |
[46] | Dogan Y (2023) A new global pooling method for deep neural networks: Global average of top-k max-pooling. Trait Signal 40: 577-587. https://doi.org/10.18280/ts.400216 |
[47] | Nirthika R, Manivannan S, Ramanan A, et al. (2022) Pooling in convolutional neural networks for medical image analysis: a survey and an empirical study. Neural Comput Appl 34: 5321-5347. https://doi.org/10.1007/s00521-022-06953-8 |
[48] | Abuqaddom I, Mahafzah BA, Faris H (2021) Oriented stochastic loss descent algorithm to train very deep multi-layer neural networks without vanishing gradients. Knowledge-Based Systems 230: 107391. https://doi.org/10.1016/j.knosys.2021.107391 |
[49] | LeCun Y, Bottou L, Bengio Y, et al. (1998) Gradient-based learning applied to document recognition. Proc IEEE 86: 2278-2324. https://doi.org/10.1109/5.726791 |
[50] | Sabri N, Hamed HNA, Ibrahim Z, et al. (2020) A comparison between average and max-pooling in convolutional neural network for scoliosis classification. Int J 9. https://doi.org/10.30534/ijatcse/2020/9791.42020 |
[51] | Hyun J, Seong H, Kim E (2021) Universal pooling–a new pooling method for convolutional neural networks. Expert Syst Appl 180: 115084. https://doi.org/10.1016/j.eswa.2021.115084 |
[52] | Özdemir C (2023) Avg-topk: a new pooling method for convolutional neural networks. Expert Syst Appl 223: 119892. https://doi.org/10.1016/j.eswa.2023.119892 |
[53] | Zhou Q, Qu Z, Cao C (2021) Mixed pooling and richer attention feature fusion for crack detection. Pattern Recogn Lett 145: 96-102. https://doi.org/10.1016/j.patrec.2021.02.005 |
[54] | Hou Q, Zhang L, Cheng MM, et al. (2020) Strip pooling: Rethinking spatial pooling for scene parsing. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition: 4003-4012. |
[55] | Kareem HF, AL-Husieny MS, Mohsen FY, et al. (2021) Evaluation of SVM performance in the detection of lung cancer in marked CT scan dataset. Indones J Electr Eng Comput Sci 21: 1731-1738. https://doi.org/10.11591/ijeecs.v21.i3.pp1731-1738 |
[56] | Alyasriy H (2020) The IQ-OTHNCCD lung cancer dataset. Mendeley Data. Available from: https://data.mendeley.com/datasets/bhmdr45bh2/1. Retrieved June 15, 2024 |
[57] | Hany M (2020) Chest CT-scan images dataset. Available from: https://www.kaggle.com/datasets/mohamedhanyyy/chest-ctscan-images. Retrieved June 15, 2024 |
[58] | He K, Zhang X, Ren S, et al. (2016) Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition: 770-778. |
[59] | Szegedy C, Liu W, Jia Y, et al. (2015) Going deeper with convolutions. Proceedings of the IEEE conference on computer vision and pattern recognition: 1-9. |
[60] | Paszke A, Gross S, Massa F, et al. (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32. |
Number of classes | Pooling technique | Accuracy (%) | AUROC (%) | Number of parameters (millions) |
2 | Average pool | 99.39 | 99.64 | 7 |
2 | Max pool | 99.70 | 99.66 | 7 |
2 | Mavage pool | 99.70 | 99.53 | 7 |
3 | Average pool | 86.89 | 90.05 | 7 |
3 | Max pool | 90.24 | 94.63 | 7 |
3 | Mavage pool | 88.11 | 92.54 | 7 |
Pooling technique | Accuracy (%) | AUROC (%) | Number of parameters (millions) |
Average pool | 86.35 | 97.98 | 7 |
Max pool | 92.06 | 98.67 | 7 |
Mavage pool | 95.56 | 99.37 | 7 |
Dataset | Technique | Accuracy (%) | AUROC (%) | Number of parameters (millions) |
IQ-OTH/NCCD | AlexNet | 86.28 | 91.47 | 57 |
IQ-OTH/NCCD | AlexNet + Mavage pool | 86.90 | 91.64 | 57 |
IQ-OTH/NCCD | ResNet-50 | 87.80 | 92.80 | 23 |
IQ-OTH/NCCD | ResNet-50 + Mavage pool | 88.11 | 94.00 | 23 |
IQ-OTH/NCCD | GoogLeNet | 89.33 | 94.55 | 6 |
IQ-OTH/NCCD | GoogLeNet + Mavage pool | 89.94 | 95.29 | 6 |
Chest CT scan | AlexNet | 67.30 | 86.59 | 57 |
Chest CT scan | AlexNet + Mavage pool | 71.75 | 89.40 | 57 |
Chest CT scan | ResNet-50 | 82.86 | 95.84 | 23 |
Chest CT scan | ResNet-50 + Mavage pool | 88.89 | 98.29 | 23 |
Chest CT scan | GoogLeNet | 91.75 | 98.32 | 6 |
Chest CT scan | GoogLeNet + Mavage pool | 93.65 | 98.99 | 6 |
Author | Dataset | Accuracy (%) | Number of classes |
AL-Huseiny et al. [24] | IQ-OTH/NCCD | 94.38 | 2 |
Ours | IQ-OTH/NCCD | 99.70 | 2 |
Kareem et al. [55] | IQ-OTH/NCCD | 89.89 | 3 |
Ours | IQ-OTH/NCCD | 90.24 | 3 |
Mamun et al. [26] | Chest CT scan | 92.00 | 4 |
Ours | Chest CT scan | 95.56 | 4 |