Chronic obstructive pulmonary disease (COPD) is a heterogeneous inflammatory disease [1] characterized by persistent airflow limitation [2]. For this reason, the gold standard for diagnosing and evaluating COPD is the pulmonary function test (PFT) [2], which yields the ratio of forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC) and FEV1 as a percentage of the predicted value (FEV1 % predicted). The primary anatomical and pathophysiological manifestations of COPD are small airway lesions and emphysema [1,3]. Although PFTs can explain the impact of COPD on patients' symptoms and quality of life [4,5], they cannot reflect how the lung tissue of COPD patients changes as the COPD stage evolves, because PFT results change only after the lung tissue has been destroyed to a certain extent. It is therefore also difficult for the PFT to identify the etiology of COPD.

Compared with PFTs, computed tomography (CT) has been regarded as the most effective modality for characterizing and quantifying COPD [6]. For example, chest CT images can show that patients have mild centrilobular emphysema and reveal decreased exercise tolerance in smokers without airflow limitations in their PFT results [7]. In addition, chest CT images have also been used to quantitatively analyze bronchial [8,9], airway [10,11,12,13,14,15], emphysematous [16] and vascular [17,18] abnormalities in COPD patients by measuring the parameters of the bronchi and vasculature, or by using analysis methods for airway disease and emphysema. Furthermore, since radiomics was proposed in 2007 to mine more information from medical images by using advanced feature analysis [19], it has been widely used for the analysis of images of lung diseases [20,21,22,23,24] and other diseases [25,26]. Unlike in normal lungs, the lung texture and density of COPD patients are altered by the increased air content [20], leading to changes in the chest CT images. The radiomics features, which reflect these texture and density changes, can also predict severe COPD exacerbations [27], and they have been applied to the spirometric assessment of emphysema presence and COPD severity [28]. However, radiomics has not yet been extensively investigated in COPD. Potential applications of radiomics features in COPD, particularly for diagnosis, treatment and follow-up, together with future directions, have recently been outlined [29]. An important factor limiting the development of radiomics in COPD is the diffuse distribution of the disease in the lungs, which makes the COPD regions challenging to segment. In particular, because of the limited CT resolution, small airways (diameter < 2 mm) and their associated vessels cannot be segmented from chest CT images. However, COPD affects the lung parenchyma as a whole. Therefore, the lung radiomics features calculated from the lung parenchyma images are considered in this paper.

Most scholars have focused on improving the classifiers to obtain better classification results [30,31], but they have ignored the effect of the input features on classification. It is therefore necessary to construct lung radiomics combination features that characterize the COPD stage in order to improve the classification performance of existing classifiers. COPD and its cardiac complications (such as a higher resting heart rate) have been studied by using lung radiomics features [32,33,34,35,36,37], but the lung radiomics features have not yet been applied to COPD stage classification.

    Our contributions in this paper are briefly described as follows: 1) Lung radiomics features are applied to COPD stage classification. The best accuracy, precision, recall, F1-score and area under the curve (AUC) of the multi-layer perceptron (MLP) classifier with the 19 lung radiomics features selected by Lasso were 0.80, 0.80, 0.80, 0.80 and 0.94, respectively. 2) Two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed to characterize COPD stage evolution. Radiomics-FIRST or Radiomics-ALL improves the performance of the MLP classifier. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL improved by 3, 3, 3, 2 and 1%, respectively. 3) Compared to the classic convolutional neural network (CNN) application to chest high-resolution CT (HRCT) images, the machine learning (ML) methods based on lung radiomics features are more suitable and interpretable for the COPD classification.

The subjects were Chinese people aged 40 to 79 who were enrolled in the National Clinical Research Center of Respiratory Diseases in China from May 25, 2009 to January 11, 2011. The enrolled subjects rigorously met this study's inclusion and exclusion criteria [38] (Figure 1). The 468 subjects underwent chest HRCT scans at full inspiration, as well as PFTs. The COPD stage was diagnosed as Stage 0 to Ⅳ by using the PFT results, according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2008 criteria. COPD Stage 0 denotes subjects without COPD according to GOLD, although they may have some symptoms of respiratory disease. Please refer to our previous study [36] for a more detailed description of the materials.
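As a point of reference, the sketch below shows how a post-bronchodilator PFT result maps onto the stage labels used here, assuming the standard GOLD spirometric cut-offs (FEV1/FVC < 0.70 plus FEV1 % predicted thresholds of 80, 50 and 30%); the function name and thresholds are illustrative rather than taken from the study protocol.

```python
def gold_stage(fev1_fvc: float, fev1_pct_pred: float) -> int:
    """Map PFT results to the COPD stage labels used in this study.

    Stage 0 corresponds to subjects without airflow limitation
    (FEV1/FVC >= 0.70); Stages I-IV follow the GOLD spirometric
    FEV1 %-predicted cut-offs. Stages III and IV are merged later
    in the pipeline to balance the classes.
    """
    if fev1_fvc >= 0.70:
        return 0          # no airflow limitation ("Stage 0" in this study)
    if fev1_pct_pred >= 80:
        return 1          # mild
    if fev1_pct_pred >= 50:
        return 2          # moderate
    if fev1_pct_pred >= 30:
        return 3          # severe
    return 4              # very severe
```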

    The ethics committee of the National Clinical Research Center of Respiratory Diseases at Guangzhou Medical University in China has approved this study. All 468 subjects submitted written informed consent to the First Affiliated Hospital of Guangzhou Medical University before the chest HRCT scans and PFTs were performed.

    Our proposed method classifies the COPD stage by applying lung radiomics features selected by Lasso and lung radiomics combination features (characterizing the COPD stage evolution) based on best-performance ML methods. Figure 2 shows the overall block diagram of this study. We further describe our methods in the following sections, i.e., Section 2.2.1 (ROI segmentation), Section 2.2.2 (Lung radiomics feature calculation) and Section 2.2.3 (COPD stage classification).

    Figure 1.  Subject selection flow diagram and COPD stage distribution of the subjects in this study. (A) Subject selection flow diagram, showing the enrollment, inclusion criteria and exclusion criteria; (B) COPD stage distribution of the subjects in this study.
    Figure 2.  Overall block diagram for the proposed method for this study. (A) Region-of-interest (ROI) segmentation; (B) Lung radiomics feature calculation; (C) COPD stage classification based on ML.

As in previous radiomics analyses, we need to segment the region of interest (ROI) in the chest HRCT images and calculate the lung radiomics features based on that ROI; the lung radiomics features are then used to characterize and classify the COPD stage. Because COPD is diffusely distributed in the lungs, it is challenging to segment the COPD regions, and the limited CT resolution also prevents segmentation of small airways (diameter < 2 mm) and their associated vessels. However, COPD affects the lung parenchyma as a whole. Therefore, the lung parenchyma images segmented from the chest HRCT images were used as the ROI to calculate the lung radiomics features in this paper.

    A trained ResU-Net [39] was used to automatically segment the lung region from the chest HRCT images. The lung region includes both the right and left lungs in this study. The architecture of the ResU-Net has been described in detail in our previous paper [40]. In addition, three experienced radiologists have checked and modified all of the lung region segmentation images to ensure the accuracy of the segmentation images. Please refer to our previous study [36] for a more detailed description of the ROI segmentation process.

    The 468 sets of original lung parenchyma images were extracted from the chest HRCT images using our previous method [41]. Figure 3 shows the typical parenchyma images with the Hounsfield unit (HU) value in the transverse plane. PyRadiomics [42] with the predefined class of radiomics features was implemented to calculate the lung radiomics features based on the original and derived lung parenchyma images. Finally, 1316 lung radiomics features per subject were obtained. Please also refer to our previous study [36] for a more detailed description of the lung radiomics feature calculation.
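For readers who wish to reproduce the feature calculation, the following is a minimal PyRadiomics sketch; the file paths are placeholders, and the enabled image types (original, Laplacian-of-Gaussian with the sigmas appearing in the feature names of Table 4, and wavelet) are an assumption based on those names rather than the authors' published extraction settings.

```python
from radiomics import featureextractor   # PyRadiomics

# Extraction sketch for one subject; file paths are placeholders for the
# parenchyma image (HU values) and the lung-region mask.
extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()                       # all predefined feature classes
extractor.enableImageTypes(
    Original={},                                    # original parenchyma image
    LoG={'sigma': [1.0, 2.0, 4.0, 5.0]},            # Laplacian-of-Gaussian filtered images
    Wavelet={},                                     # wavelet-decomposed images
)

features = extractor.execute('parenchyma.nii.gz', 'lung_mask.nii.gz')
values = {k: v for k, v in features.items() if not k.startswith('diagnostics')}
print(len(values), 'radiomics features for this subject')
```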

    Figure 3.  Typical parenchyma images with the HU value in the transverse plane. The top figures are the lung region segmentation images with red color, and the bottom figures are the corresponding original lung parenchyma images with the HU value.

Lung radiomics features are the input features of the ML methods used to classify the COPD stage. Before the COPD stage classification, the least absolute shrinkage and selection operator (Lasso) [43,44] selects the lung radiomics features by establishing the relationship between the lung radiomics features and the COPD stages. Then, the selected lung radiomics features are used to select the best classifier from the different preset ML methods shown in Figure 2(C). This paper also introduces a radiomics combination strategy to construct the lung radiomics combination features for characterizing COPD stage evolution. Finally, the lung radiomics combination feature characterizing the COPD stage evolution is used to improve the performance of the best classifier.

First, 19 lung radiomics features per subject were selected by Lasso from the 1316 lung radiomics features. COPD Stages Ⅲ and Ⅳ were merged into one class to balance the COPD stages. The mathematical expression of the Lasso model [37] is given by Eq (1):

$$\arg\min\left\{\sum_{i=1}^{n}\left(y_i-\beta_0-\sum_{j=1}^{p}\beta_j x_{ij}\right)^{2}+\lambda\sum_{j=0}^{p}\left|\beta_j\right|\right\} \quad (1)$$

where $x_{ij}$ is the value of the independent variable (the 1316 lung radiomics features) after the normalization operation, $y_i$ is the value of the dependent variable (COPD Stages 0, Ⅰ, Ⅱ and Ⅲ & Ⅳ), $\lambda$ is the penalty parameter ($\lambda \geq 0$), $\beta_j$ is the regression coefficient, $i \in [1, n]$ and $j \in [0, p]$.

    The lung radiomics features of the four COPD stages are normalized by Eq (2).

$$x_{ij}=\frac{x_{ij}-\bar{x}_j}{x_{j\max}-x_{j\min}} \quad (2)$$

where $i = 1 \sim 468$ (468 subjects), $j = 1 \sim 1316$ (1316 kinds of lung radiomics features for each subject), $x_{ij}$ is the element in the $i$th row and $j$th column of the 468 × 1316 lung radiomics feature matrix, and $\bar{x}_j$, $x_{j\max}$ and $x_{j\min}$ are the mean, maximum and minimum of each kind of lung radiomics feature $x_j$, respectively.
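A minimal sketch of Eqs (1) and (2) using scikit-learn is shown below; the feature files, the penalty strength alpha and the use of plain Lasso (rather than, e.g., LassoCV) are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Placeholder inputs: X is the (468, 1316) radiomics feature matrix,
# y holds the COPD stage labels 0-3 (Stages III and IV merged).
X = np.load('radiomics_features.npy')
y = np.load('copd_stages.npy')

# Eq (2): mean-centred, range-scaled normalization of each feature column.
X_norm = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Eq (1): Lasso regression of the stage label on the normalized features.
# The penalty strength alpha (lambda) is an assumption and would normally
# be tuned, e.g. with LassoCV.
lasso = Lasso(alpha=0.01, max_iter=10000).fit(X_norm, y)

selected = np.flatnonzero(lasso.coef_)      # indices of features with non-zero beta_j
coefficients = lasso.coef_[selected]        # the beta_i reused later in Eq (3)
X_sel = X_norm[:, selected]                 # 19 selected features in the paper
print(f'{selected.size} features selected')
```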

Second, the best-performance classifier, i.e., the MLP classifier, was determined from among the following ML classifiers, as shown in Figure 2(C): Random Forest (RF) [45], AdaBoost (Ada) [46], Gradient Boosting (GB) [47], multi-layer perceptron (MLP) [48], linear discriminant analysis (LDA) [49] and support vector machine (SVM) [50]. The 468 subjects with the selected lung radiomics features were split 70%/30%: the data for 70% of the subjects were used to train the ML classifiers, and the data for the remaining 30% were used to validate or test the trained classifiers. The labels for the 468 subjects were the COPD stages (i.e., 0, 1, 2 and 3).
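Continuing the sketch above, the classifier comparison could look as follows; the default hyperparameters, the stratified split and the random seed are assumptions rather than the study's settings.

```python
from sklearn.model_selection import train_test_split
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.neural_network import MLPClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

# X_sel: 19 selected lung radiomics features; y: COPD stage labels 0-3
# (both from the Lasso sketch above).
X_train, X_test, y_train, y_test = train_test_split(
    X_sel, y, test_size=0.30, stratify=y, random_state=0)

classifiers = {
    'RF':  RandomForestClassifier(),
    'Ada': AdaBoostClassifier(),
    'GB':  GradientBoostingClassifier(),
    'MLP': MLPClassifier(max_iter=1000),
    'LDA': LinearDiscriminantAnalysis(),
    'SVM': SVC(probability=True),       # probability=True so ROC/AUC can be computed
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, 'test accuracy:', round(clf.score(X_test, y_test), 2))
```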

The evaluation metrics for the classifiers were the accuracy, precision, recall, F1-score and AUC. The AUC was calculated from the receiver operating characteristic (ROC) curve. The ROC curve, accuracy, precision, recall and F1-score can be calculated and plotted from the confusion matrix, which shows the distribution of the predicted and true labels (the COPD stages). The "classification_report" function from the standard scikit-learn Python package was used to calculate the accuracy, precision, recall and F1-score. The AUC is usually an evaluation metric for binary classification; its extension to this multi-class problem is described below.

Figure 4 shows the confusion matrix and a schematic diagram of the ROC curve drawing for multi-classification. Figure 4(A) shows the confusion matrix of the binary classification: the true positives (TP) and false positives (FP) are respectively the positive and negative samples predicted to be positive by the classifier, while the false negatives (FN) and true negatives (TN) are respectively the positive and negative samples predicted to be negative. Similarly, Figure 4(B) shows that T00–T33 on the diagonal represent the correct classification results, and F represents the wrong classification results. Figure 4(C) shows a schematic diagram of the ROC curve drawing for multi-classification. The test set's COPD stages (the true and predicted labels) are one-hot encoded with 0's and 1's, where the position of the 1 indicates the class; for example, COPD Stages 0, Ⅰ(1), Ⅱ(2) and Ⅲ & Ⅳ(3) are encoded as 1000, 0100, 0010 and 0001, respectively. If the predicted label (classification result) is correct, then the value at the position corresponding to 1 in the probability matrix P generated by the classifier is greater than the probability values at the positions corresponding to 0. The encoded COPD stages and their probability matrix P were used to draw the ROC curves according to the binary classification method.
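A minimal sketch of this one-vs-rest ROC construction and of the metric calculation with "classification_report", continuing the variables from the sketches above:

```python
from sklearn.preprocessing import label_binarize
from sklearn.metrics import roc_curve, auc, classification_report

clf = classifiers['MLP']                            # best-performing classifier
P = clf.predict_proba(X_test)                       # (n_samples, 4) probability matrix
Y = label_binarize(y_test, classes=[0, 1, 2, 3])    # e.g. Stage II (2) -> [0, 0, 1, 0]

# One ROC curve per stage (one-vs-rest), as sketched in Figure 4(C).
for k in range(4):
    fpr, tpr, _ = roc_curve(Y[:, k], P[:, k])
    print(f'stage {k}: AUC = {auc(fpr, tpr):.2f}')

# Micro-average over all stages, treating the flattened codes as binary labels.
fpr, tpr, _ = roc_curve(Y.ravel(), P.ravel())
print(f'micro-average AUC = {auc(fpr, tpr):.2f}')

# Accuracy, precision, recall and F1-score from the confusion-matrix counts.
print(classification_report(y_test, clf.predict(X_test)))
```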

Figure 4.  Confusion matrix and schematic diagram of the ROC curve drawing for multi-classification. (A) Confusion matrix for binary classification; (B) Confusion matrix for multi-classification; (C) Schematic diagram of the ROC curve drawing for multi-classification.

    Third, a radiomics combination strategy was proposed to construct the lung radiomics combination features used to characterize the COPD stage evolution. The lung radiomics combination features can be constructed using the class of the selected radiomics features. Eq (3) is the mathematical form of the lung radiomics combination strategy:

$$\mathrm{Radiomics\text{-}X}=\sum_{i=1}^{N}\beta_i x_i=\beta_1 x_1+\beta_2 x_2+\cdots+\beta_N x_N \quad (3)$$

    where N is the number of the selected lung radiomics features in each class, and βi is the coefficient of the selected lung radiomics features xi generated by Lasso.

The lung radiomics combination features are named Radiomics-X, where "X" is the class name of the selected lung radiomics features, such as FIRST, SHAPE, GLCM, GLRLM, GLSZM, NGTDM and GLDM in Figure 2(B). In particular, Radiomics-ALL is constructed by using all of the selected lung radiomics features and their coefficients generated by Lasso. Finally, Radiomics-FIRST and Radiomics-ALL were selected from the lung radiomics combination features (P-value < 0.05 between any two COPD stages) to characterize the COPD stage evolution; a minimal sketch of the construction follows.
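The sketch below implements Eq (3), assuming the selected features, their Lasso coefficients and their class labels are available from the earlier selection step (the variable names are illustrative).

```python
import numpy as np

def radiomics_combination(X_sel, coefficients, feature_classes, target_class='ALL'):
    """Eq (3): Radiomics-X = sum_i beta_i * x_i over the selected features whose
    class is X ('FIRST', 'SHAPE', 'GLCM', ..., or 'ALL' for every selected feature)."""
    feature_classes = np.asarray(feature_classes)
    if target_class == 'ALL':
        idx = np.arange(X_sel.shape[1])
    else:
        idx = np.flatnonzero(feature_classes == target_class)
    return X_sel[:, idx] @ np.asarray(coefficients)[idx]

# feature_classes lists the class of each selected feature (placeholder),
# e.g. 'FIRST' for the first-order features in Table 4; X_sel and
# coefficients come from the Lasso sketch above.
radiomics_first = radiomics_combination(X_sel, coefficients, feature_classes, 'FIRST')
radiomics_all = radiomics_combination(X_sel, coefficients, feature_classes, 'ALL')
```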

    Lastly, the 19 selected lung radiomics features with Radiomics-FIRST/Radiomics-ALL were used to train and validate the MLP classifier to improve the performance of the MLP classifier.

    Figure 5 presents the experimental design of this study to show the lung radiomics features' classification ability, highlight Lasso's role in classification and prove the proposed lung radiomics combination strategy's effectiveness in improving the performance of the MLP classifier.

    Figure 5.  Experimental design of this study.

First, to compare the classification ability of classic CNNs based on the chest HRCT images with that of the ML methods based on the lung radiomics features, we adopted two classic CNNs, DenseNet and GoogleNet, which achieved the best classification performance in our previous study [51]. The input images for DenseNet (2D/3D) and GoogleNet (2D/3D) were either the original chest HRCT images or the original parenchyma images. To obtain good classification performance from the chest HRCT images, we also applied the following processes to the original chest HRCT images before inputting them into the two classic CNNs: 1) deleting the non-lung-region images (fine selection); 2) deleting 1/6 of the images at the beginning and at the end of each scan (rough selection), so that only the middle 4/6 of the slices are used for COPD classification; and 3) applying multiple-instance learning [52]. In addition, multiple-instance learning was also applied to the original parenchyma images before they were input to the two classic CNNs. Table 1 shows the chest HRCT image data set division for the two classic CNNs, and Tables 2 and 3 show the detailed parameters set for DenseNet and GoogleNet training. For the 2D CNNs, the classification result was decided by the mean probability over all slices of the chest HRCT images; for the 3D CNNs, the classification result was determined by the probability computed from the selected slices of the chest HRCT images. The numbers of selected slices were 20 and 16, as shown in Tables 2 and 3.
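A minimal sketch of the rough selection and equidistant slice sampling described above; the exact sampling rule (one slice per equal segment via evenly spaced indices) is an assumption, and the volume is a placeholder array.

```python
import numpy as np

def rough_selection(volume):
    """Drop the first and last 1/6 of the slices, keeping the middle 4/6."""
    n = volume.shape[0]
    return volume[n // 6: n - n // 6]

def equidistant_slices(volume, k=20):
    """Take k evenly spaced slices (one per equal segment), e.g. to build the
    fixed-depth 3D input of 512 x 512 x 20 used for the 3D DenseNet."""
    idx = np.linspace(0, volume.shape[0] - 1, k).astype(int)
    return volume[idx]

# volume: (n_slices, 512, 512) chest HRCT scan of one subject (placeholder).
volume = np.zeros((300, 512, 512), dtype=np.int16)
volume_3d_input = equidistant_slices(rough_selection(volume), k=20)
print(volume_3d_input.shape)   # (20, 512, 512)
```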

    Table 1.  Chest HRCT image data set division for the two classic CNNs.
Data set (6:1:3) Stage 0 Stage Ⅰ Stage Ⅱ Stage Ⅲ & Ⅳ Total
Training set 76 subjects (43,694 images) 65 subjects (40,550 images) 75 subjects (43,510 images) 64 subjects (40,944 images) 280 subjects (168,698 images)
Validation set 13 subjects (7705 images) 11 subjects (6981 images) 12 subjects (7672 images) 11 subjects (6924 images) 47 subjects (29,282 images)
Test set 41 subjects (23,940 images) 33 subjects (21,141 images) 35 subjects (22,283 images) 32 subjects (20,478 images) 141 subjects (87,842 images)

    Table 2.  Parameters set to train the DenseNet.
    DenseNet: Input images Batch size (2D/3D) Input size (2D/3D) Epoch (2D/3D) Drop rate (2D/3D)
    Original chest HRCT images 20/2 512 × 512/512 × 512 × 20* 50/50 0.5/0.2
Fine selection (HRCT images) (same parameters as above)
Rough selection (HRCT images) (same parameters as above)
Original parenchyma images (same parameters as above)
Multiple instance (HRCT images) 16/2 512 × 512**/512 × 512 × 16*** 50/50 0.5/0.2
Multiple instance (parenchyma) (same parameters as above)
    Note:* Each case (a set of chest HRCT images) was equally divided into 20 segments, with one slice taken equidistantly to obtain 20 slices in each case.
    ** After rough selection, each case was equally divided into 10 bags, with one slice taken randomly to obtain 10 slices in each case.
    *** After rough selection, each case was equally divided into 16 bags, with one slice taken equidistantly to obtain 16 slices in each case.

    Table 3.  Parameters set to train the GoogleNet.
GoogleNet: Input images Batch size (2D/3D) Input size (2D/3D) Epoch (2D/3D) Drop rate (2D/3D)
    Original chest HRCT images 16/2 512 × 512/512 × 512 × 20* 50/50 0.2/0.2
Fine selection (HRCT images) (same parameters as above)
Rough selection (HRCT images) (same parameters as above)
Original parenchyma images (same parameters as above)
Multiple instance (HRCT images) 16/2 512 × 512**/512 × 512 × 16*** 50/50 0.2/0.2
Multiple instance (parenchyma) (same parameters as above)
    Note:* Each case (a set of chest HRCT images) was equally divided into 20 segments, with one slice taken equidistantly to obtain 20 slices in each case.
    ** After rough selection, each case was equally divided into 10 bags, with one slice taken randomly to obtain 10 slices in each case.
    *** After rough selection, each case was equally divided into 16 bags, with one slice taken equidistantly to obtain 16 slices in each case.


Second, the 1316 lung radiomics features and the 19 selected lung radiomics features were respectively used to train and test the different ML classifiers, as shown in Figure 5, to highlight Lasso's role in classification. The selected lung radiomics features were determined by Lasso from the lung radiomics features directly calculated by PyRadiomics. Then, the ML classifier with the best classification performance was determined.

    Finally, the lung radiomics combination features that characterized the COPD stage were used to improve the performance of the best classifier.

    This section shows the results of Lasso, the radiomics combination strategy and the experiments.

    Table 4 presents the lung radiomics features selected by Lasso in detail, including the name, class and regression coefficient. To conveniently describe the selected lung radiomics features, we define the selected lung radiomics features as Radiomics1–19. Figure 6 further shows more detailed information on Radiomics1–19. Specifically, Figure 6(A) shows that Radiomics18 was the dominant feature in Radiomics1–19. Figure 6(B) shows that the FIRST class had seven selected lung radiomics features, i.e., the maximum number in all classes. Figure 6(C) also shows that the FIRST class is the most important of all classes.

    Table 4.  19 lung radiomics features selected by Lasso.
    Definition Name of the 19 selected lung radiomics features Class Coefficient
    Radiomics1 original_shape_Elongation Shape 0.0056
    Radiomics2 original_shape_Maximum2DDiameterSlice Shape -0.0789
    Radiomics3 original_shape_Sphericity Shape 0.0624
    Radiomics4 log.sigma.1.0.mm.3D_firstorder_Maximum First Order 0.0665
    Radiomics5 log.sigma.1.0.mm.3D_glcm_ClusterProminence GLCM 1 -0.0425
    Radiomics6 log.sigma.1.0.mm.3D_glszm_ZoneEntropy GLSZM 2 0.0394
    Radiomics7 log.sigma.2.0.mm.3D_firstorder_Maximum First Order 0.0129
    Radiomics8 log.sigma.2.0.mm.3D_ngtdm_Contrast NGTDM 3 -0.0318
    Radiomics9 log.sigma.2.0.mm.3D_gldm_DependenceVariance GLDM 4 -0.0136
    Radiomics10 log.sigma.4.0.mm.3D_firstorder_10Percentile First Order -0.0760
    Radiomics11 log.sigma.5.0.mm.3D_firstorder_10Percentile First Order -0.1669
    Radiomics12 wavelet.LLH_firstorder_RootMeanSquared First Order -0.0252
    Radiomics13 wavelet.HLH_firstorder_Mean First Order 0.0599
    Radiomics14 wavelet.HLH_glcm_Idmn GLCM 1 -0.0022
    Radiomics15 wavelet.HLH_ngtdm_Busyness NGTDM 3 0.0444
    Radiomics16 wavelet.HHL_gldm_SmallDependenceLowGrayLevelEmphasis GLDM 4 -0.0168
    Radiomics17 wavelet.HHH_glszm_GrayLevelNonUniformityNormalized GLSZM 2 -0.0043
    Radiomics18 wavelet.LLL_firstorder_10Percentile First Order -0.5314
    Radiomics19 wavelet.LLL_glcm_Imc2 GLCM 1 0.1383
    Note: 1 Gray level co-occurrence matrix.
    2 Gray level size zone matrix.
    3 Neighboring gray tone difference matrix.
    4 Gray level dependence matrix.

    Figure 6.  Detailed information on Radiomics1–19. (A) Comparison of regression coefficients; (B) Feature numbers for each class; (C) Feature importance for each class.

The P-values and significant differences for Radiomics1–19 according to the COPD stage evolution were further investigated. A Bonferroni-Dunn multiple comparisons test was applied to calculate the P-values among Radiomics1–19 according to COPD stage. Figure 7(N) and Table 5 show no significant differences for Radiomics14, regardless of COPD stage. Figure 7(A)–(C), (E), (H), (I), (L), (M), (O), (R) and (S) and Table 5 show that only Radiomics1–3, 9, 13, 15 and 19 significantly increased, and that Radiomics5, 8, 12 and 18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stage Ⅰ. Figure 7(A), (C), (D), (F), (H), (J)–(L), (O), (P), (R) and (S) and Table 5 show that only Radiomics1, 3, 4, 6, 13, 15 and 19 significantly increased, and that Radiomics8, 10–12, 16 and 18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stage Ⅱ. Figure 7(A), (C)–(E), (F)–(H), (J)–(M) and (O)–(S) and Table 5 show that only Radiomics1, 3, 4, 6, 7, 13, 15 and 19 significantly increased, and that Radiomics5, 8, 10–12 and 16–18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stages Ⅲ & Ⅳ. Figure 7(A)–(D), (F)–(K), (M), (R) and (S) and Table 5 show that only Radiomics1, 3, 4, 6, 7, 13 and 19 significantly increased, and that Radiomics2, 8–11 and 18 significantly decreased, with COPD stage evolution from COPD Stage Ⅰ to COPD Stages Ⅲ & Ⅳ. Figure 7(A)–(C), (J)–(L), (O), (R) and (S) and Table 5 show that only Radiomics1–3, 15 and 19 significantly increased, and that Radiomics10–12 and 18 significantly decreased, with COPD stage evolution from COPD Stage Ⅱ to COPD Stages Ⅲ & Ⅳ. Unfortunately, for each of the 19 selected lung radiomics features there was at least one pair of COPD stages between which the difference was not significant.
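A minimal sketch of such pairwise comparisons, using the scikit-posthocs package as an assumed (not the authors') implementation of Dunn's test with Bonferroni adjustment:

```python
import pandas as pd
import scikit_posthocs as sp   # assumed library choice for Dunn's test

# One row per subject: a selected feature (here Radiomics18, the 18th of the
# 19 selected features) and the subject's COPD stage; X_sel and y come from
# the earlier selection sketch.
df = pd.DataFrame({'radiomics18': X_sel[:, 17], 'stage': y})

# Pairwise Dunn's test between stages with Bonferroni adjustment, analogous
# to the Bonferroni-Dunn comparisons summarized in Table 5.
pvals = sp.posthoc_dunn(df, val_col='radiomics18', group_col='stage',
                        p_adjust='bonferroni')
print(pvals.round(4))
```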

    Figure 7.  Box plots showing the 19 selected lung radiomics features at different COPD stages. (A)–(S) show the box plots for Radiomics1–19 at different COPD stages, respectively.
    Table 5.  P-values for the 19 selected lung radiomics features for the different COPD stages.
    Features 0 vs. Ⅰ 0 vs. Ⅱ 0 vs. Ⅲ & Ⅳ Ⅰ vs. Ⅱ Ⅰ vs. Ⅲ & Ⅳ Ⅱ vs. Ⅲ & Ⅳ
    Radiomics1 < 0.0001 < 0.0001 < 0.0001 0.9999 (ns1) < 0.0001 < 0.0001
    Radiomics2 0.0039 0.4406 (ns) > 0.9999 (ns) 0.5975 (ns) < 0.0001 0.0164
    Radiomics3 0.0004 < 0.0001 < 0.0001 > 0.9999 (ns) < 0.0001 0.0026
    Radiomics4 > 0.9999 (ns) 0.0244 0.0016 0.0066 0.0004 > 0.9999 (ns)
    Radiomics5 0.0009 0.4707 (ns) 0.0243 0.2483 (ns) > 0.9999 (ns) > 0.9999 (ns)
    Radiomics6 0.8609 (ns) < 0.0001 < 0.0001 0.0016 < 0.0001 0.7552 (ns)
    Radiomics7 > 0.9999 (ns) 0.1892 (ns) 0.0005 0.2978 (ns) 0.0013 0.3961 (ns)
    Radiomics8 < 0.0001 < 0.0001 < 0.0001 > 0.9999 (ns) 0.0229 0.5546 (ns)
    Radiomics9 0.0021 > 0.9999 (ns) 0.6507 (ns) 0.0026 < 0.0001 0.6705 (ns)
    Radiomics10 > 0.9999 (ns) 0.0001 < 0.0001 < 0.0001 < 0.0001 0.0045
    Radiomics11 0.0626 (ns) < 0.0001 < 0.0001 0.0001 < 0.0001 0.0055
    Radiomics12 < 0.0001 0.0006 < 0.0001 0.4505 (ns) 0.2677 (ns) 0.0006
    Radiomics13 < 0.0001 < 0.0001 < 0.0001 0.1717 (ns) < 0.0001 0.0800 (ns)
    Radiomics14 > 0.9999 (ns) > 0.9999 (ns) > 0.9999 (ns) > 0.9999 (ns) 0.1873 (ns) 0.6492 (ns)
    Radiomics15 < 0.0001 0.0011 < 0.0001 > 0.9999 (ns) 0.0878 (ns) 0.0019
    Radiomics16 0.7928 (ns) 0.0077 0.0005 0.6650 (ns) 0.1161 (ns) 0.8141 (ns)
    Radiomics17 > 0.9999 (ns) 0.1001 (ns) 0.0011 > 0.9999 (ns) 0.1153 (ns) 0.9721 (ns)
    Radiomics18 < 0.0001 < 0.0001 < 0.0001 0.1691 (ns) < 0.0001 < 0.0001
    Radiomics19 < 0.0001 < 0.0001 < 0.0001 > 0.9999 (ns) < 0.0001 < 0.0001
    Note: 1 ns: no significance.

    Table 6.  P-values for the seven lung radiomics combination features according to COPD stages.
    Features 0 vs. Ⅰ 0 vs. Ⅱ 0 vs. Ⅲ & Ⅳ Ⅰ vs. Ⅱ Ⅰ vs. Ⅲ & Ⅳ Ⅱ vs. Ⅲ & Ⅳ
    Radiomics-SHAPE > 0.999 (ns1) 0.4005 (ns) < 0.0001 0.2587 (ns) < 0.0001 0.0003
    Radiomics-FIRST < 0.0001 < 0.0001 < 0.0001 0.0003 < 0.0001 < 0.0001
    Radiomics-GLCM < 0.0001 < 0.0001 < 0.0001 > 0.999 (ns) < 0.0001 < 0.0001
    Radiomics-GLSZM 0.9780 (ns) < 0.0001 < 0.0001 0.0010 < 0.0001 0.7294 (ns)
    Radiomics-NGTDM < 0.0001 < 0.0001 < 0.0001 > 0.999 (ns) < 0.0051 0.0211
    Radiomics-GLDM > 0.999 (ns) 0.0111 < 0.0001 0.0057 < 0.0001 0.3038 (ns)
    Radiomics-ALL < 0.0001 < 0.0001 < 0.0001 0.0006 < 0.0001 < 0.0001
    Note:1 ns: no significance.


The P-values and significant differences among the different COPD stages for the seven lung radiomics combination features are shown in Figure 8 and Table 6. The Bonferroni-Dunn multiple comparisons test was also applied to calculate the P-values for the seven lung radiomics combination features according to COPD stage.

    Figure 8.  Box plots showing the seven lung radiomics combination features at different COPD stages. (A)–(G) show the box plots for the seven lung radiomics combination features at COPD Stages 0, Ⅰ, Ⅱ and Ⅲ & Ⅳ, respectively.

Figure 8(B), (C), (E) and (G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLCM, Radiomics-NGTDM and Radiomics-ALL significantly increased from COPD Stage 0 to Ⅰ. Figure 8(B)–(G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLCM, Radiomics-GLSZM, Radiomics-NGTDM, Radiomics-GLDM and Radiomics-ALL significantly increased from COPD Stage 0 to Ⅱ. Figure 8(A)–(G) and Table 6 show that all seven of the lung radiomics combination features significantly increased from COPD Stage 0 to Stages Ⅲ & Ⅳ, and from COPD Stage Ⅰ to Stages Ⅲ & Ⅳ. Figure 8(B), (D), (F) and (G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLSZM, Radiomics-GLDM and Radiomics-ALL significantly increased from COPD Stage Ⅰ to Ⅱ. Figure 8(A)–(C), (E) and (G) and Table 6 show that only Radiomics-SHAPE, Radiomics-FIRST, Radiomics-GLCM, Radiomics-NGTDM and Radiomics-ALL significantly increased from COPD Stage Ⅱ to Stages Ⅲ & Ⅳ. Therefore, only Radiomics-FIRST and Radiomics-ALL significantly increased at every step of the COPD stage evolution (P-value < 0.05).

    This section shows the classification results for the CNN classifier, ML classifier and our proposed method.

Figures 9–11 show the classification results for the DenseNet and GoogleNet. The other evaluation metrics in Tables 7 and 8 were calculated from Figures 10 and 11, respectively. In Figures 10 and 11, the confusion matrices visually show the classification effect of each COPD stage.

    Figure 9.  ROC curves derived from the CNNs. (A) ROC curves from DenseNet; (B) ROC curves from GoogleNet.
    Figure 10.  Confusion matrix results for the DenseNet. (A) Confusion matrix results for the DenseNet with 2D input images; (B) Confusion matrix results for the DenseNet with 3D input images.
    Figure 11.  Confusion matrix results for the GoogleNet. (A) Confusion matrix results for the GoogleNet with 2D input images; (B) Confusion matrix results for the GoogleNet with 3D input images.
    Figure 12.  ROC curves for the different ML classifiers. (A) ROC curves for the different ML classifiers with 1316 lung radiomics features; (B) ROC curves for the different ML classifiers with 19 lung radiomics features selected by Lasso.
    Table 7.  Other evaluation metrics for applying the DenseNet to the test set.
    DenseNet: Input images Accuracy (2D/3D) Precision (2D/3D) Recall (2D/3D) F1-score (2D/3D)
    Original chest HRCT images 0.39/0.54 0.22/0.56 0.39/0.54 0.28/0.51
    Fine selection (HRCT images) 0.41/0.54 0.38/0.58 0.41/0.54 0.33/0.52
    Rough selection (HRCT images) 0.34/0.57 0.45/0.65 0.34/0.57 0.24/0.54
    Multiple instance (HRCT images) 0.40/0.59 0.32/0.62 0.40/0.59 0.32/0.57
    Original parenchyma images 0.47/0.58 0.47/0.61 0.47/0.58 0.43/0.58
    Multiple instance (parenchyma) 0.49/0.50 0.48/0.59 0.49/0.50 0.44/0.44

    Table 8.  Other evaluation metrics for applying the GoogleNet to the test set.
    GoogleNet: Input images Accuracy (2D/3D) Precision (2D/3D) Recall (2D/3D) F1-score (2D/3D)
    Original chest HRCT images 0.55/0.40 0.67/0.49 0.55/0.40 0.50/0.37
    Fine selection (HRCT images) 0.39/0.48 0.40/0.56 0.39/0.48 0.37/0.44
    Rough selection (HRCT images) 0.37/0.36 0.31/0.37 0.37/0.36 0.33/0.32
    Multiple instance (HRCT images) 0.39/0.38 0.37/0.36 0.39/0.38 0.28/0.33
    Original parenchyma images 0.55/0.39 0.56/0.47 0.55/0.39 0.55/0.34
    Multiple instance (parenchyma) 0.41/0.49 0.54/0.46 0.41/0.49 0.33/0.43


Figure 9(A) and Table 7 show that the DenseNet with 3D images (3D DenseNet) had consistently better classification performance (based on the evaluation metrics) than the DenseNet with 2D images (2D DenseNet). Figure 10 intuitively shows the classification results for the 2D and 3D DenseNet. The classification performance of the 2D and 3D DenseNet with the original parenchyma images was better than that with the original chest HRCT images. Compared with the chest HRCT images after the fine selection, the classification performance of the 2D DenseNet with the chest HRCT images after the rough selection was lower for the test set, except for precision. However, the classification ability of the 3D DenseNet with the chest HRCT images after the rough selection was higher than that with the chest HRCT images after the fine selection. In particular, Figure 9(A) shows that the best AUC value (0.82) for the DenseNet was achieved by applying multiple-instance learning to the 3D chest HRCT images. Table 7 shows that the other evaluation metrics for the DenseNet with multiple-instance learning of the 3D chest HRCT images were 0.59 (accuracy), 0.62 (precision), 0.59 (recall) and 0.57 (F1-score). However, its precision was lower than that obtained with the 3D chest HRCT images after rough selection (0.65).

Figure 9(B) and Table 8 show that the best performance of GoogleNet was achieved with the 2D original parenchyma images. Figure 11 intuitively shows the classification results for the 2D and 3D GoogleNet. Specifically, the classification performance of the 2D GoogleNet with the original chest HRCT images/the original parenchyma images was better than that of the 3D GoogleNet. Furthermore, the classification performance of the 2D GoogleNet with the rough selection of the original chest HRCT images was also better than that of the 3D GoogleNet, except for precision. The classification performance of the 2D GoogleNet with the multiple-instance learning of the original chest HRCT images was also better than that of the 3D GoogleNet, except for the F1-score. However, the classification performance of the 2D GoogleNet with the fine selection of the original chest HRCT images was worse than that of the 3D GoogleNet. Furthermore, the classification performance of the 2D GoogleNet with the multiple-instance learning of the original parenchyma images was also worse than that of the 3D GoogleNet, except for precision. In particular, Figure 9(B) shows that the best AUC value (0.81) for the GoogleNet was achieved with the 2D original parenchyma images. Table 8 further shows that the other evaluation metrics for the 2D GoogleNet with the original parenchyma images were 0.55 (accuracy), 0.56 (precision), 0.55 (recall) and 0.55 (F1-score). However, its precision was lower than that obtained with the 2D original chest HRCT images (0.67).

    For all of the six kinds of input images in Tables 7 and 8, the results show that the classification performance of the 3D DenseNet was better than that of the 3D GoogleNet. However, the classification performance of the 2D GoogleNet with the original chest HRCT images/the original parenchyma images was also better than that of the 2D DenseNet.

    The classification performances of different ML classifiers were evaluated by using 1316 lung radiomics features and 19 selected lung radiomics features. In addition, the best-performance classifier (MLP classifier) was also determined, as described in this section.

Figures 12 and 13 show the ROC curves and the confusion matrices, respectively, for the different ML classifiers with the 1316 lung radiomics features and with the 19 selected lung radiomics features. Table 9 reports the classification performance of the ML classifiers with the 1316 lung radiomics features. Compared with the CNNs (DenseNet and GoogleNet), the ML classifiers with the 1316 lung radiomics features performed substantially better on COPD stage classification: the accuracy, precision, recall, F1-score and AUC all improved significantly. Overall, the classification performance of the ML classifiers was better than that of the DenseNet with multiple-instance learning of the 3D chest HRCT images (the best-performing DenseNet configuration), even for the worst-performing LDA classifier with the 1316 lung radiomics features (except for precision). Compared with the ML classifiers with the 1316 lung radiomics features, the ML classifiers with the 19 selected lung radiomics features were further improved.

    Figure 13.  Confusion matrix results for the different ML classifiers. (A) Confusion matrix results for the ML classifiers with 1316 lung radiomics features; (B) Confusion matrix results for the ML classifiers with 19 selected lung radiomics features.
    Table 9.  Evaluation metrics for the different ML classifiers with the 1316 lung radiomics features when applied to the test set.
    Classifier Accuracy Precision Recall F1-score AUC
    RF classifier 0.72 0.72 0.72 0.72 0.90
    Ada classifier 0.70 0.69 0.70 0.69 0.90
    GB classifier 0.72 0.73 0.72 0.72 0.92
    MLP classifier 0.78 0.78 0.78 0.78 0.92
    LDA classifier 0.60 0.60 0.60 0.59 0.85
    SVM classifier 0.62 0.62 0.63 0.61 0.87


    Table 9 shows that the accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 1316 lung radiomics features were 0.78, 0.78, 0.78, 0.78 and 0.92, respectively. Therefore, the MLP classifier is regarded as the best-performance classifier for the 1316 lung radiomics features. In addition, Table 10 shows that all of the evaluation metrics improved, and that the MLP classifier was also the best-performance classifier for the 19 selected lung radiomics features. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features were 0.80, 0.80, 0.80, 0.80 and 0.94, respectively.

    Table 10.  Evaluation metrics for the different ML classifiers with 19 selected lung radiomics features when applied to the test set.
    Classifier Accuracy Precision Recall F1-score AUC
    RF classifier 0.76 0.76 0.76 0.76 0.93
    Ada classifier 0.77 0.76 0.77 0.76 0.93
    GB classifier 0.73 0.74 0.73 0.73 0.92
    MLP classifier 0.80 0.80 0.80 0.80 0.94
    LDA classifier 0.65 0.65 0.65 0.65 0.89
    SVM classifier 0.71 0.71 0.71 0.71 0.92


    The 19 selected lung radiomics features with Radiomics-FIRST/Radiomics-ALL were used to further evaluate the MLP classifier's performance.
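A minimal sketch of this step, appending the combination feature as a 20th column before retraining the MLP; the split and MLP settings mirror the earlier sketches and are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Append Radiomics-ALL as a 20th column next to the 19 selected features.
X20 = np.column_stack([X_sel, radiomics_all])

# 70/30 split as before; swap in radiomics_first to test Radiomics-FIRST.
X_train, X_test, y_train, y_test = train_test_split(
    X20, y, test_size=0.30, stratify=y, random_state=0)
mlp = MLPClassifier(max_iter=1000).fit(X_train, y_train)
print('test accuracy:', round(mlp.score(X_test, y_test), 2))
```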

Figure 14(A) and Table 11 show the evaluation metrics for the MLP classifier with one lung radiomics combination feature (Radiomics-X). Figure 14(A) shows that the AUC of the MLP classifier with Radiomics-FIRST/Radiomics-ALL was 0.85/0.87, better than that of the other lung radiomics combination features. Figure 15(A) intuitively shows the classification results for the MLP classifier with each of the seven lung radiomics combination features. The MLP classifier with Radiomics-GLSZM or Radiomics-GLDM could not distinguish COPD Stage Ⅰ from the other COPD stages (no samples were predicted as Stage Ⅰ). Radiomics-FIRST and Radiomics-ALL, which characterized the COPD stage evolution, showed better classification performance than the other lung radiomics combination features, and Radiomics-ALL showed the best classification performance of all the lung radiomics combination features. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with Radiomics-ALL were 0.60, 0.58, 0.60, 0.59 and 0.87, respectively.

    Figure 14.  ROC curves for the MLP classifiers with different features. (A) ROC curves for the MLP classifier with one lung radiomics combination feature (Radiomics-X); (B) ROC curves for the MLP classifier with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).
    Table 11.  Test set evaluation metrics for the MLP classifier with one lung radiomics combination feature (Radiomics-X).
    Radiomics-X Accuracy Precision Recall F1-score AUC
    Radiomics-FIRST 0.56 0.56 0.56 0.56 0.85
    Radiomics-SHAPE 0.39 0.40 0.39 0.36 0.64
    Radiomics-GLCM 0.49 0.51 0.49 0.47 0.72
    Radiomics-GLSZM 0.36 0.27 0.36 0.31 0.62
    Radiomics-NGTDM 0.42 0.32 0.42 0.36 0.66
    Radiomics-GLDM 0.28 0.21 0.28 0.24 0.58
    Radiomics-ALL 0.60 0.58 0.60 0.59 0.87

    Figure 15.  Confusion matrix results for the MLP classifiers with different features. (A) Confusion matrix results for the MLP classifier with one lung radiomics combination feature (Radiomics-X); (B) Confusion matrix results for the MLP classifier with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).

Figure 14(B) and Table 12 show the evaluation metrics for the MLP classifier with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL. Figure 15(B) intuitively shows the classification results for the MLP classifier with the 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL. Compared with the MLP classifier with the 19 selected lung radiomics features, all of the evaluation metrics for the MLP classifier with the 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL improved. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-FIRST were 0.81, 0.82, 0.81, 0.81 and 0.94, respectively. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL were 0.83, 0.83, 0.83, 0.82 and 0.95, respectively.

    Table 12.  Test set evaluation metrics for the MLP classifiers with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).
    MLP Classifier: Input features Accuracy Precision Recall F1-score AUC
    19 radiomics features + Radiomics-FIRST 0.81 0.82 0.81 0.81 0.94
19 radiomics features + Radiomics-ALL 0.83 0.83 0.83 0.82 0.95


Three topics will be discussed: 1) the classification ability of the classic CNNs based on the images versus that of the ML classifiers based on the lung radiomics features; 2) the role of feature selection; and 3) the reason why the constructed lung radiomics combination features characterizing the COPD stage evolution can improve the classification performance.

    The classification ability of the classic CNN based on the images was worse than that of the ML classifiers based on the lung radiomics features. The following discussion focuses on the characteristics of COPD and the classic CNN.

First, COPD is diffusely distributed in the lung. Therefore, there may be no lesions on some slices of the chest HRCT images from a patient with COPD (Stage Ⅰ to Ⅳ). In addition, even if some participants were diagnosed without COPD (no airflow restriction), primary or mild lesions may already exist in their HRCT images, or some of their slices may show no lesions at all. Although we made many attempts to mitigate these problems, including fine selection, rough selection and multiple-instance learning [52], the classification ability of the classic CNNs based on the images remained disappointing.

Second, the 3D DenseNet achieved better classification results than the 2D DenseNet because, unlike the 2D DenseNet, it can extract interlayer information. Compared with the original chest HRCT images, the lung parenchyma images remove the non-lung regions containing redundant information. Therefore, when the lung parenchyma images are input, DenseNet (2D and 3D) can focus on extracting features of the lung region and achieves better classification results. Compared with the chest HRCT images after the fine selection (deleting the non-lung-region images), the classification ability of the 2D DenseNet with the chest HRCT images after the rough selection (deleting 1/6 of the images at the beginning and at the end) was lower, except for precision. The reason is that, although the rough selection removes the interference of redundant information (non-lung images), the 2/6 of the images that are deleted also contain effective information for COPD classification. However, compared with the chest HRCT images after the fine selection, the classification ability of the 3D DenseNet with the chest HRCT images after the rough selection was better, because the spacing of the 20 selected slices (Table 2) after the rough selection was smaller than that after the fine selection.

Third, similar to DenseNet, GoogleNet also cannot achieve an ideal classification effect with a small amount of training data. When the network dimension was transformed from 2D to 3D, the classification performance of DenseNet improved greatly, whereas that of GoogleNet improved only slightly or even decreased. The 2D GoogleNet was proposed for the classification of natural (RGB) images, and its structure is not specifically designed for the details of chest HRCT images. The Inception structure of the 2D GoogleNet, with its many 1 × 1 convolution kernels, strengthens the connections between the channels of RGB images, an advantage that cannot be exploited in chest HRCT images. The classification ability of the 2D GoogleNet with the chest HRCT images after rough/fine selection was worse than that of the 2D GoogleNet with the original chest HRCT images. The reason is that the removed slices (non-lung slices for fine selection, or slices with little lung tissue for rough selection) still carry effective information for COPD classification. The accuracy of the 2D GoogleNet with the chest HRCT images after the fine selection was higher than that resulting from rough selection, which supports this explanation. The accuracy of the 2D GoogleNet with the original chest HRCT images and the multiple-instance learning was the same, which also shows the role of multiple-instance learning in dealing with COPD classification, in line with its original intention. The AUC resulting from the 2D GoogleNet with the original parenchyma images was the maximum, showing that the ROI improves the AUC. In addition, the classification performance of the 2D GoogleNet with the multiple-instance learning of the original parenchyma images was better than that based on the original chest HRCT images, which further illustrates that the ROI improves the classification performance under the conditions of multiple-instance learning. Research on 3D GoogleNets is minimal [53,54,55,56]. As the number of 3D GoogleNet parameters increases, the problem of the limited amount of training data becomes more obvious. The Inception module structure is specially designed for 2D images: different convolution kernels extract features from the 2D images, and the resulting features are then concatenated, aligned and spliced along two dimensions of the tensor. Therefore, compared to the 2D GoogleNet, the 3D GoogleNet has higher-dimensional tensors, which weakens the effect of the receptive fields.

The classification ability of the ML classifiers based on the lung radiomics features is better than that of the classic CNNs based on the images. Compared with the classic CNNs, the ML classifiers with the lung radiomics features, which are calculated by preset formulas, are more interpretable for the COPD classification. The lung radiomics features were calculated based on information from all of the slices of the parenchyma images; therefore, the lung radiomics features are not affected by the location of lesions in the chest HRCT images.

Compared with the classification performance of the ML classifiers with the 1316 lung radiomics features, that of the ML classifiers with the 19 selected lung radiomics features was better. Lasso is often used with survival analysis models to determine variables and eliminate the collinearity problem between variables [43,44]. In this study, Lasso was applied to select the classification features, and the classification performance improved. Lasso selects the classification features by establishing the relationship between the independent and dependent variables (the lung radiomics features and the COPD stages). This operation selects the lung radiomics features related to the COPD stages, which reduces the complexity of the ML classifiers and helps to avoid overfitting. With this reduced complexity, the ML classifiers can focus on the 19 selected lung radiomics features, improving the classification performance. At the same time, it also endows the lung radiomics features used for classification with strong explanatory power. Radiomics18 was the dominant feature of the 19 selected lung radiomics features, with the largest absolute coefficient.

T-tests have been widely used to select significant variables in survival analysis models, generalized linear models and regression models [57]. This inspired us to construct features that characterize the COPD stage evolution in order to improve the classification performance. The two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed to characterize the COPD stage evolution (P-value < 0.05 between every pair of COPD stages), and these features improved the classification performance. The reason can be explained from a statistical perspective: a P-value < 0.05 between two groups generally indicates a significant difference between them, so features that differ significantly between the groups can better separate them and thus improve the classification performance.

There are some limitations to this study. First, regarding the materials, there were not enough cases at COPD Stage Ⅳ. Second, regarding the methods, many attempts were made to mitigate the problems with lesions in the HRCT images mentioned in Section 4.1, but the classification performance of the classic CNNs remained unsatisfactory. The MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL achieved good classification performance, but the fixed calculation equations limit the further development of the lung radiomics features. The CNNs based on the chest HRCT images are not subject to this restriction, so fully combining a CNN classifier with a limited number of 3D medical images is an urgent problem to be solved. Transfer learning [58] in CNNs has become the first choice for dealing with a limited number of 3D medical images. Similarly, data augmentation methods should be explored further. Inspired by lung radiomics features, which derive many features from each set of chest HRCT images, the 3D chest HRCT images of each subject can be resized into smaller 3D images; for example, 3D chest HRCT images of size 512 × 512 × N can be resized to other sizes, such as 256 × 256 × 300 and 64 × 64 × 50 (a minimal sketch is given below). Finally, the chest HRCT images used in this study were collected from 2009 to 2011, but they still constitute a rare and standard study cohort. We will also strive to collect an updated study cohort in the future.
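A minimal sketch of such a resizing step with SciPy; the target shapes, interpolation order and placeholder volume are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom

def resize_volume(volume, target_shape):
    """Resize a 3D chest HRCT volume (e.g. 512 x 512 x N) to a smaller fixed
    shape such as (256, 256, 300) or (64, 64, 50) by trilinear interpolation."""
    factors = [t / s for t, s in zip(target_shape, volume.shape)]
    return zoom(volume, factors, order=1)

# Placeholder volume standing in for one subject's 512 x 512 x N scan.
volume = np.zeros((512, 512, 280), dtype=np.int16)
small = resize_volume(volume, (64, 64, 50))
print(small.shape)   # (64, 64, 50)
```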

    The lung radiomics features were used to characterize and classify the COPD stage in this study. Compared with classic CNN classifiers based on the chest HRCT images, the ML method based on the use of lung radiomics features is more suitable and interpretable for COPD classification. Lasso was applied to select the lung radiomics features for enhancing the ML method's classification performance. The best-performance classifier, i.e., the MLP classifier, was determined. Two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed based on 19 selected lung radiomics features by using the proposed lung radiomics combination strategy for characterizing COPD stage evolution. Radiomics-FIRST/Radiomics-ALL was used further to improve the classification performance of the MLP classifier. As a result, the accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL were 0.83, 0.83, 0.83, 0.82 and 0.95, respectively.

    Thanks to the Department of Radiology of the First Affiliated Hospital of Guangzhou Medical University for providing the data set, and to the National Natural Science Foundation of China (62071311), Natural Science Foundation of Guangdong Province, China (2019A1515011382), Stable Support Plan for Colleges and Universities in Shenzhen, China (SZWD2021010), Scientific Research Fund of Liaoning Province, China (JL201919) and the Special Program for Key Fields of Colleges and Universities in Guangdong Province (biomedicine and health) of China (2021ZDZX2008) for the funding support.

    The authors declare no conflict of interest.



    © 2019 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).