
Perspectives on the use of transcriptomics to advance biofuels

  • As a field within the energy research sector, bioenergy is continuously expanding. Although much has been achieved and the yields of both ethanol and butanol have been improved, many avenues of research to further increase these yields remain. This review covers current research related to transcriptomics and the application of this high-throughput analytical tool to engineer both microbes and plants, with the ultimate goal being better biofuel production and yields. The initial focus is given to the responses of fermentative microbes during the fermentative production of acids, such as butyric acid, and solvents, including ethanol and butanol. As plants offer the greatest natural renewable source of fermentable sugars in the form of lignocellulose, the second focus area is the transcriptional responses of microbes when exposed to plant hydrolysates and lignin-related compounds. This is of particular importance as the acid/base hydrolysis methods commonly employed to make the plant-based cellulose available for enzymatic hydrolysis to sugars also generate significant amounts of lignin derivatives that are inhibitory to fermentative bacteria and other microbes. The article then transitions to transcriptional analyses of lignin-degrading organisms, such as Phanerochaete chrysosporium, as an alternative to acid/base hydrolysis. The final portion of this article discusses recent transcriptome analyses of plants and, in particular, the genes involved in lignin production. The rationale behind these studies is to eventually reduce the lignin content present within these plants and, consequently, the amount of inhibitors generated during the acid/base hydrolysis of the lignocelluloses. All four of these topics represent key areas where transcriptomic research is currently being conducted to identify microbial genes and their responses to products and inhibitors, as well as genes related to lignin degradation/formation.

    Citation: Siseon Lee, Robert J. Mitchell. Perspectives on the use of transcriptomics to advance biofuels[J]. AIMS Bioengineering, 2015, 2(4): 487-506. doi: 10.3934/bioeng.2015.4.487



    Chronic obstructive pulmonary disease (COPD) is a heterogeneous inflammatory disease [1] characterized by persistent airflow limitations [2]. Due to this characteristic, the gold standard for diagnosing and evaluating COPD is the pulmonary function test (PFT) [2], which yields the ratio of forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC) and the FEV1 percentage of predicted (FEV1 % predicted). COPD's primary anatomical and pathophysiological manifestations are small airway lesions and emphysema [1,3]. Although PFTs can explain the impact on the symptoms and quality of life of COPD patients [4,5], they cannot reflect the changes in the lung tissue of COPD patients as the COPD stage evolves. PFT changes only occur when lung tissue is destroyed to a certain extent. Therefore, it is also difficult for the PFT to identify the etiology of COPD.

    Compared with PFTs, computed tomography (CT) has been regarded as the most effective modality for characterizing and quantifying COPD [6]. For example, chest CT images can reveal mild centrilobular emphysema and decreased exercise tolerance in smokers whose PFT results show no airflow limitation [7]. In addition, chest CT images have been used to quantitatively analyze the bronchial [8,9], airway disease [10,11,12,13,14,15], emphysema [16] and vascular [17,18] problems in COPD patients by measuring the parameters of the bronchi and vasculature, or by using analysis methods for airway disease and emphysema. Furthermore, since radiomics was proposed in 2007 to mine more information from medical images by using advanced feature analysis [19], it has been widely used for the analysis of lung disease images [20,21,22,23,24] and images of other diseases [25,26]. Unlike in normal lungs, the lung texture and density of COPD patients are influenced by the increased air abundance [20], leading to changes in chest CT images. The radiomics features, which reflect lung texture and density changes, can also predict severe COPD exacerbations [27]. They have also been applied to the spirometric assessment of emphysema presence and COPD severity [28]. However, radiomics has not yet been extensively investigated in COPD. Radiomics features have potential applications in COPD, particularly for diagnosis, treatment and follow-up, and future directions for their use in COPD have been outlined [29]. An important factor limiting the development of radiomics in COPD is the diffuse distribution of the disease in the lungs, which makes it challenging to segment the COPD regions. In particular, because of the limited resolution of CT, small airways (diameter < 2 mm) and their associated vessels cannot be segmented from chest CT images. However, COPD results from the joint action of the lung parenchyma. Therefore, the lung radiomics features calculated from the lung parenchyma images are considered in this paper.

    Most scholars are committed to improving the classifiers to get better classification results [30,31]. However, they ignore the effect of the input features on classification. Therefore, it is necessary to construct the lung radiomics combination features that characterize the COPD stage to improve the classification performance of the existing classifier. COPD and its heart complications (such as a higher resting heart rate) have been studied by using the lung radiomics features [32,33,34,35,36,37], but the lung radiomics features have not been applied for COPD stage classification.

    Our contributions in this paper are briefly described as follows: 1) Lung radiomics features are applied to COPD stage classification. The best accuracy, precision, recall, F1-score and area under the curve (AUC) of the multi-layer perceptron (MLP) classifier with the 19 lung radiomics features selected by Lasso were 0.80, 0.80, 0.80, 0.80 and 0.94, respectively. 2) Two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed to characterize COPD stage evolution. Radiomics-FIRST or Radiomics-ALL improves the performance of the MLP classifier. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL improved by 3, 3, 3, 2 and 1%, respectively. 3) Compared to the classic convolutional neural network (CNN) application to chest high-resolution CT (HRCT) images, the machine learning (ML) methods based on lung radiomics features are more suitable and interpretable for the COPD classification.

    The subjects were Chinese people aged 40 to 79 who were enrolled in the National Clinical Research Center of Respiratory Diseases in China from May 25, 2009 to January 11, 2011. The enrolled subjects rigorously followed this study's inclusion and exclusion criteria [38]. The 468 subjects underwent chest HRCT scans at the full inspiration state and PFTs. The COPD stage (Stage 0 to Ⅳ) was diagnosed by using the PFT according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2008. Subjects at COPD Stage 0 do not have COPD according to GOLD, but they may have some symptoms of respiratory diseases. Please refer to our previous study [36] for a more detailed description of the materials.

    The ethics committee of the National Clinical Research Center of Respiratory Diseases at Guangzhou Medical University in China has approved this study. All 468 subjects submitted written informed consent to the First Affiliated Hospital of Guangzhou Medical University before the chest HRCT scans and PFTs were performed.

    Our proposed method classifies the COPD stage by applying lung radiomics features selected by Lasso and lung radiomics combination features (characterizing the COPD stage evolution) based on best-performance ML methods. Figure 2 shows the overall block diagram of this study. We further describe our methods in the following sections, i.e., Section 2.2.1 (ROI segmentation), Section 2.2.2 (Lung radiomics feature calculation) and Section 2.2.3 (COPD stage classification).

    Figure 1.  Subject selection flow diagram and COPD stage distribution of the subjects in this study. (A) Subject selection flow diagram, showing the enrollment, inclusion criteria and exclusion criteria; (B) COPD stage distribution of the subjects in this study.
    Figure 2.  Overall block diagram for the proposed method for this study. (A) Region-of-interest (ROI) segmentation; (B) Lung radiomics feature calculation; (C) COPD stage classification based on ML.

    Like previous radiomics feature analysis methods, our method segments the region of interest (ROI) in the chest HRCT images and calculates the lung radiomics features based on the ROI. The lung radiomics features are then used to characterize and classify the COPD stage. Because COPD is diffusely distributed in the lungs, it is challenging to segment the COPD regions. In addition, CT resolution limits the segmentation of small airways (diameter < 2 mm) and their associated vessels. However, COPD results from the joint action of the lung parenchyma. Therefore, lung parenchyma images segmented from chest HRCT images were used as the ROI to calculate the lung radiomics features in this paper.

    A trained ResU-Net [39] was used to automatically segment the lung region from the chest HRCT images. The lung region includes both the right and left lungs in this study. The architecture of the ResU-Net has been described in detail in our previous paper [40]. In addition, three experienced radiologists have checked and modified all of the lung region segmentation images to ensure the accuracy of the segmentation images. Please refer to our previous study [36] for a more detailed description of the ROI segmentation process.

    The 468 sets of original lung parenchyma images were extracted from the chest HRCT images using our previous method [41]. Figure 3 shows the typical parenchyma images with the Hounsfield unit (HU) value in the transverse plane. PyRadiomics [42] with the predefined class of radiomics features was implemented to calculate the lung radiomics features based on the original and derived lung parenchyma images. Finally, 1316 lung radiomics features per subject were obtained. Please also refer to our previous study [36] for a more detailed description of the lung radiomics feature calculation.

    Figure 3.  Typical parenchyma images with the HU value in the transverse plane. The top figures are the lung region segmentation images with red color, and the bottom figures are the corresponding original lung parenchyma images with the HU value.

    Lung radiomics features are the input features of the ML methods used to classify the COPD stage. Before the COPD stage classification, the least absolute shrinkage and selection operator (Lasso) [43,44] selects the lung radiomics features by establishing the relationship between the lung radiomics features and COPD stages. Then, the selected lung radiomics features are used to select the best classifier from the different preset ML methods shown in Figure 2(C). This paper also introduces a radiomics combination strategy to construct the lung radiomics combination features for characterizing COPD stage evolution. Finally, the lung radiomics combination feature characterizing the COPD stage evolution is used to improve the performance of the best classifier.

    First, 19 lung radiomics features per subject were selected by Lasso from 1316 lung radiomics features. COPD Stages Ⅲ and Ⅳ were considered as one COPD stage to balance each COPD stage. The mathematical expression (Eq (1)) of the Lasso model [37] is

    $ \arg\min \left\{ \sum\limits_{i = 1}^{n} \left( y_{i}^{*} - \beta_{0} - \sum\limits_{j = 1}^{p} \beta_{j} x_{ij}^{*} \right)^{2} + \lambda \sum\limits_{j = 0}^{p} \left| \beta_{j} \right| \right\} $ (1)

    where $ x_{ij}^{*} $ is the value of the independent variable (1316 lung radiomics features) after a normalization operation, $ y_{i}^{*} $ is the value of the dependent variable (COPD Stages 0, Ⅰ, Ⅱ and Ⅲ & Ⅳ), $ \lambda $ is the penalty parameter ($ \lambda \ge 0 $), $ \beta_{j} $ is the regression coefficient, $ i \in [1, n] $ and $ j \in [0, p] $.
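    To make the selection mechanism of Eq (1) concrete, the following is a minimal sketch of Lasso fitted by cyclic coordinate descent. It is an illustration and not the implementation used in this study: the function names are hypothetical, the intercept is left unpenalized (the usual convention, which differs slightly from the index range written in Eq (1)), and the toy data in the usage note are invented.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(x, y, lam, n_iter=200):
    """Minimise sum_i (y_i - b0 - sum_j b_j x_ij)^2 + lam * sum_j |b_j|.

    Assumption: the intercept b0 is not penalised.
    """
    n, p = len(x), len(x[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                fitted = intercept + sum(
                    beta[k] * x[i][k] for k in range(p) if k != j
                )
                rho += x[i][j] * (y[i] - fitted)
                norm += x[i][j] ** 2
            # Closed-form minimiser of the one-dimensional subproblem.
            beta[j] = soft_threshold(rho, lam / 2.0) / norm if norm > 0 else 0.0
        # Refit the intercept given the current coefficients.
        intercept = sum(
            y[i] - sum(beta[k] * x[i][k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta


def selected_indices(beta, tol=1e-12):
    """Indices of the features whose coefficient survived the penalty."""
    return [j for j, b in enumerate(beta) if abs(b) > tol]
```

    With an invented toy set in which only the first feature drives the response, `lasso_coordinate_descent(x, y, lam)` keeps that feature for a small penalty and discards every feature once the penalty is large enough, which is the behavior that reduces the 1316 features to a short list.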

    The lung radiomics features of the four COPD stages are normalized by Eq (2).

    $ x_{ij}^* = \left( {{x_{ij}} - \overline {{x_j}} } \right)/\left( {{x_{jmax}} - {x_{jmin}}} \right) $ (2)

    where i = 1~468 (468 subjects), j = 1~1316 (1316 kinds of lung radiomics features for each subject), $ x_{ij} $ is the element in the ith row and jth column of the 468 × 1316 matrix of lung radiomics features, and $ \overline{x_{j}} $, $ x_{jmax} $ and $ x_{jmin} $ are the mean, maximum and minimum of each kind of lung radiomics feature $ x_{j} $, respectively.
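    As an illustration of Eq (2), the sketch below normalizes a feature matrix column by column. The function name is hypothetical, and mapping a constant column to zero is an assumption made here to avoid division by zero; the source does not say how such columns were handled.

```python
def mean_range_normalize(features):
    """Apply Eq (2) per column: (x_ij - mean_j) / (max_j - min_j)."""
    n_rows = len(features)
    n_cols = len(features[0])
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in features]
        mean = sum(column) / n_rows
        spread = max(column) - min(column)
        for i in range(n_rows):
            # Assumption: a constant column (zero spread) is mapped to 0.0.
            normalized[i][j] = (column[i] - mean) / spread if spread else 0.0
    return normalized
```

    After this step each feature is centered on zero and spans a range of exactly one, so the Lasso coefficients of different features are comparable in magnitude.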

    Second, the best-performing classifier, i.e., the MLP classifier, was determined from among several ML classifiers: Random Forest (RF) [45], Adaboost (Ada) [46], Gradient boosting (GB) [47], Multi-layer perceptron (MLP) [48], Linear discriminant analysis (LDA) [49] and Support vector machine (SVM) [50], as shown in Figure 2(C). The 468 subjects with the selected lung radiomics features were divided into two subsets of 70 and 30%. The data for 70% of the subjects were used to train the ML classifiers. Then, the data for the remaining 30% of the subjects were used to validate or test the trained ML classifiers. The labels for the 468 subjects were the COPD stages (i.e., 0, 1, 2 and 3).
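    The 70/30 division can be sketched as follows. The source does not state whether the split was stratified by COPD stage or which random seed was used, so both the per-stage stratification and the `seed` parameter below are assumptions, and the function name is hypothetical.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices), splitting each stage separately."""
    rng = random.Random(seed)
    by_stage = {}
    for index, stage in enumerate(labels):
        by_stage.setdefault(stage, []).append(index)
    train, test = [], []
    for stage in sorted(by_stage):
        indices = by_stage[stage][:]
        rng.shuffle(indices)
        # Assumption: the per-stage training count is rounded to the nearest integer.
        cut = int(round(train_fraction * len(indices)))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)
```

    Splitting within each stage keeps the stage proportions of the training and test sets close to those of the full cohort, which matters when the stages are only roughly balanced.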

    The evaluation metrics for the classifiers were the accuracy, precision, recall, F1-score and AUC (a performance measurement for classification). The AUC was calculated from the receiver operating characteristic (ROC) curve. The ROC curve, accuracy, precision, recall and F1-score can be derived from the confusion matrix, which shows the distribution of the predicted and true labels (the COPD stages). The standard Python utility "classification_report" was used to calculate the accuracy, precision, recall and F1-score. The AUC is usually used as an evaluation metric for binary classification.
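    The relationship between the confusion matrix and the four scalar metrics can be sketched as below. The averaging scheme is not stated in the source, so unweighted (macro) averaging across the stages is assumed here, and the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows index the true stage, columns index the predicted stage."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[truth][prediction] += 1
    return matrix


def classification_metrics(matrix):
    """Accuracy plus macro-averaged precision, recall and F1-score."""
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[k][k] for k in range(n_classes)) / total
    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        true_positive = matrix[k][k]
        predicted_k = sum(matrix[r][k] for r in range(n_classes))
        actual_k = sum(matrix[k])
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }
```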

    Figure 4 shows the confusion matrix and a schematic diagram of the ROC curve drawing for multi-classification. Figure 4(A) shows the confusion matrix of the binary classification. The true positive (TP) and false positive (FP) respectively represent the positive and negative samples predicted to be positive by the classifier. The false negative (FN) and true negative (TN) respectively represent the positive and negative samples predicted to be negative by the classifier. Similarly, Figure 4(B) shows the confusion matrix of the multi-classification, in which T00–T33 on the diagonal represent the correct classification results and F represents the wrong classification results. Figure 4(C) shows a schematic diagram of the ROC curve drawing for multi-classification. The COPD stages of the test set (the true and predicted labels) are encoded with 0's and 1's, where the position of the 1 indicates the class. For example, the COPD stages 0, Ⅰ(1), Ⅱ(2) and Ⅲ & Ⅳ(3) are encoded as 1000, 0100, 0010 and 0001, respectively. If the predicted label (classification result) is correct, the value in the probability matrix P generated by the classifier at the position corresponding to 1 is greater than the values at the positions corresponding to 0. The coded COPD stages and their probability matrix P were used to draw the ROC curve according to the binary classification method.
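    The encoding described above can be turned into a single AUC value as sketched below. This assumes that "according to the binary classification method" means flattening the one-hot labels and the probability matrix P into one long binary problem (a micro-average); the source does not name the averaging scheme, and the function names are hypothetical. The AUC is computed through its rank interpretation rather than by integrating a plotted curve.

```python
def one_hot(stage, n_classes=4):
    """Encode stage k as a 0/1 vector with a single 1 at position k."""
    return [1 if k == stage else 0 for k in range(n_classes)]


def binary_auc(labels, scores):
    """Probability that a positive scores above a negative (ties count half)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def micro_average_auc(true_stages, probability_matrix, n_classes=4):
    """Flatten one-hot labels and P row by row, then score them as binary."""
    flat_labels, flat_scores = [], []
    for stage, probabilities in zip(true_stages, probability_matrix):
        flat_labels.extend(one_hot(stage, n_classes))
        flat_scores.extend(probabilities)
    return binary_auc(flat_labels, flat_scores)
```

    For example, `one_hot(2)` gives `[0, 0, 1, 0]`, matching the code 0010 for Stage Ⅱ, and a classifier that always assigns the highest probability to the true stage reaches an AUC of 1.0 under this scheme.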

    Figure 4.  Confusion matrix and schematic diagram of the ROC curve drawing for multi-classification. (A) Confusion matrix of the binary classification; (B) Confusion matrix of the multi-classification; (C) Schematic diagram of the ROC curve drawing for multi-classification.

    Third, a radiomics combination strategy was proposed to construct the lung radiomics combination features used to characterize the COPD stage evolution. The lung radiomics combination features can be constructed using the class of the selected radiomics features. Eq (3) is the mathematical form of the lung radiomics combination strategy:

    $ { Radiomics }-X = \sum\limits_{i = 1}^{N} \beta_{i} x_{i} = \beta_{1} x_{1}+\beta_{2} x_{2}+\cdots+\beta_{N} x_{N} $ (3)

    where N is the number of the selected lung radiomics features in each class, and $ \beta_{i} $ is the coefficient of the selected lung radiomics feature $ x_{i} $ generated by Lasso.

    The lung radiomics combination features are named Radiomics-X. The symbol "X" in Radiomics-X is the class name of the selected lung radiomics features, such as FIRST, SHAPE, GLCM, GLRLM, GLSZM, NGTDM and GLDM in Figure 2(B). In particular, Radiomics-ALL is constructed by using all selected lung radiomics features and their coefficients generated by Lasso. Finally, Radiomics-FIRST and Radiomics-ALL were selected from the lung radiomics combination features (P-value < 0.05 between any two COPD stages) to characterize the COPD stage.
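    Eq (3) amounts to a coefficient-weighted sum over the selected features of one class. The sketch below shows this for a few of the features listed in Table 4, using their reported Lasso coefficients; the feature values are invented placeholders, the function and variable names are hypothetical, and whether normalized or raw feature values enter the sum is an assumption not settled by the text.

```python
# Lasso coefficients and classes for four of the selected features (Table 4).
COEFFICIENTS = {
    "Radiomics1": 0.0056,
    "Radiomics2": -0.0789,
    "Radiomics3": 0.0624,
    "Radiomics4": 0.0665,
}
FEATURE_CLASSES = {
    "Radiomics1": "SHAPE",
    "Radiomics2": "SHAPE",
    "Radiomics3": "SHAPE",
    "Radiomics4": "FIRST",
}


def radiomics_combination(feature_values, target_class=None):
    """Weighted sum of Eq (3); target_class=None mimics Radiomics-ALL."""
    total = 0.0
    for name, value in feature_values.items():
        if target_class is None or FEATURE_CLASSES[name] == target_class:
            total += COEFFICIENTS[name] * value
    return total


# Hypothetical feature values for one subject (placeholders, not study data).
subject = {"Radiomics1": 1.0, "Radiomics2": 1.0, "Radiomics3": 1.0, "Radiomics4": 2.0}
radiomics_shape = radiomics_combination(subject, "SHAPE")
radiomics_all = radiomics_combination(subject)
```

    Each combination feature collapses a whole class into one number per subject, which is what allows a single box plot per class to be compared across COPD stages.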

    Lastly, the 19 selected lung radiomics features with Radiomics-FIRST/Radiomics-ALL were used to train and validate the MLP classifier to improve the performance of the MLP classifier.

    Figure 5 presents the experimental design of this study to show the lung radiomics features' classification ability, highlight Lasso's role in classification and prove the proposed lung radiomics combination strategy's effectiveness in improving the performance of the MLP classifier.

    Figure 5.  Experimental design of this study.

    First, to compare the classification abilities of CNNs applied to chest HRCT images and ML methods based on the lung radiomics features, we adopted two classic CNNs, DenseNet and GoogleNet, which achieved the best classification performance in our previous study [51]. The input images of DenseNet (2D/3D) and GoogleNet (2D/3D) were the original chest HRCT images or the original parenchyma images. To obtain good classification performance with the chest HRCT images, we also applied the following processes to the original chest HRCT images before feeding them into the two classic CNNs: 1) deleting the non-lung-region images (Fine selection); 2) deleting 1/6 of the slices at the beginning and 1/6 at the end of each scan (Rough selection) and 3) applying multiple-instance learning [52]. After Rough selection, only the middle 4/6 of the slices of the original chest HRCT images are used for COPD classification. In addition, multiple-instance learning was also applied to the original parenchyma images before they were fed into the two classic CNNs. Table 1 shows the chest HRCT image data set division for the two classic CNNs. Tables 2 and 3 show the detailed parameters that were set for DenseNet and GoogleNet training. For the 2D CNN, the classification results were decided by the mean probability of all slices of the chest HRCT images. However, for the 3D CNN, the classification results were determined by the probability of the selected slices from the chest HRCT images. The numbers of selected slices were 20 and 16, as shown in Tables 2 and 3.
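    The slice handling described above can be sketched as follows. The function names are hypothetical, and two details are assumptions: fractional sixths are rounded down in Rough selection, and the slice taken from each equal segment is the one nearest the segment center.

```python
def rough_selection(slices):
    """Drop the first and last sixth of the slices, keeping the middle 4/6."""
    n = len(slices)
    trim = n // 6  # assumption: fractional sixths are rounded down
    return slices[trim:n - trim]


def equidistant_slices(slices, n_segments):
    """Split a scan into n_segments equal segments and take one slice from each.

    Assumes len(slices) >= n_segments; the slice nearest each segment
    center is used.
    """
    n = len(slices)
    return [slices[int((k + 0.5) * n / n_segments)] for k in range(n_segments)]


def mean_probability_class(slice_probabilities):
    """2D CNN decision: average per-slice class probabilities, then take argmax."""
    n_slices = len(slice_probabilities)
    n_classes = len(slice_probabilities[0])
    means = [
        sum(p[c] for p in slice_probabilities) / n_slices for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: means[c])
    return predicted, means
```

    Averaging before taking the argmax lets many weakly informative slices outvote a few confident but misleading ones, which is one plausible reason to prefer it over a per-slice majority vote.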

    Table 1.  Chest HRCT image data set division for the two classic CNNs.
    Data set (6:1:3) Stage 0 Stage Ⅰ Stage Ⅱ Stage Ⅲ & Ⅳ Total
    Training set 76 subjects 65 subjects 75 subjects 64 subjects 280 subjects
    43,694 images 40,550 images 43,510 images 40,944 images 168,698 images
    Validation set 13 subjects 11 subjects 12 subjects 11 subjects 47 subjects
    7705 images 6981 images 7672 images 6924 images 29,282 images
    Test set 41 subjects 33 subjects 35 subjects 32 subjects 141 subjects
    23,940 images 21,141 images 22,283 images 20,478 images 87,842 images

    Table 2.  Parameters set to train the DenseNet.
    DenseNet: Input images Batch size (2D/3D) Input size (2D/3D) Epoch (2D/3D) Drop rate (2D/3D)
    Original chest HRCT images 20/2 512 × 512/512 × 512 × 20* 50/50 0.5/0.2
    Fine selection (HRCT images)
    Rough selection (HRCT images)
    Rough selection (HRCT images)
    Rough selection (HRCT images)
    Multiple instance (HRCT images) 16/2 512 × 512**/512 × 512 × 16*** 50/50 0.5/0.2
    Multiple instance (parenchyma)
    Note:* Each case (a set of chest HRCT images) was equally divided into 20 segments, with one slice taken equidistantly to obtain 20 slices in each case.
    ** After rough selection, each case was equally divided into 10 bags, with one slice taken randomly to obtain 10 slices in each case.
    *** After rough selection, each case was equally divided into 16 bags, with one slice taken equidistantly to obtain 16 slices in each case.

    Table 3.  Parameters set to train the GoogleNet.
    GoogleNet: Input images Batch size (2D/3D) Input size (2D/3D) Epoch (2D/3D) Drop rate (2D/3D)
    Original chest HRCT images 16/2 512 × 512/512 × 512 × 20* 50/50 0.2/0.2
    Fine selection (HRCT images)
    Rough selection (HRCT images)
    Rough selection (HRCT images)
    Rough selection (HRCT images)
    Multiple instance (HRCT images) 16/2 512 × 512**/512 × 512 × 16*** 50/50 0.2/0.2
    Multiple instance (parenchyma)
    Note:* Each case (a set of chest HRCT images) was equally divided into 20 segments, with one slice taken equidistantly to obtain 20 slices in each case.
    ** After rough selection, each case was equally divided into 10 bags, with one slice taken randomly to obtain 10 slices in each case.
    *** After rough selection, each case was equally divided into 16 bags, with one slice taken equidistantly to obtain 16 slices in each case.


    Second, the full set of lung radiomics features and the selected lung radiomics features were each used to train and test the different ML classifiers, as shown in Figure 5, to highlight Lasso's role in classification. The selected lung radiomics features were determined by Lasso from the lung radiomics features directly calculated by PyRadiomics. Then, the ML classifier with the best classification performance was determined.

    Finally, the lung radiomics combination features that characterized the COPD stage were used to improve the performance of the best classifier.

    This section shows the results of Lasso, the radiomics combination strategy and the experiments.

    Table 4 presents the lung radiomics features selected by Lasso in detail, including the name, class and regression coefficient. To conveniently describe the selected lung radiomics features, we define the selected lung radiomics features as Radiomics1–19. Figure 6 further shows more detailed information on Radiomics1–19. Specifically, Figure 6(A) shows that Radiomics18 was the dominant feature in Radiomics1–19. Figure 6(B) shows that the FIRST class had seven selected lung radiomics features, i.e., the maximum number in all classes. Figure 6(C) also shows that the FIRST class is the most important of all classes.

    Table 4.  19 lung radiomics features selected by Lasso.
    Definition Name of the 19 selected lung radiomics features Class Coefficient
    Radiomics1 original_shape_Elongation Shape 0.0056
    Radiomics2 original_shape_Maximum2DDiameterSlice Shape -0.0789
    Radiomics3 original_shape_Sphericity Shape 0.0624
    Radiomics4 log.sigma.1.0.mm.3D_firstorder_Maximum First Order 0.0665
    Radiomics5 log.sigma.1.0.mm.3D_glcm_ClusterProminence GLCM 1 -0.0425
    Radiomics6 log.sigma.1.0.mm.3D_glszm_ZoneEntropy GLSZM 2 0.0394
    Radiomics7 log.sigma.2.0.mm.3D_firstorder_Maximum First Order 0.0129
    Radiomics8 log.sigma.2.0.mm.3D_ngtdm_Contrast NGTDM 3 -0.0318
    Radiomics9 log.sigma.2.0.mm.3D_gldm_DependenceVariance GLDM 4 -0.0136
    Radiomics10 log.sigma.4.0.mm.3D_firstorder_10Percentile First Order -0.0760
    Radiomics11 log.sigma.5.0.mm.3D_firstorder_10Percentile First Order -0.1669
    Radiomics12 wavelet.LLH_firstorder_RootMeanSquared First Order -0.0252
    Radiomics13 wavelet.HLH_firstorder_Mean First Order 0.0599
    Radiomics14 wavelet.HLH_glcm_Idmn GLCM 1 -0.0022
    Radiomics15 wavelet.HLH_ngtdm_Busyness NGTDM 3 0.0444
    Radiomics16 wavelet.HHL_gldm_SmallDependenceLowGrayLevelEmphasis GLDM 4 -0.0168
    Radiomics17 wavelet.HHH_glszm_GrayLevelNonUniformityNormalized GLSZM 2 -0.0043
    Radiomics18 wavelet.LLL_firstorder_10Percentile First Order -0.5314
    Radiomics19 wavelet.LLL_glcm_Imc2 GLCM 1 0.1383
    Note: 1 Gray level co-occurrence matrix.
    2 Gray level size zone matrix.
    3 Neighboring gray tone difference matrix.
    4 Gray level dependence matrix.

    Figure 6.  Detailed information on Radiomics1–19. (A) Comparison of regression coefficients; (B) Feature numbers for each class; (C) Feature importance for each class.

    The P-values and significant differences for Radiomics1–19 according to COPD stage evolution were further investigated. A Bonferroni-Dunn multiple comparisons test was applied to calculate the P-values for Radiomics1–19 between COPD stages. Figure 7(N) and Table 5 show no significant differences for Radiomics14, regardless of COPD stage. Figure 7(A)–(C), (E), (H), (I), (L), (M), (O), (R) and (S) and Table 5 show that only Radiomics1–3, 9, 13, 15 and 19 significantly increased, and that Radiomics5, 8, 12 and 18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stage Ⅰ. Figure 7(A), (C), (D), (F), (H), (J)–(L), (O), (P), (R) and (S) and Table 5 show that only Radiomics1, 3, 4, 6, 13, 15 and 19 significantly increased, and that Radiomics8, 10–12, 16 and 18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stage Ⅱ. Figure 7(A), (C)–(E), (F)–(H), (J)–(M) and (O)–(S) and Table 5 show that only Radiomics1, 3, 4, 6, 7, 13, 15 and 19 significantly increased, and that Radiomics5, 8, 10–12 and 16–18 significantly decreased, with COPD stage evolution from COPD Stage 0 to COPD Stages Ⅲ & Ⅳ. Figure 7(A)–(D), (F)–(K), (M), (R) and (S) and Table 5 show that only Radiomics1, 3, 4, 6, 7, 13 and 19 significantly increased, and that Radiomics2, 8–11 and 18 significantly decreased, with COPD stage evolution from COPD Stage Ⅰ to COPD Stages Ⅲ & Ⅳ. Figure 7(A)–(C), (J)–(L), (O), (R) and (S) and Table 5 show that only Radiomics1–3, 15 and 19 significantly increased, and that Radiomics10–12 and 18 significantly decreased, with COPD stage evolution from COPD Stage Ⅱ to COPD Stages Ⅲ & Ⅳ. Unfortunately, each of the 19 selected lung radiomics features showed no significant difference for at least one pair of COPD stages.

    Figure 7.  Box plots showing the 19 selected lung radiomics features at different COPD stages. (A)–(S) show the box plots for Radiomics1–19 at different COPD stages, respectively.
    Table 5.  P-values for the 19 selected lung radiomics features for the different COPD stages.
    | Features | 0 vs. Ⅰ | 0 vs. Ⅱ | 0 vs. Ⅲ & Ⅳ | Ⅰ vs. Ⅱ | Ⅰ vs. Ⅲ & Ⅳ | Ⅱ vs. Ⅲ & Ⅳ |
    |---|---|---|---|---|---|---|
    | Radiomics1 | < 0.0001 | < 0.0001 | < 0.0001 | 0.9999 (ns¹) | < 0.0001 | < 0.0001 |
    | Radiomics2 | 0.0039 | 0.4406 (ns) | > 0.9999 (ns) | 0.5975 (ns) | < 0.0001 | 0.0164 |
    | Radiomics3 | 0.0004 | < 0.0001 | < 0.0001 | > 0.9999 (ns) | < 0.0001 | 0.0026 |
    | Radiomics4 | > 0.9999 (ns) | 0.0244 | 0.0016 | 0.0066 | 0.0004 | > 0.9999 (ns) |
    | Radiomics5 | 0.0009 | 0.4707 (ns) | 0.0243 | 0.2483 (ns) | > 0.9999 (ns) | > 0.9999 (ns) |
    | Radiomics6 | 0.8609 (ns) | < 0.0001 | < 0.0001 | 0.0016 | < 0.0001 | 0.7552 (ns) |
    | Radiomics7 | > 0.9999 (ns) | 0.1892 (ns) | 0.0005 | 0.2978 (ns) | 0.0013 | 0.3961 (ns) |
    | Radiomics8 | < 0.0001 | < 0.0001 | < 0.0001 | > 0.9999 (ns) | 0.0229 | 0.5546 (ns) |
    | Radiomics9 | 0.0021 | > 0.9999 (ns) | 0.6507 (ns) | 0.0026 | < 0.0001 | 0.6705 (ns) |
    | Radiomics10 | > 0.9999 (ns) | 0.0001 | < 0.0001 | < 0.0001 | < 0.0001 | 0.0045 |
    | Radiomics11 | 0.0626 (ns) | < 0.0001 | < 0.0001 | 0.0001 | < 0.0001 | 0.0055 |
    | Radiomics12 | < 0.0001 | 0.0006 | < 0.0001 | 0.4505 (ns) | 0.2677 (ns) | 0.0006 |
    | Radiomics13 | < 0.0001 | < 0.0001 | < 0.0001 | 0.1717 (ns) | < 0.0001 | 0.0800 (ns) |
    | Radiomics14 | > 0.9999 (ns) | > 0.9999 (ns) | > 0.9999 (ns) | > 0.9999 (ns) | 0.1873 (ns) | 0.6492 (ns) |
    | Radiomics15 | < 0.0001 | 0.0011 | < 0.0001 | > 0.9999 (ns) | 0.0878 (ns) | 0.0019 |
    | Radiomics16 | 0.7928 (ns) | 0.0077 | 0.0005 | 0.6650 (ns) | 0.1161 (ns) | 0.8141 (ns) |
    | Radiomics17 | > 0.9999 (ns) | 0.1001 (ns) | 0.0011 | > 0.9999 (ns) | 0.1153 (ns) | 0.9721 (ns) |
    | Radiomics18 | < 0.0001 | < 0.0001 | < 0.0001 | 0.1691 (ns) | < 0.0001 | < 0.0001 |
    | Radiomics19 | < 0.0001 | < 0.0001 | < 0.0001 | > 0.9999 (ns) | < 0.0001 | < 0.0001 |
    Note: ¹ ns: not significant.

    Table 6.  P-values for the seven lung radiomics combination features according to COPD stages.
    | Features | 0 vs. Ⅰ | 0 vs. Ⅱ | 0 vs. Ⅲ & Ⅳ | Ⅰ vs. Ⅱ | Ⅰ vs. Ⅲ & Ⅳ | Ⅱ vs. Ⅲ & Ⅳ |
    |---|---|---|---|---|---|---|
    | Radiomics-SHAPE | > 0.999 (ns¹) | 0.4005 (ns) | < 0.0001 | 0.2587 (ns) | < 0.0001 | 0.0003 |
    | Radiomics-FIRST | < 0.0001 | < 0.0001 | < 0.0001 | 0.0003 | < 0.0001 | < 0.0001 |
    | Radiomics-GLCM | < 0.0001 | < 0.0001 | < 0.0001 | > 0.999 (ns) | < 0.0001 | < 0.0001 |
    | Radiomics-GLSZM | 0.9780 (ns) | < 0.0001 | < 0.0001 | 0.0010 | < 0.0001 | 0.7294 (ns) |
    | Radiomics-NGTDM | < 0.0001 | < 0.0001 | < 0.0001 | > 0.999 (ns) | < 0.0051 | 0.0211 |
    | Radiomics-GLDM | > 0.999 (ns) | 0.0111 | < 0.0001 | 0.0057 | < 0.0001 | 0.3038 (ns) |
    | Radiomics-ALL | < 0.0001 | < 0.0001 | < 0.0001 | 0.0006 | < 0.0001 | < 0.0001 |
    Note: ¹ ns: not significant.


    The P-values and significant differences among the COPD stages for the seven lung radiomics combination features are shown in Figure 8 and Table 6. The Bonferroni-Dunn multiple comparisons test was also applied to calculate the P-values for the seven lung radiomics combination features according to COPD stage.

    Figure 8.  Box plots showing the seven lung radiomics combination features at different COPD stages. (A)–(G) show the box plots for the seven lung radiomics combination features at COPD Stages 0, Ⅰ, Ⅱ and Ⅲ & Ⅳ, respectively.

    Figure 8(B), (C), (E) and (G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLCM, Radiomics-NGTDM and Radiomics-ALL significantly increased from COPD Stage 0 to Ⅰ. Figure 8(B)–(G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLCM, Radiomics-GLSZM, Radiomics-NGTDM, Radiomics-GLDM and Radiomics-ALL significantly increased from COPD Stage 0 to Ⅱ. Figure 8(A)–(G) and Table 6 show that all seven of the lung radiomics combination features significantly increased from COPD Stage 0 to Stages Ⅲ & Ⅳ, and from COPD Stage Ⅰ to Stages Ⅲ & Ⅳ. Figure 8(B), (D), (F) and (G) and Table 6 show that only Radiomics-FIRST, Radiomics-GLSZM, Radiomics-GLDM and Radiomics-ALL significantly increased from COPD Stage Ⅰ to Ⅱ. Figure 8(A)–(C), (E) and (G) and Table 6 show that only Radiomics-SHAPE, Radiomics-FIRST, Radiomics-GLCM, Radiomics-NGTDM and Radiomics-ALL significantly increased from COPD Stage Ⅱ to Stages Ⅲ & Ⅳ. Therefore, only Radiomics-FIRST and Radiomics-ALL significantly increased between every pair of COPD stages (P-value < 0.05).
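    The screening rule used in this paragraph (keep a combination feature only if it differs significantly between every pair of COPD stages) can be checked directly against the tabulated P-values for the seven combination features. The sketch below transcribes those values as upper bounds; the dictionary name `TABLE6_P` and the function name are hypothetical, and storing "< 0.0001" as 0.0001 and "> 0.999 (ns)" as 1.0 is an encoding assumption made only for this illustration.

```python
# P-value upper bounds transcribed from the table of the seven combination features.
# "< 0.0001" is stored as 0.0001, "< 0.0051" as 0.0051 and "> 0.999 (ns)" as 1.0.
# Column order: 0 vs I, 0 vs II, 0 vs III & IV, I vs II, I vs III & IV, II vs III & IV.
TABLE6_P = {
    "Radiomics-SHAPE": [1.0, 0.4005, 0.0001, 0.2587, 0.0001, 0.0003],
    "Radiomics-FIRST": [0.0001, 0.0001, 0.0001, 0.0003, 0.0001, 0.0001],
    "Radiomics-GLCM": [0.0001, 0.0001, 0.0001, 1.0, 0.0001, 0.0001],
    "Radiomics-GLSZM": [0.9780, 0.0001, 0.0001, 0.0010, 0.0001, 0.7294],
    "Radiomics-NGTDM": [0.0001, 0.0001, 0.0001, 1.0, 0.0051, 0.0211],
    "Radiomics-GLDM": [1.0, 0.0111, 0.0001, 0.0057, 0.0001, 0.3038],
    "Radiomics-ALL": [0.0001, 0.0001, 0.0001, 0.0006, 0.0001, 0.0001],
}


def significant_for_all_pairs(p_table, alpha=0.05):
    """Return the features whose P-value is below alpha for every stage pair."""
    return [name for name, ps in p_table.items() if all(p < alpha for p in ps)]


selected = significant_for_all_pairs(TABLE6_P)
print(selected)  # ['Radiomics-FIRST', 'Radiomics-ALL']
```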

    This section shows the classification results for the CNN classifier, ML classifier and our proposed method.

    Figures 9–11 show the classification results for the DenseNet and GoogleNet. The other evaluation metrics in Tables 7 and 8 were calculated from the confusion matrices in Figures 10 and 11, respectively, which visually show the classification results for each COPD stage.
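    As an illustration of how accuracy, precision, recall and F1-score can be derived from a multi-class confusion matrix, a minimal sketch follows. The averaging scheme is an assumption: the source does not state it, but recall matches accuracy in almost every row of the reported tables, which is what support-weighted averaging produces. The function name and the 4 × 4 matrix are hypothetical and are not taken from the figures.

```python
def weighted_metrics(confusion):
    """Accuracy and support-weighted precision, recall and F1-score.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    precision = recall = f1 = 0.0
    for i in range(n_classes):
        support = sum(confusion[i])  # number of true samples of class i
        predicted = sum(confusion[r][i] for r in range(n_classes))  # predicted as class i
        tp = confusion[i][i]
        p_i = tp / predicted if predicted else 0.0
        r_i = tp / support if support else 0.0
        f_i = 2 * p_i * r_i / (p_i + r_i) if (p_i + r_i) else 0.0
        weight = support / total
        precision += weight * p_i
        recall += weight * r_i
        f1 += weight * f_i
    return {"accuracy": correct / total, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical confusion matrix for four stage groups (rows: true, columns: predicted).
example = [
    [8, 2, 0, 0],
    [1, 7, 2, 0],
    [0, 3, 6, 1],
    [0, 0, 2, 8],
]
metrics = weighted_metrics(example)
print({name: round(value, 3) for name, value in metrics.items()})
```

    With support-weighted averaging, recall is algebraically identical to accuracy, so the two columns coincide apart from rounding.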

    Figure 9.  ROC curves derived from the CNNs. (A) ROC curves from DenseNet; (B) ROC curves from GoogleNet.
    Figure 10.  Confusion matrix results for the DenseNet. (A) Confusion matrix results for the DenseNet with 2D input images; (B) Confusion matrix results for the DenseNet with 3D input images.
    Figure 11.  Confusion matrix results for the GoogleNet. (A) Confusion matrix results for the GoogleNet with 2D input images; (B) Confusion matrix results for the GoogleNet with 3D input images.
    Figure 12.  ROC curves for the different ML classifiers. (A) ROC curves for the different ML classifiers with 1316 lung radiomics features; (B) ROC curves for the different ML classifiers with 19 lung radiomics features selected by Lasso.
    Table 7.  Other evaluation metrics for applying the DenseNet to the test set.
    | DenseNet: Input images | Accuracy (2D/3D) | Precision (2D/3D) | Recall (2D/3D) | F1-score (2D/3D) |
    |---|---|---|---|---|
    | Original chest HRCT images | 0.39/0.54 | 0.22/0.56 | 0.39/0.54 | 0.28/0.51 |
    | Fine selection (HRCT images) | 0.41/0.54 | 0.38/0.58 | 0.41/0.54 | 0.33/0.52 |
    | Rough selection (HRCT images) | 0.34/0.57 | 0.45/0.65 | 0.34/0.57 | 0.24/0.54 |
    | Multiple instance (HRCT images) | 0.40/0.59 | 0.32/0.62 | 0.40/0.59 | 0.32/0.57 |
    | Original parenchyma images | 0.47/0.58 | 0.47/0.61 | 0.47/0.58 | 0.43/0.58 |
    | Multiple instance (parenchyma) | 0.49/0.50 | 0.48/0.59 | 0.49/0.50 | 0.44/0.44 |

    Table 8.  Other evaluation metrics for applying the GoogleNet to the test set.
    | GoogleNet: Input images | Accuracy (2D/3D) | Precision (2D/3D) | Recall (2D/3D) | F1-score (2D/3D) |
    |---|---|---|---|---|
    | Original chest HRCT images | 0.55/0.40 | 0.67/0.49 | 0.55/0.40 | 0.50/0.37 |
    | Fine selection (HRCT images) | 0.39/0.48 | 0.40/0.56 | 0.39/0.48 | 0.37/0.44 |
    | Rough selection (HRCT images) | 0.37/0.36 | 0.31/0.37 | 0.37/0.36 | 0.33/0.32 |
    | Multiple instance (HRCT images) | 0.39/0.38 | 0.37/0.36 | 0.39/0.38 | 0.28/0.33 |
    | Original parenchyma images | 0.55/0.39 | 0.56/0.47 | 0.55/0.39 | 0.55/0.34 |
    | Multiple instance (parenchyma) | 0.41/0.49 | 0.54/0.46 | 0.41/0.49 | 0.33/0.43 |


    Figure 9(A) and Table 7 show that the DenseNet with 3D images (3D DenseNet) generally had better classification performance (based on the evaluation metrics) than the DenseNet with 2D images (2D DenseNet). Figure 10 shows the classification results for the 2D and 3D DenseNet as confusion matrices. The classification performance of the 2D and 3D DenseNet with the original parenchyma images was better than that with the original chest HRCT images. Compared with the chest HRCT images after the fine selection, the classification performance of the 2D DenseNet with the chest HRCT images after the rough selection was lower for the test set, except for precision. However, the classification ability of the 3D DenseNet with the chest HRCT images after the rough selection was higher than that with the chest HRCT images after the fine selection. In particular, Figure 9(A) shows that the best AUC value (0.82) for the DenseNet was achieved by applying multiple-instance learning to the 3D chest HRCT images. Table 7 shows that the other evaluation metrics for this configuration were 0.59 (accuracy), 0.62 (precision), 0.59 (recall) and 0.57 (F1-score). However, its precision was lower than that obtained with the rough selection of the 3D chest HRCT images (0.65).

    Figure 9(B) and Table 8 show that the best performance of GoogleNet was based on the 2D original parenchyma images. Figure 11 shows the classification results for the 2D and 3D GoogleNet as confusion matrices. Specifically, the classification performance of the 2D GoogleNet with the original chest HRCT images/the original parenchyma images was better than that of the 3D GoogleNet. Furthermore, the classification performance of the 2D GoogleNet with the rough selection of the original chest HRCT images was also better than that of the 3D GoogleNet, except for precision. The classification performance of the 2D GoogleNet with the multiple-instance learning of the original chest HRCT images was also better than that of the 3D GoogleNet, except for the F1-score. However, the classification performance of the 2D GoogleNet with the fine selection of the original chest HRCT images was worse than that of the 3D GoogleNet. Furthermore, the classification performance of the 2D GoogleNet with the multiple-instance learning of the original parenchyma images was also worse than that of the 3D GoogleNet, except for precision. In particular, Figure 9(B) shows that the best AUC value (0.81) for the GoogleNet was achieved with the 2D original parenchyma images. Table 8 further shows that the other evaluation metrics for the 2D original parenchyma images processed using GoogleNet were 0.55 (accuracy), 0.56 (precision), 0.55 (recall) and 0.55 (F1-score). Its precision was lower than that obtained with the 2D original chest HRCT images (0.67).

    For all six kinds of input images in Tables 7 and 8, the results show that the classification performance of the 3D DenseNet was better than that of the 3D GoogleNet. However, the classification performance of the 2D GoogleNet with the original chest HRCT images/the original parenchyma images was better than that of the 2D DenseNet.

    The classification performances of different ML classifiers were evaluated by using 1316 lung radiomics features and 19 selected lung radiomics features. In addition, the best-performance classifier (MLP classifier) was also determined, as described in this section.

    Figure 13 shows the classification results for the different ML classifiers with 1316 lung radiomics features and 19 selected lung radiomics features as confusion matrices. Table 9 reports the classification performance of the classifiers with 1316 lung radiomics features. Compared with the CNNs (DenseNet and GoogleNet), the ML classifiers with 1316 lung radiomics features performed markedly better at COPD stage classification: the accuracy, precision, recall, F1-score and AUC all improved substantially. Overall, the classification performance of the ML classifiers was better than that of the DenseNet with the multiple-instance learning of the 3D chest HRCT images (the best-performing DenseNet configuration), even for the worst-performing classifier, the LDA classifier with 1316 lung radiomics features (except for precision). Compared with the ML classifiers with 1316 lung radiomics features, the classification performance of the ML classifiers with 19 selected lung radiomics features was further improved.

    Figure 13.  Confusion matrix results for the different ML classifiers. (A) Confusion matrix results for the ML classifiers with 1316 lung radiomics features; (B) Confusion matrix results for the ML classifiers with 19 selected lung radiomics features.
    Table 9.  Evaluation metrics for the different ML classifiers with the 1316 lung radiomics features when applied to the test set.
    | Classifier | Accuracy | Precision | Recall | F1-score | AUC |
    |---|---|---|---|---|---|
    | RF classifier | 0.72 | 0.72 | 0.72 | 0.72 | 0.90 |
    | Ada classifier | 0.70 | 0.69 | 0.70 | 0.69 | 0.90 |
    | GB classifier | 0.72 | 0.73 | 0.72 | 0.72 | 0.92 |
    | MLP classifier | 0.78 | 0.78 | 0.78 | 0.78 | 0.92 |
    | LDA classifier | 0.60 | 0.60 | 0.60 | 0.59 | 0.85 |
    | SVM classifier | 0.62 | 0.62 | 0.63 | 0.61 | 0.87 |


    Table 9 shows that the accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 1316 lung radiomics features were 0.78, 0.78, 0.78, 0.78 and 0.92, respectively. Therefore, the MLP classifier is regarded as the best-performing classifier for the 1316 lung radiomics features. In addition, Table 10 shows that the evaluation metrics generally improved, and that the MLP classifier was also the best-performing classifier for the 19 selected lung radiomics features. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features were 0.80, 0.80, 0.80, 0.80 and 0.94, respectively.

    Table 10.  Evaluation metrics for the different ML classifiers with 19 selected lung radiomics features when applied to the test set.
    | Classifier | Accuracy | Precision | Recall | F1-score | AUC |
    |---|---|---|---|---|---|
    | RF classifier | 0.76 | 0.76 | 0.76 | 0.76 | 0.93 |
    | Ada classifier | 0.77 | 0.76 | 0.77 | 0.76 | 0.93 |
    | GB classifier | 0.73 | 0.74 | 0.73 | 0.73 | 0.92 |
    | MLP classifier | 0.80 | 0.80 | 0.80 | 0.80 | 0.94 |
    | LDA classifier | 0.65 | 0.65 | 0.65 | 0.65 | 0.89 |
    | SVM classifier | 0.71 | 0.71 | 0.71 | 0.71 | 0.92 |


    The 19 selected lung radiomics features with Radiomics-FIRST/Radiomics-ALL were used to further evaluate the MLP classifier's performance.

    Figure 14(A) and Table 11 show the evaluation metrics for the MLP classifier with one lung radiomics combination feature (Radiomics-X). Figure 14(A) shows that the AUC of the MLP classifier with Radiomics-FIRST/Radiomics-ALL was 0.85/0.87, which is better than that of the other lung radiomics combination features. Figure 15(A) shows the classification results for the MLP classifier with each of the seven lung radiomics combination features. The MLP classifier with Radiomics-GLSZM/Radiomics-GLDM could not distinguish COPD Stage Ⅰ from the other COPD stages ("0" at COPD stage, predicted label). Radiomics-FIRST and Radiomics-ALL, which characterized the COPD stage, showed better classification performance than the other lung radiomics combination features, and Radiomics-ALL showed the best classification performance of all the lung radiomics combination features. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with Radiomics-ALL were 0.60, 0.58, 0.60, 0.59 and 0.87, respectively.

    Figure 14.  ROC curves for the MLP classifiers with different features. (A) ROC curves for the MLP classifier with one lung radiomics combination feature (Radiomics-X); (B) ROC curves for the MLP classifier with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).
    Table 11.  Test set evaluation metrics for the MLP classifier with one lung radiomics combination feature (Radiomics-X).
    | Radiomics-X | Accuracy | Precision | Recall | F1-score | AUC |
    |---|---|---|---|---|---|
    | Radiomics-FIRST | 0.56 | 0.56 | 0.56 | 0.56 | 0.85 |
    | Radiomics-SHAPE | 0.39 | 0.40 | 0.39 | 0.36 | 0.64 |
    | Radiomics-GLCM | 0.49 | 0.51 | 0.49 | 0.47 | 0.72 |
    | Radiomics-GLSZM | 0.36 | 0.27 | 0.36 | 0.31 | 0.62 |
    | Radiomics-NGTDM | 0.42 | 0.32 | 0.42 | 0.36 | 0.66 |
    | Radiomics-GLDM | 0.28 | 0.21 | 0.28 | 0.24 | 0.58 |
    | Radiomics-ALL | 0.60 | 0.58 | 0.60 | 0.59 | 0.87 |

    Figure 15.  Confusion matrix results for the MLP classifiers with different features. (A) Confusion matrix results for the MLP classifier with one lung radiomics combination feature (Radiomics-X); (B) Confusion matrix results for the MLP classifier with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).

    Figure 14(B) and Table 12 show the evaluation metrics for the MLP classifier with the 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL. Figure 15(B) shows the corresponding classification results as confusion matrices. Compared with the MLP classifier with the 19 selected lung radiomics features alone, the evaluation metrics improved when Radiomics-FIRST/Radiomics-ALL was added, except for the AUC with Radiomics-FIRST, which remained at 0.94. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-FIRST were 0.81, 0.82, 0.81, 0.81 and 0.94, respectively. The accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL were 0.83, 0.83, 0.83, 0.82 and 0.95, respectively.

    Table 12.  Test set evaluation metrics for the MLP classifiers with 19 selected lung radiomics features and Radiomics-FIRST/Radiomics-ALL (20 lung radiomics features).
    | MLP classifier: Input features | Accuracy | Precision | Recall | F1-score | AUC |
    |---|---|---|---|---|---|
    | 19 radiomics features + Radiomics-FIRST | 0.81 | 0.82 | 0.81 | 0.81 | 0.94 |
    | 19 radiomics features + Radiomics-ALL | 0.83 | 0.83 | 0.83 | 0.82 | 0.95 |


    Four topics are discussed: 1) the classification ability of the classic CNN based on the images compared with that of the ML classifiers based on the lung radiomics features; 2) the role of feature selection; 3) the reason why the constructed lung radiomics combination features characterizing the COPD stage can improve the classification performance; and 4) the limitations of this study.

    The classification ability of the classic CNN based on the images was worse than that of the ML classifiers based on the lung radiomics features. The following discussion focuses on the characteristics of COPD and the classic CNN.

    First, COPD is diffusely distributed throughout the lung. Therefore, some slices of the chest HRCT images from a patient with COPD (Stage Ⅰ to Ⅳ) may contain no lesions. Conversely, even if a participant was diagnosed as not having COPD (no airflow restriction), early or mild lesions may already exist on their HRCT images. Although we made many attempts to mitigate these problems, including fine selection, rough selection and multiple-instance learning [52], the classification ability of the classic CNN based on the images remained disappointing.

    Second, the 3D DenseNet achieved better classification results than the 2D DenseNet. The reason is that, unlike the 2D DenseNet, the 3D DenseNet can extract interlayer information. Compared with the original chest HRCT images, the lung parenchyma images exclude the non-lung region, which contains redundant information. Therefore, when the lung parenchyma images are used as input, DenseNet (2D and 3D) can focus more on extracting the features of the lung region to achieve better classification results. Compared with the chest HRCT images after the fine selection (deleting the non-lung-region images), the classification ability of the 2D DenseNet with the chest HRCT images after the rough selection (deleting 1/6 of the images at the beginning and at the end, respectively) was lower, except for precision. The reason is that, although the rough selection removes interfering redundant information (non-lung images), it also discards effective information for COPD classification contained in the deleted 2/6 of the images. However, compared with the chest HRCT images after the fine selection, the classification ability of the 3D DenseNet with the chest HRCT images after the rough selection was better. The reason is that the spacing of the 20 slices (Table 2) after the rough selection was less than that after the fine selection.
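    The rough selection described above (removing 1/6 of the slices from each end of a scan) can be sketched in a few lines. This is an illustrative assumption of how the trimming might be implemented; the function name is hypothetical, and rounding the number of dropped slices down is a choice made here, not one stated in the source.

```python
def rough_selection(slices):
    """Drop the first and last sixth of a slice sequence (2/6 removed in total)."""
    n_drop = len(slices) // 6  # assumption: round down when the length is not divisible by 6
    return slices[n_drop:len(slices) - n_drop]


# A hypothetical scan with 300 slices, identified by their indices.
kept = rough_selection(list(range(300)))
print(len(kept), kept[0], kept[-1])  # 200 50 249
```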

    Third, similar to DenseNet, GoogleNet also cannot achieve the ideal classification effect with a small amount of training data. When the network dimension was transformed from 2D to 3D, the classification performance of DenseNet was greatly improved. However, the classification performance of GoogleNet improved only slightly, or even decreased. The 2D GoogleNet was proposed for the classification of natural images (RGB images). Its structure does not have a targeted design for the details of these natural images. The Inception network structure of the 2D GoogleNet, with its many 1 × 1 convolution kernels, strengthens the channel connections of RGB images, which cannot be reflected in the chest HRCT images. The classification ability of the 2D GoogleNet with the chest HRCT images after rough/fine selection was worse than that of the 2D GoogleNet with the original chest HRCT images. The reason is that the non-lung slices removed by fine selection, and the few lung slices removed by rough selection, still carry effective information for COPD classification. The accuracy of the 2D GoogleNet with the chest HRCT images after the fine selection was higher than that resulting from rough selection, which confirms the above discussion. The accuracy of the 2D GoogleNet with the original chest HRCT images and the multiple-instance learning was the same. This also shows the role of multiple-instance learning in dealing with COPD classification, which aligns with its original intention. The AUC resulting from the 2D GoogleNet with the original parenchyma images was the maximum, showing that the ROI improves the AUC. In addition, the classification performance of the 2D GoogleNet with the multiple-instance learning of the original parenchyma images was better than that based on the original chest HRCT images. This further illustrates that the ROI improves the classification performance under the conditions of multiple-instance learning. Research on 3D GoogleNets is limited [53,54,55,56]. As the number of 3D GoogleNet parameters increases, the problem of the limited amount of training data becomes more obvious. The Inception module structure is specially designed for 2D images. Different convolution kernels extract features from the 2D images, and then feature concatenation is implemented. The concatenation is aligned and spliced according to the two dimensions of the tensor. Therefore, compared with the 2D GoogleNet, the 3D GoogleNet has higher-dimensional tensors, which weakens the effect of the receptive fields.

    The classification ability of the ML classifiers based on the lung radiomics features is better than that of the classic CNN based on the images. Compared with the classic CNN based on the images, the ML classifiers with the lung radiomics features calculated by preset formulas are more interpretable for COPD classification. The lung radiomics features were calculated based on information from all of the slices of the parenchyma images. Therefore, the lung radiomics features are not affected by the location of lesions in the chest HRCT images.

    Compared to the classification performance of the ML classifiers with the 1316 lung radiomics features, that of the ML classifiers with the 19 selected lung radiomics features was better. Lasso is often used with survival analysis models to determine variables and eliminate the collinearity problem between variables [43,44]. In this study, Lasso was applied to select the classification features, and the classification performance improved. Lasso selects the classification features by establishing the relationship between the independent and dependent variables (lung radiomics features and the COPD stages). This operation selects the lung radiomics features related to COPD stages to reduce the complexity of the ML classifiers and avoid overfitting. While reducing the complexity of the ML classifiers, the ML classifiers can focus on the 19 selected lung radiomics features and improve the classification performance. At the same time, it also endows the lung radiomics features used for classification with strong explanatory power. Radiomics18 of the 19 selected lung radiomics features was the dominant feature with the maximum coefficient.
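    To make the selection mechanism concrete, the following minimal sketch shows how an L1 penalty drives the coefficients of weakly related features to exactly zero, so that only features with nonzero coefficients are kept. It is an illustration under stated assumptions, not the study's pipeline: the source does not report its Lasso settings, so a plain cyclic coordinate-descent solver on a numeric target is assumed, and the function names, the penalty `alpha` and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    X is a list of rows (samples x features); columns and y are assumed to be centred.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual after adding back its own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Hypothetical toy data: the target follows feature 0 closely and feature 1 only weakly.
X = [[-2, 1], [-1, -1], [1, 1], [2, -1], [-2, -1], [-1, 1], [1, -1], [2, 1]]
y = [-3.9, -2.1, 2.1, 3.9, -4.0, -2.0, 2.0, 4.0]
coefficients = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, c in enumerate(coefficients) if c != 0.0]
print(coefficients, selected)
```

    In this toy case an unpenalised fit would give feature 1 a small nonzero weight, whereas the L1 penalty removes it entirely, which is the behaviour that reduces 1316 features to a small selected subset.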

    T-tests have been widely used to select significant variables in survival analysis models, generalized linear models and regression models [57]. Therefore, we were inspired to construct features that can characterize the COPD stage to improve classification performance. The two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed to characterize the COPD stage (P-value < 0.05 between all pairs of COPD stages). The features with P-value < 0.05 between all pairs of COPD stages showed improved classification performance. The reason can be explained from the perspective of statistics. Generally, a P-value < 0.05 between two groups indicates a statistically significant difference between them. Therefore, using features that differ significantly between two groups (P-value < 0.05) to classify those groups can improve the classification performance.

    This study has some limitations. First, regarding the materials used in this study, there were not enough cases at COPD Stage Ⅳ. Second, regarding the methods used in this study, many attempts were made to eliminate the problems of lesions in the HRCT images mentioned in Section 4.1, but the classification performance of the classic CNN remained unsatisfactory. The MLP classifiers with the 19 selected lung radiomics features and Radiomics-ALL achieved good classification performance, but the fixed calculation equations limit further development of the lung radiomics features. In contrast, the CNN based on the chest HRCT images is not subject to this restriction. Effectively training a CNN classifier with a limited number of 3D medical images is an urgent problem to be solved. Transfer learning [58] in CNNs has become the first choice for addressing the problem of a limited number of 3D medical images. Similarly, data augmentation should be explored further. Inspired by lung radiomics features, which derive many features from each set of chest HRCT images, the 3D chest HRCT images of each subject can be resized into smaller 3D images. For example, 3D chest HRCT images of size 512 × 512 × N can be resized into other sizes, such as 256 × 256 × 300 and 64 × 64 × 50. Finally, the chest HRCT images used in this study were collected from 2009 to 2011, but they still constitute a rare and standard study cohort. We will also try to collect an updated study cohort in the future.
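    The resizing idea mentioned above can be illustrated with a small nearest-neighbour resampling routine. This is only a sketch under assumptions: the source does not name an interpolation method, so nearest-neighbour sampling is chosen here for simplicity, the volume is stored as nested lists indexed `[z][y][x]`, and the function name and the tiny 4 × 4 × 4 test volume are hypothetical.

```python
def resize_nearest(volume, new_shape):
    """Nearest-neighbour resampling of a 3D volume stored as nested lists [z][y][x]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])

    def index_map(new_len, old_len):
        # Map the centre of each output voxel back to the nearest input index.
        return [min(old_len - 1, int((i + 0.5) * old_len / new_len)) for i in range(new_len)]

    zs = index_map(new_shape[0], depth)
    ys = index_map(new_shape[1], height)
    xs = index_map(new_shape[2], width)
    return [[[volume[z][y][x] for x in xs] for y in ys] for z in zs]


# Hypothetical 4 x 4 x 4 volume whose voxel value encodes its own position.
volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)] for z in range(4)]
small = resize_nearest(volume, (2, 2, 2))
print(small)
```

    For real scans, a smoothing or trilinear interpolation would usually be preferable when downsampling, since nearest-neighbour sampling simply discards the skipped voxels.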

    The lung radiomics features were used to characterize and classify the COPD stage in this study. Compared with classic CNN classifiers based on the chest HRCT images, the ML method based on lung radiomics features is more suitable and interpretable for COPD classification. Lasso was applied to select the lung radiomics features to enhance the ML method's classification performance. The best-performing classifier, i.e., the MLP classifier, was determined. Two lung radiomics combination features, Radiomics-FIRST and Radiomics-ALL, were constructed from the 19 selected lung radiomics features by using the proposed lung radiomics combination strategy for characterizing COPD stage evolution. Radiomics-FIRST/Radiomics-ALL was then used to further improve the classification performance of the MLP classifier. As a result, the accuracy, precision, recall, F1-score and AUC of the MLP classifier with the 19 selected lung radiomics features and Radiomics-ALL were 0.83, 0.83, 0.83, 0.82 and 0.95, respectively.

    Thanks to the Department of Radiology of the First Affiliated Hospital of Guangzhou Medical University for providing the data set, and to the National Natural Science Foundation of China (62071311), Natural Science Foundation of Guangdong Province, China (2019A1515011382), Stable Support Plan for Colleges and Universities in Shenzhen, China (SZWD2021010), Scientific Research Fund of Liaoning Province, China (JL201919) and the Special Program for Key Fields of Colleges and Universities in Guangdong Province (biomedicine and health) of China (2021ZDZX2008) for the funding support.

    The authors declare no conflict of interest.

    [1] Penuelas J, Carnicer J (2010) Climate change and peak oil: the urgent need for a transition to a non-carbon-emitting society. Ambio 39: 85-90. doi: 10.1007/s13280-009-0011-x
    [2] Kerr RA (2011) Energy supplies. Peak oil production may already be here. Science 331: 1510-1511.
    [3] Pauly M, Keegstra K (2008) Cell-wall carbohydrates and their modification as a resource for biofuels. Plant J 54: 559-568. doi: 10.1111/j.1365-313X.2008.03463.x
    [4] Vanholme B, Desmet T, Ronsse F, et al. (2013) Towards a carbon-negative sustainable bio-based economy. Front Plant Sci 4: 174.
    [5] Kalluri UC, Keller M (2010) Bioenergy research: a new paradigm in multidisciplinary research. J R Soc Interface 7: 1391-1401. doi: 10.1098/rsif.2009.0564
    [6] Chabannes M, Ruel K, Yoshinaga A, et al. (2001) In situ analysis of lignins in transgenic tobacco reveals a differential impact of individual transformations on the spatial patterns of lignin deposition at the cellular and subcellular levels. Plant J 28: 271-282. doi: 10.1046/j.1365-313X.2001.01159.x
    [7] Ezeji T, Blaschek HP (2008) Fermentation of dried distillers' grains and solubles (DDGS) hydrolysates to solvents and value-added products by solventogenic clostridia. Bioresource Technol 99: 5232-5242. doi: 10.1016/j.biortech.2007.09.032
    [8] Ezeji T, Qureshi N, Blaschek HP (2007) Butanol production from agricultural residues: impact of degradation products on Clostridium beijerinckii growth and butanol fermentation. Biotechnol Bioeng 97: 1460-1469. doi: 10.1002/bit.21373
    [9] Klinke HB, Thomsen AB, Ahring BK (2004) Inhibition of ethanol-producing yeast and bacteria by degradation products produced during pre-treatment of biomass. Appl Microbiol Biotechnol 66: 10-26. doi: 10.1007/s00253-004-1642-2
    [10] Lee S, Lee JH, Mitchell RJ (2015) Analysis of Clostridium beijerinckii NCIMB 8052's transcriptional response to ferulic acid and its application to enhance the strain tolerance. Biotechnol Biofuels 8: 68.
    [11] Palmqvist E, Hahn-Hagerdal B (2000) Fermentation of lignocellulosic hydrolysates. II: inhibitors and mechanisms of inhibition. Bioresource Technol 74: 25-33.
    [12] Persson P, Andersson J, Gorton L, et al. (2002) Effect of different forms of alkali treatment on specific fermentation inhibitors and on the fermentability of lignocellulose hydrolysates for production of fuel ethanol. J Agric Food Chem 50: 5318-5325. doi: 10.1021/jf025565o
    [13] Martinez A, Rodriguez ME, York SW, et al. (2000) Effects of Ca(OH)(2) treatments ("overliming") on the composition and toxicity of bagasse hemicellulose hydrolysates. Biotechnol Bioeng 69: 526-536.
    [14] Berson RE, Young JS, Hanley TR (2006) Reintroduced solids increase inhibitor levels in a pretreated corn stover hydrolysate. Appl Biochem Biotechnol 129-132: 612-620.
    [15] Guo X, Cavka A, Jonsson LJ, et al. (2013) Comparison of methods for detoxification of spruce hydrolysate for bacterial cellulose production. Microb Cell Fact 12: 93. doi: 10.1186/1475-2859-12-93
    [16] Kumari R, Pramanik K (2013) Bioethanol production from Ipomoea carnea biomass using a potential hybrid yeast strain. Appl Biochem Biotechnol 171: 771-785. doi: 10.1007/s12010-013-0398-5
    [17] Kuhad RC, Gupta R, Khasa YP, et al. (2010) Bioethanol production from Lantana camara (red sage): Pretreatment, saccharification and fermentation. Bioresour Technol 101: 8348-8354. doi: 10.1016/j.biortech.2010.06.043
    [18] Nilvebrant NO, Reimann A, Larsson S, et al. (2001) Detoxification of lignocellulose hydrolysates with ion-exchange resins. Appl Biochem Biotechnol 91-93: 35-49. doi: 10.1385/ABAB:91-93:1-9:35
    [19] Hallsworth JE, Heim S, Timmis KN (2003) Chaotropic solutes cause water stress in Pseudomonas putida. Environ Microbiol 5: 1270-1280. doi: 10.1111/j.1462-2920.2003.00478.x
    [20] Bhaganna P, Volkers RJ, Bell AN, et al. (2010) Hydrophobic substances induce water stress in microbial cells. Microb Biotechnol 3: 701-716. doi: 10.1111/j.1751-7915.2010.00203.x
    [21] Hallsworth JE (1998) Ethanol-induced water stress in yeast. J Ferment Bioeng 85: 125-137. doi: 10.1016/S0922-338X(97)86756-6
    [22] da Costa MS, Santos H, Galinski EA (1998) An overview of the role and diversity of compatible solutes in Bacteria and Archaea. Adv Biochem Eng Biotechnol 61: 117-153.
    [23] Mansure JJ, Panek AD, Crowe LM, et al. (1994) Trehalose inhibits ethanol effects on intact yeast cells and liposomes. Biochim Biophys Acta 1191: 309-316. doi: 10.1016/0005-2736(94)90181-3
    [24] Hallsworth JE, Prior BA, Nomura Y, et al. (2003) Compatible solutes protect against chaotrope (ethanol)-induced, nonosmotic water stress. Appl Environ Microbiol 69: 7032-7034. doi: 10.1128/AEM.69.12.7032-7034.2003
    [25] Cray JA, Stevenson A, Ball P, et al. (2015) Chaotropicity: a key factor in product tolerance of biofuel-producing microorganisms. Curr Opin Biotechnol 33: 228-259. doi: 10.1016/j.copbio.2015.02.010
    [26] Mitchell RJ, Gu MB (2004) An Escherichia coli biosensor capable of detecting both genotoxic and oxidative damage. Appl Microbiol Biot 64: 46-52. doi: 10.1007/s00253-003-1418-0
    [27] Choi SH, Gu MB (2002) A portable toxicity biosensor using freeze-dried recombinant bioluminescent bacteria. Biosens Bioelectron 17: 433-440. doi: 10.1016/S0956-5663(01)00303-7
    [28] Choi SH, Gu MB (2003) Toxicity biomonitoring of degradation byproducts using freeze-dried recombinant bioluminescent bacteria. Anal Chim Acta 481: 229-238. doi: 10.1016/S0003-2670(03)00091-6
    [29] Van Dyk TK, Smulski DR, Reed TR, et al. (1995) Responses to toxicants of an Escherichia coli strain carrying a uspA'::lux genetic fusion and an E. coli strain carrying a grpE'::lux fusion are similar. Appl Environ Microbiol 61: 4124-4127.
    [30] Mitchell RJ, Gu MB (2006) Characterization and optimization of two methods in the immobilization of 12 bioluminescent strains. Biosens Bioelectron 22: 192-199. doi: 10.1016/j.bios.2005.12.019
    [31] Ahn J-M, Mitchell RJ, Gu MB (2004) Detection and classification of oxidative damaging stresses using recombinant bioluminescent bacteria harboring sodA::, pqi::, and katG::luxCDABE fusions. Enzyme Microb Tech 35: 540-544. doi: 10.1016/j.enzmictec.2004.08.005
    [32] Gao DH, Haarmeyer C, Balan V, et al. (2014) Lignin triggers irreversible cellulase loss during pretreated lignocellulosic biomass saccharification. Biotechnol Biofuels 7.
    [33] Kim HJ, Lee S, Kim J, et al. (2013) Environmentally friendly pretreatment of plant biomass by planetary and attrition milling. Bioresource Technol 144: 50-56. doi: 10.1016/j.biortech.2013.06.090
    [34] Oudshoorn A, van der Wielen LA, Straathof AJ (2009) Assessment of options for selective 1-butanol recovery from aqueous solution. Ind Eng Chem Res 48: 7325-7336. doi: 10.1021/ie900537w
    [35] Jang Y-S, Lee JY, Lee J, et al. (2012) Enhanced butanol production obtained by reinforcing the direct butanol-forming route in Clostridium acetobutylicum. mBio 3: e00314-12.
    [36] Tangney M, Mitchell WJ (2000) Analysis of a catabolic operon for sucrose transport and metabolism in Clostridium acetobutylicum ATCC 824. J Mol Microb Biotech 2: 71-80.
    [37] Wang L, Chen H (2011) Increased fermentability of enzymatically hydrolyzed steam-exploded corn stover for butanol production by removal of fermentation inhibitors. Process Biochem 46: 604-607. doi: 10.1016/j.procbio.2010.09.027
    [38] Servinsky MD, Kiel JT, Dupuy NF, et al. (2010) Transcriptional analysis of differential carbohydrate utilization by Clostridium acetobutylicum. Microbiology 156: 3478-3491. doi: 10.1099/mic.0.037085-0
    [39] Ren C, Gu Y, Hu S, et al. (2010) Identification and inactivation of pleiotropic regulator CcpA to eliminate glucose repression of xylose utilization in Clostridium acetobutylicum. Metab Eng 12: 446-454. doi: 10.1016/j.ymben.2010.05.002
    [40] Tangney M, Galinier A, Deutscher J, et al. (2003) Analysis of the elements of catabolite repression in Clostridium acetobutylicum ATCC 824. J Mol Microb Biotech 6: 6-11. doi: 10.1159/000073403
    [41] Grimmler C, Held C, Liebl W, et al. (2010) Transcriptional analysis of catabolite repression in Clostridium acetobutylicum growing on mixtures of d-glucose and d-xylose. J Biotechnol 150: 315-323.
    [42] Bayer EA, Belaich J-P, Shoham Y, et al. (2004) The cellulosomes: multienzyme machines for degradation of plant cell wall polysaccharides. Annu Rev Microbiol 58: 521-554. doi: 10.1146/annurev.micro.57.030502.091022
    [43] Stevenson DM, Weimer PJ (2005) Expression of 17 genes in Clostridium thermocellum ATCC 27405 during fermentation of cellulose or cellobiose in continuous culture. Appl Environ Microb 71: 4672-4678. doi: 10.1128/AEM.71.8.4672-4678.2005
    [44] Feinberg L, Foden J, Barrett T, et al. (2011) Complete genome sequence of the cellulolytic thermophile Clostridium thermocellum DSM1313. J Bacteriol 193: 2906-2907. doi: 10.1128/JB.00322-11
    [45] Raman B, McKeown CK, Rodriguez M, et al. (2011) Transcriptomic analysis of Clostridium thermocellum ATCC 27405 cellulose fermentation. BMC Microbiol 11: 134. doi: 10.1186/1471-2180-11-134
    [46] Riederer A, Takasuka TE, Makino S-i, et al. (2011) Global gene expression patterns in Clostridium thermocellum as determined by microarray analysis of chemostat cultures on cellulose or cellobiose. Appl Environ Microb 77: 1243-1253. doi: 10.1128/AEM.02008-10
    [47] Wang Y, Li X, Mao Y, et al. (2012) Genome-wide dynamic transcriptional profiling in Clostridium beijerinckii NCIMB 8052 using single-nucleotide resolution RNA-Seq. BMC Genomics 13: 102. doi: 10.1186/1471-2164-13-102
    [48] Alsaker KV, Papoutsakis ET (2005) Transcriptional program of early sporulation and stationary-phase events in Clostridium acetobutylicum. J Bacteriol 187: 7103-7118. doi: 10.1128/JB.187.20.7103-7118.2005
    [49] Grimmler C, Janssen H, Krauße D, et al. (2011) Genome-wide gene expression analysis of the switch between acidogenesis and solventogenesis in continuous cultures of Clostridium acetobutylicum. J Mol Microb Biotech 20: 1-15. doi: 10.1159/000320973
    [50] Shi Z, Blaschek HP (2008) Transcriptional analysis of Clostridium beijerinckii NCIMB 8052 and the hyper-butanol-producing mutant BA101 during the shift from acidogenesis to solventogenesis. Appl Environ Microb 74: 7709-7714. doi: 10.1128/AEM.01948-08
    [51] Hu S, Zheng H, Gu Y, et al. (2011) Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018. BMC Genomics 12: 93. doi: 10.1186/1471-2164-12-93
    [52] Lütke-Eversloh T, Bahl H (2011) Metabolic engineering of Clostridium acetobutylicum: recent advances to improve butanol production. Curr Opin Biotech 22: 634-647. doi: 10.1016/j.copbio.2011.01.011
    [53] Knoshaug EP, Zhang M (2009) Butanol tolerance in a selection of microorganisms. Appl Biochem Biotech 153: 13-20. doi: 10.1007/s12010-008-8460-4
    [54] Atsumi S, Hanai T, Liao JC (2008) Non-fermentative pathways for synthesis of branched-chain higher alcohols as biofuels. Nature 451: 86-89. doi: 10.1038/nature06450
    [55] de Lima Alves F, Stevenson A, Baxter E, et al. (2015) Concomitant osmotic and chaotropicity-induced stresses in Aspergillus wentii: compatible solutes determine the biotic window. Curr Genet 61: 457-477. doi: 10.1007/s00294-015-0496-8
    [56] Tomas CA, Welker NE, Papoutsakis ET (2003) Overexpression of groESL in Clostridium acetobutylicum results in increased solvent production and tolerance, prolonged metabolism, and changes in the cell's transcriptional program. Appl Environ Microb 69: 4951-4965. doi: 10.1128/AEM.69.8.4951-4965.2003
    [57] Tomas CA, Beamish J, Papoutsakis ET (2004) Transcriptional analysis of butanol stress and tolerance in Clostridium acetobutylicum. J Bacteriol 186: 2006-2018. doi: 10.1128/JB.186.7.2006-2018.2004
    [58] Alsaker KV, Spitzer TR, Papoutsakis ET (2004) Transcriptional analysis of spo0A overexpression in Clostridium acetobutylicum and its effect on the cell's response to butanol stress. J Bacteriol 186: 1959-1971. doi: 10.1128/JB.186.7.1959-1971.2004
    [59] Janssen H, Grimmler C, Ehrenreich A, et al. (2012) A transcriptional study of acidogenic chemostat cells of Clostridium acetobutylicum - Solvent stress caused by a transient n-butanol pulse. J Biotechnol 161: 354-365.
    [60] Schwarz KM, Kuit W, Grimmler C, et al. (2012) A transcriptional study of acidogenic chemostat cells of Clostridium acetobutylicum - Cellular behavior in adaptation to n-butanol. J Biotechnol 161: 366-377.
    [61] Winkler J, Kao KC (2011) Transcriptional analysis of Lactobacillus brevis to N-butanol and ferulic acid stress responses. PLoS ONE 6: e21438. doi: 10.1371/journal.pone.0021438
    [62] Santos J, Sousa MJ, Cardoso H, et al. (2008) Ethanol tolerance of sugar transport, and the rectification of stuck wine fermentations. Microbiology 154: 422-430. doi: 10.1099/mic.0.2007/011445-0
    [63] You KM, Rosenfield C-L, Knipple DC (2003) Ethanol tolerance in the yeast Saccharomyces cerevisiae is dependent on cellular oleic acid content. Appl Environ Microb 69: 1499-1503. doi: 10.1128/AEM.69.3.1499-1503.2003
    [64] Alexandre H, Ansanay-Galeote V, Dequin S, et al. (2001) Global gene expression during short-term ethanol stress in Saccharomyces cerevisiae. FEBS Lett 498: 98-103. doi: 10.1016/S0014-5793(01)02503-0
    [65] Machado IM, Atsumi S (2012) Cyanobacterial biofuel production. J Biotechnol 162: 50-56. doi: 10.1016/j.jbiotec.2012.03.005
    [66] Lan EI, Liao JC (2011) Metabolic engineering of cyanobacteria for 1-butanol production from carbon dioxide. Metab Eng 13: 353-363. doi: 10.1016/j.ymben.2011.04.004
    [67] Anfelt J, Hallström B, Nielsen J, et al. (2013) Using transcriptomics to improve butanol tolerance of Synechocystis sp. strain PCC 6803. Appl Environ Microb 79: 7419-7427.
    [68] Rühl J, Schmid A, Blank LM (2009) Selected Pseudomonas putida strains able to grow in the presence of high butanol concentrations. Appl Environ Microb 75: 4653-4656. doi: 10.1128/AEM.00225-09
    [69] Fischer CR, Klein-Marcuschamer D, Stephanopoulos G (2008) Selection and optimization of microbial hosts for biofuels production. Metab Eng 10: 295-304. doi: 10.1016/j.ymben.2008.06.009
    [70] Yomano L, York S, Ingram L (1998) Isolation and characterization of ethanol-tolerant mutants of Escherichia coli KO11 for fuel ethanol production. J Ind Microbiol Biot 20: 132-138. doi: 10.1038/sj.jim.2900496
    [71] López-Contreras AM, Claassen PA, Mooibroek H, et al. (2000) Utilisation of saccharides in extruded domestic organic waste by Clostridium acetobutylicum ATCC 824 for production of acetone, butanol and ethanol. Appl Microbiol Biot 54: 162-167.
    [72] Qureshi N, Saha BC, Dien B, et al. (2010) Production of butanol (a biofuel) from agricultural residues: Part I-Use of barley straw hydrolysate. Biomass Bioenerg 34: 559-565. doi: 10.1016/j.biombioe.2009.12.024
    [73] Qureshi N, Saha BC, Hector RE, et al. (2010) Production of butanol (a biofuel) from agricultural residues: Part II-Use of corn stover and switchgrass hydrolysates. Biomass Bioenerg 34: 566-571. doi: 10.1016/j.biombioe.2009.12.023
    [74] Mills TY, Sandoval NR, Gill RT (2009) Cellulosic hydrolysate toxicity and tolerance mechanisms in Escherichia coli. Biotechnol Biofuels 2: 26. doi: 10.1186/1754-6834-2-26
    [75] Zaldivar J, Martinez A, Ingram LO (1999) Effect of selected aldehydes on the growth and fermentation of ethanologenic Escherichia coli. Biotechnol Bioeng 65: 24-33.
    [76] Fitzgerald D, Stratford M, Gasson M, et al. (2004) Mode of antimicrobial action of vanillin against Escherichia coli, Lactobacillus plantarum and Listeria innocua. J Appl Microbiol 97: 104-113. doi: 10.1111/j.1365-2672.2004.02275.x
    [77] Cray JA, Stevenson A, Ball P, et al. (2015) Chaotropicity: a key factor in product tolerance of biofuel-producing microorganisms. Curr Opin Biotech 33: 228-259. doi: 10.1016/j.copbio.2015.02.010
    [78] Lee S, Monnappa AK, Mitchell RJ (2012) Biological activities of lignin hydrolysate-related compounds. BMB Rep 45: 265-274. doi: 10.5483/BMBRep.2012.45.5.265
    [79] Lee S, Nam D, Jung JY, et al. (2012) Identification of Escherichia coli biomarkers responsive to various lignin-hydrolysate compounds. Bioresource Technol 114: 450-456. doi: 10.1016/j.biortech.2012.02.085
    [80] Stead D (1993) The effect of hydroxycinnamic acids on the growth of wine‐spoilage lactic acid bacteria. J Appl Bacteriol 75: 135-141. doi: 10.1111/j.1365-2672.1993.tb02758.x
    [81] Guo W, Jia W, Li Y, et al. (2010) Performances of Lactobacillus brevis for producing lactic acid from hydrolysate of lignocellulosics. Appl Biochem Biotech 161: 124-136. doi: 10.1007/s12010-009-8857-8
    [82] Lee S, Lee JH, Mitchell RJ (2015) Analysis of Clostridium beijerinckii NCIMB 8052's transcriptional response to ferulic acid and its application to enhance the strain tolerance. Biotechnol Biofuels 8: 68. doi: 10.1186/s13068-015-0252-9
    [83] Zhang Y, Ezeji TC (2013) Transcriptional analysis of Clostridium beijerinckii NCIMB 8052 to elucidate role of furfural stress during acetone butanol ethanol fermentation. Biotechnol Biofuels 6: 66. doi: 10.1186/1754-6834-6-66
    [84] Wilson CM, Yang S, Rodriguez M, et al. (2013) Clostridium thermocellum transcriptomic profiles after exposure to furfural or heat stress. Biotechnol Biofuels 6: 131. doi: 10.1186/1754-6834-6-131
    [85] Jin Y, Fang Y, Huang M, et al. (2015) Combination of RNA sequencing and metabolite data to elucidate improved toxic compound tolerance and butanol fermentation of Clostridium acetobutylicum from wheat straw hydrolysate by supplying sodium sulfide. Bioresour Technol 198: 77-86. doi: 10.1016/j.biortech.2015.08.139
    [86] Guo T, He A, Du T, et al. (2013) Butanol production from hemicellulosic hydrolysate of corn fiber by a Clostridium beijerinckii mutant with high inhibitor-tolerance. Bioresour Technol 135: 379-385. doi: 10.1016/j.biortech.2012.08.029
    [87] Yoon SH, Lee EG, Das A, et al. (2007) Enhanced vanillin production from recombinant E. coli using NTG mutagenesis and adsorbent resin. Biotechnol Progr 23: 1143-1148.
    [88] Larsson S, Nilvebrant N, Jönsson L (2001) Effect of overexpression of Saccharomyces cerevisiae Pad1p on the resistance to phenylacrylic acids and lignocellulose hydrolysates under aerobic and oxygen-limited conditions. Appl Microbiol Biot 57: 167-174. doi: 10.1007/s002530100742
    [89] Corbisier P, van der Lelie D, Borremans B, et al. (1999) Whole cell-and protein-based biosensors for the detection of bioavailable heavy metals in environmental samples. Anal Chim Acta 387: 235-244. doi: 10.1016/S0003-2670(98)00725-9
    [90] Mitchell RJ, Hong HN, Gu MB (2006) Induction of kanamycin resistance gene of plasmid pUCD615 by benzoic acid and phenols. J Microbiol Biotechn 16: 1175.
    [91] Mitchell RJ, Gu MB (2005) Construction and evaluation of nagR-nagAa:: lux fusion strains in biosensing for salicylic acid derivatives. Appl Biochem Biotech 120: 183-197. doi: 10.1385/ABAB:120:3:183
    [92] Monnappa AK, Lee S, Mitchell RJ (2013) Sensing of plant hydrolysate-related phenolics with an aaeXAB:: luxCDABE bioreporter strain of Escherichia coli. Bioresource Technol 127: 429-434. doi: 10.1016/j.biortech.2012.09.086
    [93] Lee S, Mitchell RJ (2012) Detection of toxic lignin hydrolysate-related compounds using an inaA:: luxCDABE fusion strain. J Biotechnol 157: 598-604. doi: 10.1016/j.jbiotec.2011.06.018
    [94] Monnappa AK, Lee JH, Mitchell RJ (2013) Detection of furfural and 5-hydroxymethylfurfural with a yhcN:: luxCDABE bioreporter strain. Int J Hydrogen Energ 38: 15738-15743. doi: 10.1016/j.ijhydene.2013.05.037
    [95] Prevot AR, Fischer G, Bizzini B, et al. (1954) [Studies on ligninolytic bacteria]. C R Hebd Seances Acad Sci 238: 743-745.
    [96] Raynaud M, Bizzini B, Fischer G, et al. (1955) [Studies on ligninolytic bacteria; first part]. Ann Inst Pasteur (Paris) 88: 454-465; contd.
    [97] Fischer G, Bizzini B, Raynaud M, et al. (1955) [Ligninolytic bacteria. II. Characters of ligninolytic bacteria isolated from the soil]. Ann Inst Pasteur (Paris) 88: 618-624.
    [98] Bandounas L, Wierckx NJP, de Winde JH, et al. (2011) Isolation and characterization of novel bacterial strains exhibiting ligninolytic potential. BMC Biotechnol 11: 94. doi: 10.1186/1472-6750-11-94
    [99] Raj A, Kumar S, Haq I, et al. (2014) Bioremediation and toxicity reduction in pulp and paper mill effluent by newly isolated ligninolytic Paenibacillus sp. Ecol Eng 71: 355-362. doi: 10.1016/j.ecoleng.2014.07.002
    [100] Hooda R, Bhardwaj NK, Singh P (2015) Screening and Identification of Ligninolytic Bacteria for the Treatment of Pulp and Paper Mill Effluent. Water Air Soil Poll 226: 305. doi: 10.1007/s11270-015-2535-y
    [101] Sahoo DK, Gupta R (2005) Evaluation of ligninolytic microorganisms for efficient decolorization of a small pulp and paper mill effluent. Process Biochem 40: 1573-1578. doi: 10.1016/j.procbio.2004.05.013
    [102] Chandra R, Singh S, Krishna Reddy MM, et al. (2008) Isolation and characterization of bacterial strains Paenibacillus sp. and Bacillus sp. for kraft lignin decolorization from pulp paper mill waste. J Gen Appl Microbiol 54: 399-407.
    [103] Chandra R, Bharagava RN (2013) Bacterial degradation of synthetic and kraft lignin by axenic and mixed culture and their metabolic products. J Environ Biol 34: 991-999.
    [104] Copley SD, Rokicki J, Turner P, et al. (2012) The whole genome sequence of Sphingobium chlorophenolicum L-1: insights into the evolution of the pentachlorophenol degradation pathway. Genome Biol Evol 4: 184-198. doi: 10.1093/gbe/evr137
    [105] Deng Y, Fong SS (2011) Metabolic engineering of Thermobifida fusca for direct aerobic bioconversion of untreated lignocellulosic biomass to 1-propanol. Metab Eng 13: 570-577. doi: 10.1016/j.ymben.2011.06.007
    [106] Chen CY, Huang YC, Wei CM, et al. (2013) Properties of the newly isolated extracellular thermo-alkali-stable laccase from thermophilic actinomycetes, Thermobifida fusca and its application in dye intermediates oxidation. AMB Express 3: 49. doi: 10.1186/2191-0855-3-49
    [107] Tian JH, Pourcher AM, Bouchez T, et al. (2014) Occurrence of lignin degradation genotypes and phenotypes among prokaryotes. Appl Microbiol Biotechnol 98: 9527-9544. doi: 10.1007/s00253-014-6142-4
    [108] Vanden Wymelenberg A, Gaskell J, Mozuch M, et al. (2009) Transcriptome and secretome analyses of Phanerochaete chrysosporium reveal complex patterns of gene expression. Appl Environ Microbiol 75: 4058-4068. doi: 10.1128/AEM.00314-09
    [109] Wong DW (2009) Structure and action mechanism of ligninolytic enzymes. Appl Biochem Biotechnol 157: 174-209. doi: 10.1007/s12010-008-8279-z
    [110] Gaskell J, Marty A, Mozuch M, et al. (2014) Influence of Populus genotype on gene expression by the wood decay fungus Phanerochaete chrysosporium. Appl Environ Microbiol 80: 5828-5835. doi: 10.1128/AEM.01604-14
    [111] Hori C, Ishida T, Igarashi K, et al. (2014) Analysis of the Phlebiopsis gigantea genome, transcriptome and secretome provides insight into its pioneer colonization strategies of wood. PLoS Genet 10: e1004759. doi: 10.1371/journal.pgen.1004759
    [112] MacDonald J, Doering M, Canam T, et al. (2011) Transcriptomic responses of the softwood-degrading white-rot fungus Phanerochaete carnosa during growth on coniferous and deciduous wood. Appl Environ Microbiol 77: 3211-3218. doi: 10.1128/AEM.02490-10
    [113] Macdonald J, Master ER (2012) Time-dependent profiles of transcripts encoding lignocellulose-modifying enzymes of the white rot fungus Phanerochaete carnosa grown on multiple wood substrates. Appl Environ Microbiol 78: 1596-1600. doi: 10.1128/AEM.06511-11
    [114] Garbelotto M, Guglielmo F, Mascheretti S, et al. (2013) Population genetic analyses provide insights on the introduction pathway and spread patterns of the North American forest pathogen Heterobasidion irregulare in Italy. Mol Ecol 22: 4855-4869. doi: 10.1111/mec.12452
    [115] Zubieta C, Krishna SS, Kapoor M, et al. (2007) Crystal structures of two novel dye-decolorizing peroxidases reveal a beta-barrel fold with a conserved heme-binding motif. Proteins 69: 223-233. doi: 10.1002/prot.21550
    [116] Chen F, Dixon RA (2007) Lignin modification improves fermentable sugar yields for biofuel production. Nat Biotechnol 25: 759-761. doi: 10.1038/nbt1316
    [117] Van Acker R, Vanholme R, Storme V, et al. (2013) Lignin biosynthesis perturbations affect secondary cell wall composition and saccharification yield in Arabidopsis thaliana. Biotechnol Biofuels 6: 46. doi: 10.1186/1754-6834-6-46
    [118] Pestana-Calsa MC, Pacheco CM, de Castro RC, et al. (2012) Cell wall, lignin and fatty acid-related transcriptome in soybean: Achieving gene expression patterns for bioenergy legume. Genet Mol Biol 35: 322-330. doi: 10.1590/S1415-47572012000200013
    [119] Vicentini R, Bottcher A, Brito Mdos S, et al. (2015) Large-Scale Transcriptome Analysis of Two Sugarcane Genotypes Contrasting for Lignin Content. PLoS ONE 10: e0134909. doi: 10.1371/journal.pone.0134909
    [120] Van Acker R, Leple JC, Aerts D, et al. (2014) Improved saccharification and ethanol yield from field-grown transgenic poplar deficient in cinnamoyl-CoA reductase. Proc Natl Acad Sci U S A 111: 845-850. doi: 10.1073/pnas.1321673111
    [121] Weber AP, Weber KL, Carr K, et al. (2007) Sampling the Arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiol 144: 32-42. doi: 10.1104/pp.107.096677
    [122] Cheung F, Haas BJ, Goldberg SM, et al. (2006) Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics 7: 272. doi: 10.1186/1471-2164-7-272
    [123] Barbazuk WB, Emrich SJ, Chen HD, et al. (2007) SNP discovery via 454 transcriptome sequencing. Plant J 51: 910-918. doi: 10.1111/j.1365-313X.2007.03193.x
    [124] Wong MM, Cannon CH, Wickneswari R (2011) Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing. BMC Genomics 12: 342. doi: 10.1186/1471-2164-12-342
  • © 2015 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)