1. Introduction
Natural hazards such as landslides and subsidence have been recognized as important impediments to developing nations' sustainable development [1,2,3,4]. For example, on December 20, 2015, in Guangming, Shenzhen, China, a catastrophic landfill slope landslide claimed the lives of 69 individuals [5]. Natural hazard risk assessment and management will have short-term benefits in terms of severity reduction and long-term benefits in terms of achieving sustainable development goals [1].
Slope stability evaluation is essential for analyzing and mitigating natural hazards in mountainous environments. Many attempts have been made to evaluate slope stability [6,7,8]. Owing to the inherent complexity and uncertainty, assessing slope stability for circular mode failure, a common problem, remains a challenge for practitioners and researchers [9]. Several methods for evaluating slope stability have been presented, with the limit equilibrium approach and numerical simulation method being the two most commonly used [10]. Limit equilibrium methods, such as the simplified Bishop, Spencer and Morgenstern-Price methods, are frequently implemented in practice. In general, soil material properties (unit weight, cohesion and friction angle) and the pore pressure ratio are required for limit equilibrium methods [11,12]. Numerical methods (e.g., finite element methods) have also been extensively used to analyze slope stability. However, their major drawback is that their input parameters need to be back-analyzed using in-situ measurements, which are not available in many cases [13]. Both methods have pros and cons. Finding the critical slip surface using the limit equilibrium method is difficult due to the large number of potential slip surfaces [14]. The accuracy of the numerical simulation method is greatly influenced by the choice of constitutive models, mechanical parameters and boundary conditions, and a great deal of engineering expertise and on-site back analysis is frequently needed to make a reasonable choice and obtain reasonable results [15,16]. Consequently, predicting slope stability still presents considerable challenges.
In recent years, machine learning (ML) models have gained attention for solving complex, nonlinear and multivariable geotechnical problems [17,18,19]. Assessments of slope stability for circular failure using soft computing methods are summarized in Table 1. Despite their reliable and precise outputs, however, most algorithms are not readily applicable in practice owing to their complicated training and modeling procedures and their "black box" nature, i.e., these models do not provide a transparent and understandable relationship between inputs and output. Quinlan [20] developed the model tree algorithm to overcome these limitations; it integrates principles from decision trees and linear regression. In addition, despite the widespread application of soft computing techniques, several studies have been conducted on only a limited amount of data, which might restrict the classifier's ability to generalize. In the current study, an updated database of 627 cases comprising unit weight, cohesion, internal friction angle, slope angle and height, pore pressure ratio and stability status of circular mode failure has been compiled. To predict slope stability, a new logistic model tree (LMT) model is developed. This algorithm is well suited to classification and decision-making since it solves the classification problem by combining a tree model with a logistic regression (LR) technique. Placing LR functions at the leaves of the tree allows for a probabilistic interpretation of the model's output, while the tree itself remains easy to explain because its branches represent a series of if-then-else rules. The LMT has been employed in predicting pillar stability in geotechnical engineering [21], but it has not yet been used to predict slope stability.
2. Materials and methods
2.1. LMT
LR is a straightforward method with features such as stability, low variance and time-efficient training [37], but its prediction outputs are frequently biased. Decision trees are another ML technique; they search a less restricted space of candidate models and can capture nonlinear patterns in a database. They have low bias but high variance and instability, making them susceptible to overfitting. To combine the strengths of both, Landwehr et al. [38] presented the LMT methodology. It is based on Quinlan's model tree approach [20], which deals with regression problems by combining linear regression with decision tree models, and extends that approach to classification problems. This section provides a basic introduction to LMT, whereas the seminal work by Landwehr et al. [38] provides a more complete description.
2.2. Tree structure
An LMT consists of a standard decision tree structure with LR functions built at the leaves. It has a set of inner (non-terminal) nodes N and a set of leaves (terminal nodes) T. Each leaf t of the LMT model has an associated LR function instead of a class label or a linear regression function. As an example, let the output be Y and the input vector be X = (X1, X2). S represents the complete instance space, which is partitioned into disjoint subspaces St, one per leaf. Figure 1 displays a simple input space that has been partitioned into seven subspaces.
The model determines LR functions for the seven subspaces represented in Figure 1. Figure 2 depicts the structure of the tree.
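The tree structure described above can be sketched in a few lines of Python. This is a minimal illustration only: the class names `Split` and `Leaf`, the function `find_leaf` and the thresholds are hypothetical and are not taken from Figure 1, Figure 2 or WEKA.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    name: str  # label of the logistic model stored at this leaf, e.g. "LM1"


@dataclass
class Split:
    feature: str                  # input tested at this inner node
    threshold: float              # values <= threshold go to the left child
    left: Union["Split", Leaf]
    right: Union["Split", Leaf]


def find_leaf(node: Union[Split, Leaf], x: Dict[str, float]) -> Leaf:
    """Follow the if-then-else splitting rules from the root down to a leaf."""
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node


# Hypothetical two-input tree with three leaves (thresholds are made up).
tree = Split("X1", 0.5,
             Leaf("LM1"),
             Split("X2", 2.0, Leaf("LM2"), Leaf("LM3")))

print(find_leaf(tree, {"X1": 0.7, "X2": 1.5}).name)  # LM2
```

Each leaf reached this way identifies one subspace St and the LR function that applies inside it.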
2.3. Logistic function
In contrast to standard forms of LR, the LogitBoost technique for fitting additive LR models proposed by Friedman et al. [39] is employed for model construction here. The prediction probability is given by Eq (2).
where G denotes the output, J denotes the class labels, X denotes the inputs, and Fj(x) denotes the functions that the LMT will train in the tree's leaves, as follows:
where m is the number of iterations, fmj represents the functions of input variables, α represents the intercepts and coefficients of the linear function, and S represents the variables of the subset St at the leaf t.
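For orientation, the additive LR model fitted by LogitBoost is commonly written as below, following Friedman et al. [39] and Landwehr et al. [38]. The notation is an assumption chosen to match the symbol definitions above, with J taken as the number of classes, M as the number of iterations and V_t as the set of variables used at leaf t; it may differ in detail from Eq (2) and Eq (3).

```latex
\Pr(G = j \mid X = x) = \frac{e^{F_j(x)}}{\sum_{k=1}^{J} e^{F_k(x)}},
\qquad \sum_{k=1}^{J} F_k(x) = 0,

F_j(x) = \sum_{m=1}^{M} f_{mj}(x),
\qquad
F_j(x) = \alpha_0^{j} + \sum_{v \in V_t} \alpha_v^{j}\, x_v \quad \text{at leaf } t .
```

Because each f_{mj} is linear in the inputs, their sum collapses to a single linear function per class at each leaf, which is the form tabulated for the leaf models.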
2.4. LMT training
An LMT can be established by the following steps: initial tree growth, tree splitting and stopping, and tree pruning. This section presents the basic idea; the reader is referred to Landwehr et al. [38] for additional detail. The M5P technique, which is commonly used for tree growth, first constructs a standard tree, after which an LR model can be established at each node [40,41]. That technique trains the model at each node in isolation, using only the case histories at that node and without taking the surrounding tree structure into account. Therefore, another technique is used here, one that incrementally refines the logistic models fitted at higher levels of the tree, so that the LogitBoost algorithm can iteratively update Fj(x) to improve the fit in a natural way [38]. In each iteration, a function fmj is added to Fj by changing one of the function's coefficients or introducing another variable (see Eq (3)). In the initial growing process, an LR model is thus built at the root using a suitable number of iterations. The tree then grows by assigning subsets (St) of the database (S) to the child nodes, utilizing the C4.5 splitting criterion [20] to increase the accuracy of the classification variable. The LR functions at the child nodes are generated by continuing the LogitBoost algorithm from the logistic model, weights and probability estimates of the last iteration at the parent node. The splitting process is then repeated. A node is not split further when it contains fewer than 15 cases. After the tree is constructed, pruning is used to trade off tree size and model complexity while maintaining predictive accuracy. After experimenting with several pruning schemes, Landwehr et al. [38] employed the classification and regression trees pruning approach [42] to make pruning decisions while taking training error and model complexity into account. These three processes together create an LMT.
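The iterative refinement of Fj(x) can be illustrated with a simplified two-class LogitBoost loop in which each iteration fits a weighted simple regression on the single best attribute. This is a sketch under stated assumptions, not the WEKA implementation: it omits tree growth, pruning and the cross-validated choice of iteration number, and the names `weighted_fit`, `score`, `prob`, `logitboost` and the cap `Z_MAX` are introduced here for illustration.

```python
import math

Z_MAX = 4.0  # cap on the working response; an assumed safeguard value


def weighted_fit(xs, zs, ws):
    """Weighted least squares for z ~ a + b*x; returns (a, b, weighted SSE)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    mz = sum(w * z for w, z in zip(ws, zs)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxz = sum(w * (x - mx) * (z - mz) for w, x, z in zip(ws, xs, zs))
    b = sxz / sxx if sxx > 0 else 0.0
    a = mz - b * mx
    sse = sum(w * (z - a - b * x) ** 2 for w, x, z in zip(ws, xs, zs))
    return a, b, sse


def score(intercept, coefs, row):
    """Linear function F(x) = intercept + sum of coefficient * input."""
    return intercept + sum(c * v for c, v in zip(coefs, row))


def prob(f):
    """Two-class LogitBoost link: p = e^F / (e^F + e^-F)."""
    return 1.0 / (1.0 + math.exp(-2.0 * f))


def logitboost(X, y, iterations=20):
    """Each iteration updates the intercept and exactly one coefficient."""
    intercept, coefs = 0.0, [0.0] * len(X[0])
    for _ in range(iterations):
        p = [min(max(prob(score(intercept, coefs, r)), 1e-6), 1 - 1e-6) for r in X]
        w = [pi * (1 - pi) for pi in p]                      # instance weights
        z = [max(-Z_MAX, min(Z_MAX, (yi - pi) / wi))         # working responses
             for yi, pi, wi in zip(y, p, w)]
        fits = [weighted_fit([r[j] for r in X], z, w) for j in range(len(coefs))]
        j = min(range(len(fits)), key=lambda k: fits[k][2])  # best attribute
        a, b, _ = fits[j]
        intercept += 0.5 * a
        coefs[j] += 0.5 * b
    return intercept, coefs


X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]  # toy single-feature data
y = [0, 0, 0, 1, 1, 1]                          # toy labels (1 = stable)
b0, b = logitboost(X, y, iterations=10)
print([round(prob(score(b0, b, r)), 2) for r in X])
```

Because only one coefficient changes per iteration, attributes that never give the best fit keep a zero coefficient, which is one way a leaf function can end up without some of the inputs.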
2.5. Data catalog
An updated database with 627 instances was obtained from previous studies [22,23,25,29,33,43] and can be found in Table S1 of the supplementary information file. The database includes the unit weight, cohesion, angle of internal friction, slope angle and height, pore pressure ratio and slope stability status. There are 311 positive (stable) and 316 negative (failed) samples. The statistics of the input features are summarized in Table 2. Data normalization was not carried out because tree-based methods are insensitive to feature scaling; they make decisions based on relative feature values and splits [44]. The box plot of the database is shown in Figure 3, where solid black spots represent "outliers". The bottom and top quartiles are shown by horizontal lines, while the median values are represented by bold lines inside the boxes. Slopes with "failed" and "stable" instances are also shown separately.
2.6. Performance metrics
The classification metrics include accuracy evaluation indices (accuracy (Acc), Matthews correlation coefficient (Mcc), precision (Prec), recall (Rec) and F-score) that are calculated from the confusion matrix (see Figure 4).
Each row of the matrix represents the instances in an actual class, while each column represents the instances in a predicted class [45]. In predictive analytics, a confusion matrix is described as a table with two rows and two columns that provides the numbers of true positives (TPs), true negatives (TNs), false positives (FPs) and false negatives (FNs). The classification evaluation metrics derived from the confusion matrix results were used to compare the prediction performances of the models [46,47,48]:
Acc is the sum of TP and TN divided by the total number of instances and represents the percentage of correctly classified instances in the data as a whole. This metric measures the model's overall prediction accuracy. If the data set is unbalanced, that is, the numbers of observations in different classes vary substantially, accuracy can be misleading [49]. As a result, further assessment metrics such as precision, recall, F-score and Mcc were used to examine the model's performance. Precision is also known as the positive predictive value, and recall is also known as the true positive rate (TPR). F-score is a combined index that balances recall and precision and ranges from 0 (worst value) to 1 (best value). Mcc denotes the degree of agreement between observed and predicted classes of failed and stable instances [48]. It is a standard metric used by statisticians that takes values ranging from −1 to 1. An Mcc value of −1 indicates complete disagreement (strong negative association), a value of 1 indicates complete agreement (strong positive association), and a value of 0 indicates that the prediction is unrelated to the ground truth (very weak or no correlation between predicted and observed classes). Additionally, another succinct metric, the area under the receiver operating characteristic (ROC) curve (AUC), was employed. The ROC curve is a graphical plot that shows a binary classification system's diagnostic capacity as its discrimination threshold changes. It is plotted by comparing the TPR against the false positive rate (FPR) at various threshold levels. An acceptable classification model should have an AUC close to 1. Table 3 shows the main rule for defining discrimination based on AUC value [25].
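The metrics described above follow their standard textbook definitions and can be computed directly from the four confusion-matrix counts, as in the sketch below. The function names are introduced here for illustration, and the counts in the usage line are illustrative values chosen so that the class totals match the 62 stable and 63 failed test instances; they are not copied from Table 5. The sketch does not guard against empty classes (zero denominators).

```python
import math


def confusion_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, F-score and Mcc for the positive class."""
    total = tp + fn + fp + tn
    acc = (tp + tn) / total
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)                      # also the true positive rate
    f_score = 2 * prec * rec / (prec + rec)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {"acc": acc, "prec": prec, "rec": rec, "f": f_score, "mcc": mcc}


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative counts with "stable" as the positive class.
metrics = confusion_metrics(tp=51, fn=11, fp=7, tn=56)
print({name: round(value, 3) for name, value in metrics.items()})
```

Swapping the roles of the two classes (treating "failed" as positive) gives the per-class precision, recall and F-score for the failed state, while Acc and Mcc are unchanged.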
3. Results and discussion
3.1. Development of LMT models for slope stability prediction
Waikato Environment for Knowledge Analysis (WEKA) software was used to develop a model based on the acquired data set. WEKA is a collection of ML algorithms that supports data mining tasks by providing a wide range of tools for data pre-processing, classification, clustering, regression, association and visualization [50]. The database was divided into two parts: training (80%) and testing (20%). The training set contained 502 instances (249 stable and 253 failed), while the test set contained 125 instances (62 stable and 63 failed). The stable-to-failed ratios in the training and testing sets were close to one, indicating that the class distribution does not necessitate a cost-sensitive technique to address class imbalance [51,52].
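The text does not state how the 80/20 partition was drawn. As an illustration only, the sketch below assumes a stratified random split (20% of each class goes to the test set); with 311 stable and 316 failed cases this yields the same class counts as reported above. The function name `stratified_split` and the seed are hypothetical.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(test_fraction * len(indices))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


labels = ["stable"] * 311 + ["failed"] * 316
train_idx, test_idx = stratified_split(labels)
print(Counter(labels[i] for i in train_idx))
print(Counter(labels[i] for i in test_idx))
```

Stratifying keeps the stable-to-failed ratio nearly identical in both subsets, which is why no cost-sensitive correction is needed afterwards.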
During training, an LR model was fitted and updated at every terminal node (leaf) of the tree (see Section 2). The minimum number of instances at which a node can be split was set to 15 based on predictive performance, a readily applicable tree structure and the total amount of training data. The resulting tree had a size of 45 nodes, 23 of which were leaves. LogitBoost was run with a weight trimming value of 0.2 and the number of iterations set to −1 (in WEKA, a negative value means that the number of iterations is determined by cross-validation rather than fixed). Figure 5 depicts the tree generated by LMT. The LMT model has 23 logistic model functions (LMs), one per leaf, and their detailed expressions are shown in Table 4. It is important to note that some of the functions in Table 4 do not incorporate all of the selected parameters. The LM1 function for stable slopes in Table 4, for example, does not include unit weight (γ). This is because the simple logistic technique is used in the LMT training phase [38]. The goal of the simple logistic method is to keep the number of parameters as small as possible: new parameters are introduced gradually during training in order to improve the performance of each function at each node of the tree (see Section 2). This can also help to avoid the issue of model significance in LR, especially when using multiple logistic functions to build a full logistic model with all parameters. However, only a few of the functions had fewer parameters than those selected, showing that most of the chosen factors have an influence on predictive performance.
3.2. Model performance evaluation
The performance of the proposed LMT model, which predicts two classes (stable and failed), was quantified using several metrics based on confusion matrices. The confusion matrices for the training and test sets are shown in Table 5.
The values along the main diagonal of each matrix indicate the number of correctly predicted cases. The results in Table 5 show that the LMT correctly classified the majority of the cases.
The prediction outcomes of classification problems can be assessed using a variety of metrics, such as Acc, Mcc, AUC, recall, precision and F-score. The LMT model performed well in both the training set (Acc = 92.23%, Mcc = 0.846, AUC = 0.974, F-score for stable state = 0.924 and F-score for failed state = 0.921) and testing set (Acc = 85.60%, Mcc = 0.713, AUC = 0.907, F-score for stable state = 0.850 and F-score for failed state = 0.862). The employed metrics show that the prediction's results are accurate and acceptable.
The LMT model achieved better prediction performance on the test data (Acc = 0.856 and AUC = 0.907) than the SVM model (Acc = 0.812 and AUC = 0.796), SGD model (Acc = 0.640 and AUC = 0.688), QDA model (Acc = 0.788 and AUC = 0.817), GNB model (Acc = 0.812 and AUC = 0.775), DT model (Acc = 0.788 and AUC = 0.829) and RT model (Acc = 0.788 and AUC = 0.904) reported by Pham et al. [25]. When using ML techniques to predict slope stability, Acc generally ranges from 0.640 to 0.812 according to the results of the previous study by Pham et al. [25], whereas in the present study it is 0.856 for the test data set. However, because different data sets were used, this comparison is only indicative, and a study that uses multiple data sets is needed to obtain a generalized model for predicting slope stability. Additionally, results for relatively small sample sizes (fewer than 100) are not presented or compared. With these caveats, the comparative results suggest that the LMT model is capable of improved generalization performance compared to other models in the literature. Potential reasons for this observation are that the LMT model can capture interactions between features effectively: the branching structure of the decision tree can identify feature interactions, and LR at the leaves can model the relationships between these interactions and the target variable.
4. Case studies
Two case studies are analyzed using the proposed LMT-based slope stability prediction model to determine its efficacy and applicability in engineering practice. The LMT model was used to predict the stability status of ten slopes from two different projects. The field data were obtained from the available literature and cover the Shao Jiazhuang slope failure [53] and the Daguangbao landslide [54].
The slope at Shao Jiazhuang village in Guizhou province, China (see Figure 6(a)) [53], is used as a real-world example to analyze slope stability with the proposed LMT model and to validate its efficacy and practicality in engineering practice. As depicted in Figure 6(c)–(e), during the survey, there were localized shallow surface collapses and new cracks on the slope's left side [53].
The input data shown in Table 6, which comprise the material parameters and geometric parameters of the case slope, were used by the LMT to predict its stability. The model predicts that the slope has failed, which is consistent with the collapses and cracks observed in the field. This case suggests that the LMT model is reliable for predicting slope stability and that it has potential for a variety of geotechnical applications.
The Daguangbao landslide is one of the few extremely large landslides known to exist worldwide, with a volume of over 1 billion m³ [55]. It is also the most extensive and largest landslide ever recorded in Chinese historical records. Because of its massive volume, unusual genetic mechanism and complex movement process [56,57,58], the Daguangbao landslide has drawn a lot of attention and interest.
The stability evaluation is carried out on 9 slopes from the Daguangbao landslide whose status has been confirmed by specialists; 5 of them are stable and 4 have failed [54]. The index values of the samples and the predicted results of the LMT model are compared with those of the fuzzy discriminant method and the unascertained measure method in Table 7.
The prediction outcomes in Table 7 indicate that the slope stability was predicted correctly for all cases. The theory of unascertained measure and its mathematical processing were first put forward by Wang [59] in 1990. On this basis, Liu et al. [60,61] established the unascertained mathematical theory and applied it to decision-making problems. Unascertained information is a kind of uncertainty information that differs from fuzzy information, random information and grey information. Compared with other evaluation methods, the unascertained measure method has the advantages of non-negativity, normalization and additivity, which also ensure the order of the evaluation space. The fuzzy discriminant method [62] is a decision classification approach that constructs a numerical tabular knowledge base from historical cases and derives inferences from particular case histories using discrimination and connectivity analyses based on the theory of fuzzy relations. For further details regarding these methods, readers may refer to [54,59,60,61,62].
5. Risk analysis
To demonstrate the use of the proposed LMT model for risk analysis, we use the case history data in Table 6. Typical input data for the Shao Jiazhuang, China, slope failure area are: γ = 20 kN/m³, c = 22.4 kPa, ϕ = 28°, β = 39.47°, H = 14 m, and ru = 0. Using the tree structure from Figure 5, we select the appropriate functions to compute the probability of failure (PoF). To make a prediction for a new input instance, start at the root node of the LMT model (e.g., by checking the value of c, which is 22.4 kPa in this case) and follow the path through the tree according to the splitting rules until a leaf node is reached (for this instance, LM8). The function values (FS and FF) are first calculated using the LM8 equations in Table 4. Then, using Eq (2), the probability of slope stability is given by Eq (11):
Finally, we obtain PS = 1% and PF = 99%, implying that this instance has a PoF of 99%. This agrees with the observed failure of the slope. Such probability results can then be incorporated into risk analysis together with the associated failure cost estimate.
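The last two steps, converting the leaf function values into class probabilities and combining the PoF with a cost estimate, can be sketched as follows. The function values and the failure cost are hypothetical placeholders (the actual LM8 coefficients are in Table 4 and are not reproduced here), and treating risk as PoF multiplied by cost is an assumed, commonly used definition rather than one stated in the text.

```python
import math


def class_probabilities(f_stable, f_failed):
    """Two-class form of the logistic link: exp(F_j) / sum_k exp(F_k)."""
    m = max(f_stable, f_failed)              # subtract the max for stability
    e_s = math.exp(f_stable - m)
    e_f = math.exp(f_failed - m)
    return e_s / (e_s + e_f), e_f / (e_s + e_f)


def expected_loss(p_failure, failure_cost):
    """Risk as probability of failure times consequence (assumed definition)."""
    return p_failure * failure_cost


# Hypothetical leaf outputs F_S and F_F, not the values computed from LM8.
p_s, p_f = class_probabilities(-2.3, 2.3)
print(round(p_s, 2), round(p_f, 2))

# Hypothetical failure cost in arbitrary monetary units.
print(expected_loss(p_f, failure_cost=1_000_000))
```

With probabilities available per slope, candidate mitigation measures can be ranked by how much they reduce the expected loss relative to their own cost.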
6. Conclusions
For the assessment of slope stability, a novel application of LMT is proposed. The tree structure and corresponding functions are used to assess slope stability, given information on several features: slope height (H), slope angle (β), cohesion (c), internal friction angle (ϕ), unit weight (γ) and pore pressure ratio (ru). The LMT is learned with LogitBoost from a larger database obtained from the literature. A testing set was used to validate the trained LMT model, and the results show that the LMT model can effectively predict slope stability. Furthermore, real-world application to new cases, which had not previously been used for training, was used to validate the proposed LMT model. The comparative study and engineering application results show that the LMT model achieved higher test accuracy and AUC than the models from the literature with which it was compared. The results also indicate that the LMT technique can provide useful information regarding the probability of slope failure, allowing it to be used for risk analysis of slope stability. Finally, the main advantages of LMT over other soft computing models commonly used for slope stability prediction are that it can be trained easily (even with more input parameters) and that its tree structure, with an LR function in each leaf, explicitly demonstrates the relationship between the inputs and predictive output. The LMT also has the potential to be used to solve other geotechnical problems in the future due to its intuitive features and ease of implementation. For future work, it is recommended to consider the tangents (i.e., tan ϕ and tan β) as inputs, which could improve performance.
Use of AI tools declaration
The authors declare they have not used artificial intelligence (AI) tools in the creation of this article.
Acknowledgments
This work was supported by the National Key Research and Development Plan of China under Grant No. 2021YFB2600703. The authors also wish to thank CEU San Pablo University Foundation for the funds dedicated to the ARIE Research Group, through Project Ref. EC01/0720-MGI23RGL, provided by the CEU San Pablo University.
Conflict of interest
The authors declare there is no conflict of interest.