
In order to grasp the degradation of rolling bearings and prevent the failure of mechanical equipment, a remaining useful life (RUL) prediction method of rolling bearings based on degradation detection and deep bidirectional long short-term memory networks (BiLSTM) was proposed, considering the incomplete degradation feature extraction and low prediction accuracy of existing methods. By extracting the characteristics of time domain, frequency domain, and time-frequency domain of the full-life bearing vibration signal, the monotonicity, trend, and robustness measurement indexes of each feature were calculated. The best feature set that can fully reflect the degradation information was constructed by ranking the weighted comprehensive indexes of the features. A degradation detection strategy was used to determine the degradation starting time for setting piecewise linear network label. The RUL prediction model based on deep BiLSTM was established and optimized through Dropout technology and piecewise learning rate. The model was verified by the full-life data set of rolling bearings. The results showed that compared with the support vector machine (SVM), the traditional recurrent neural network (RNN), the single-layer BiLSTM, and long short-term memory networks (LSTM) model without Dropout, the proposed method fitted the degradation trend best, and the root mean square error (RMSE) was the smallest and only 0.0281, which improved the accuracy of RUL prediction of rolling bearings, helped prevent bearing failure, and ensured the safe and reliable operation of rotating machinery.
Citation: Shuang Cai, Jiwang Zhang, Cong Li, Zequn He, Zhimin Wang. A RUL prediction method of rolling bearings based on degradation detection and deep BiLSTM[J]. Electronic Research Archive, 2024, 32(5): 3145-3161. doi: 10.3934/era.2024144
Rolling bearings often operate in complex working conditions and harsh environments. Due to overload, poor lubrication, fatigue, corrosion, and many other internal and external causes, the performance of rolling bearings inevitably deteriorates over time, and this directly determines the performance and reliability of modern equipment. Once a bearing suffers wear, fracture, or other problems, it affects the normal operation of the entire equipment, can cause equipment damage and huge property losses, and may even lead to serious casualties.
Prognostics and health management (PHM) of mechanical equipment takes advantage of advanced sensing technology to realize real-time perception of equipment (such as vibration, temperature, force, etc.). Industrial big data (operation information, maintenance history, and usage plans, etc.) and various algorithms and models (such as signal processing, failure models, machine learning, expert systems, etc.) are utilized in PHM to identify equipment operating states, detect early failures, assess fault degree, reveal degradation laws, and predict future states and remaining useful life (RUL). Furthermore, the maintenance cost and spare parts inventory information are used in PHM to change the control strategy or adjust the production plan to extend the service life of the equipment, so as to realize the adaptive fault-tolerant control of the equipment, improve the efficiency of resource management, and optimize the operation and maintenance strategy. As the key step of PHM, RUL prediction plays a connecting role. It is not only a comprehensive summary of equipment monitoring data, running condition and remaining life, but also the basis for future maintenance decisions. Therefore, monitoring and predicting the RUL of equipment can help timely implement real-time fault-tolerant operations, thereby minimizing performance degradation and avoiding dangerous situations. It is of great significance to effectively ensure the reliability and safety of mechanical equipment. Rolling bearings are one of the most basic and failure-prone parts in mechanical equipment. The RUL prediction of rolling bearings can reveal the deterioration trend of the equipment, which can help to formulate reasonable maintenance decisions and reduce the safety risk of mechanical equipment [1].
The commonly used RUL prediction methods fall into two main groups: model-driven methods and data-driven methods. Model-driven methods describe the degradation process of machinery by establishing mathematical models based on equipment fault mechanisms or failure principles [2]. Zhao et al. [3] developed a Paris model based gear RUL prediction method, combined with a finite element model and a Bayesian method. Deng et al. [4] used the Paris-Erdogan crack growth model to express the degradation state of bearings, combined with auxiliary particle filtering methods for bearing RUL prediction. However, model-driven methods require mathematical modeling and analysis of the equipment degradation process; the modeling accuracy is affected by various factors, and it is difficult to model actual complex processes [5].
With the continuous promotion of machine learning, data-driven methods have been widely used in the field of RUL prediction. Such methods dig deep into the operating state information of equipment through historical data, which has significant advantages when the working principle of equipment is unclear and it is difficult to accurately establish complex models [6]. Rai et al. [7] proposed a bearing RUL estimation method based on nonlinear autoregressive neural network and wavelet filtering technique. Liu et al. [8] adopted support vector machine to predict the RUL of rolling bearings based on health state assessment. The above shallow machine learning methods [9] are simple in structure and have some prediction accuracy, but suffer from too many adjustable parameters, slow training speed, and easy overfitting, thus limiting the generalizability of the methods.
In recent years, deep learning methods have been widely used in the field of mechanical equipment fault diagnosis and RUL prediction [10]. Among them, long short-term memory networks (LSTM) have unique advantages in dealing with timing problems due to their long-term memory capability. Zhang et al. [11] proposed a LSTM-based model to assess bearing performance degradation. A new indicator of waveform entropy is proposed as the LSTM input for predicting the RUL of the bearings. Chang et al. [12] combined Fourier transform and principal component analysis methods to fuse rolling bearing vibration feature data, and optimized the LSTM model based on multilayer grid search algorithm to improve the accuracy of RUL prediction. However, LSTM only considers the forward transmission of information when processing degraded feature sequences, while bidirectional long short-term memory networks (BiLSTM) can take into account the backward feature information, thus comprehensively mine bearing degradation features [13].
Therefore, a rolling bearing RUL prediction method based on degradation detection and deep BiLSTM is proposed by comprehensively considering the diversity and sequence correlation of bearing degradation processes. The main contributions are summarized as follows:
● The time domain, frequency domain and time-frequency domain features are fully extracted from the full-life vibration signals of the bearings. A weighted comprehensive index considering the monotonicity, trend, and robustness of each feature is constructed to select the optimal features.
● A degradation detection strategy is adopted to determine the beginning time of bearing degradation for setting the segmented linear network labels.
● A deep BiLSTM model is established for rolling bearing RUL prediction. The model is optimized by Dropout technology and piecewise learning rate to improve the prediction accuracy, so as to prevent bearing failure and ensure the safe operation of equipment.
The remainder of the paper is organized as follows. The principles of optimal feature selection are introduced in Section 2. In Section 3, a RUL prediction method for rolling bearings is proposed based on a deep BiLSTM model and a degradation detection strategy. Section 4 verifies the effectiveness of the method using life-cycle experimental data of rolling bearings. Section 5 gives the conclusions and future work.
The accuracy and speed of the training network are both influenced by degradation features. Reasonable selection of degradation features can contribute to enhancing the prediction accuracy. Three measurement indexes, namely robustness, trend, and monotonicity, are utilized here to identify the most optimal degradation features [14].
Robustness refers to the resistance of feature parameters to influences such as measurement noise, the randomness of the degradation process, and variations in operational conditions. A robust degradation feature should exhibit stability when confronted with interference. The formula for measuring the robustness of each feature is presented in Eq (1):
$$\mathrm{Rob}(Z)=\frac{1}{n}\sum_{j=1}^{n}\exp\left(-\left|\frac{z_j-\bar{z}_j}{z_j}\right|\right) \tag{1}$$
where $\bar{z}_j$ represents the $j$th feature value of the degradation feature $Z$ after undergoing a smoothing process; $n$ represents the total number of values in the feature sequence.
Here, we employ the exponentially weighted moving average (EWMA) method for data smoothing, and the formula is presented in Eq (2):
$$\bar{z}_j=\beta\,\bar{z}_{j-1}+(1-\beta)\,z_j \tag{2}$$
where the coefficient $\beta$ indicates the speed of weight decay, and a smaller value leads to a faster weight decrease.
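As an illustration of Eqs (1) and (2), the robustness measure can be sketched in a few lines of Python. The smoothing coefficient of 0.9, the choice to seed the smoothed sequence with the first raw value, and the function names are assumptions made for this sketch; the paper does not state them.

```python
import math

def ewma(values, beta=0.9):
    """Exponentially weighted moving average, Eq (2).

    Assumption: the smoothed sequence is seeded with the first raw value.
    """
    smoothed = [values[0]]
    for z in values[1:]:
        smoothed.append(beta * smoothed[-1] + (1.0 - beta) * z)
    return smoothed

def robustness(values, beta=0.9):
    """Rob(Z) from Eq (1); assumes no feature value is exactly zero."""
    smoothed = ewma(values, beta)
    terms = [math.exp(-abs((z - s) / z)) for z, s in zip(values, smoothed)]
    return sum(terms) / len(terms)
```

A perfectly smooth feature scores 1, and the score falls toward 0 as the raw values scatter around their smoothed counterpart.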
The equipment degradation process is irreversible. Therefore, the characteristics sensitive to the degradation process should exhibit a monotonic degradation trend. However, mechanical equipment usually shows a nonlinear degradation trend. The Spearman correlation coefficient [15] can be used to measure the nonlinear correlation between two variables, which can reflect the degradation trend of each feature over time. The trend measurement for each feature is given in Eq (3); calculating it for the rolling bearing degradation features helps to select features suitable for RUL prediction.
$$\mathrm{Tre}(Z)=\frac{\left|n\left(\sum_{j=1}^{n}\tilde{z}_j\tilde{t}_j\right)-\left(\sum_{j=1}^{n}\tilde{z}_j\right)\left(\sum_{j=1}^{n}\tilde{t}_j\right)\right|}{\sqrt{\left[n\sum_{j=1}^{n}\tilde{z}_j^{2}-\left(\sum_{j=1}^{n}\tilde{z}_j\right)^{2}\right]\left[n\sum_{j=1}^{n}\tilde{t}_j^{2}-\left(\sum_{j=1}^{n}\tilde{t}_j\right)^{2}\right]}} \tag{3}$$
where $\tilde{z}_j$ is a sorted sequence of $z_j$, and $\tilde{t}_j$ is a sorted sequence of the sampling time $t_j$.
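The trend measure in Eq (3) can be sketched as follows. This sketch assumes that the sorted sequences are the rank sequences used by the standard Spearman coefficient, with tied values sharing their average rank; the helper names are hypothetical.

```python
import math

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def trend(values, times):
    """Tre(Z) from Eq (3): absolute correlation of the two rank sequences.

    Assumes neither sequence is constant (the denominator would be zero).
    """
    z, t = ranks(values), ranks(times)
    n = len(z)
    sum_z, sum_t = sum(z), sum(t)
    numerator = abs(n * sum(a * b for a, b in zip(z, t)) - sum_z * sum_t)
    denominator = math.sqrt(
        (n * sum(a * a for a in z) - sum_z ** 2)
        * (n * sum(b * b for b in t) - sum_t ** 2)
    )
    return numerator / denominator
```

Because only ranks enter the formula, a feature that rises nonlinearly but consistently still scores 1, as does one that falls consistently.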
The monotonicity is employed to evaluate the degree of monotonic changes in various features during the degradation process, and the monotonicity measurement is given in Eq (4).
$$\mathrm{Mon}(Z)=\frac{1}{n-1}\left|\sum_{j=1}^{n-1}\varepsilon(z_{j+1}-z_j)-\sum_{j=1}^{n-1}\varepsilon(z_j-z_{j+1})\right| \tag{4}$$
where $\varepsilon(\cdot)$ represents the unit step function.
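Eq (4) amounts to counting how many consecutive steps go up and how many go down. A minimal sketch is given below; it counts strict increases and decreases, which gives the same result as Eq (4) because a zero difference contributes equally to both sums and cancels. The function name is an assumption.

```python
def monotonicity(values):
    """Mon(Z) from Eq (4): |#increases - #decreases| / (n - 1)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    ups = sum(1 for d in diffs if d > 0)
    downs = sum(1 for d in diffs if d < 0)
    return abs(ups - downs) / len(diffs)
```

For example, a strictly increasing sequence gives 1, while a sequence that alternates up and down gives 0.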
Considering the robustness, trend, and monotonicity of the features, a weighted comprehensive index Q is formulated as Eq (5).
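$$Q=\omega_1\,\mathrm{Rob}(Z)+\omega_2\,\mathrm{Tre}(Z)+\omega_3\,\mathrm{Mon}(Z) \tag{5}$$
where the weights satisfy $\omega_1,\omega_2,\omega_3>0$ and $\omega_1+\omega_2+\omega_3=1$.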
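The weighted index in Eq (5) and the subsequent ranking can be sketched as below. The default weights (0.3, 0.4, 0.3) and the threshold of 0.7 are the values reported in the experimental section of this paper; the function names and the dictionary layout are assumptions for illustration only.

```python
def comprehensive_index(rob, tre, mon, weights=(0.3, 0.4, 0.3)):
    """Weighted comprehensive index Q from Eq (5)."""
    w1, w2, w3 = weights
    if min(weights) <= 0 or abs(w1 + w2 + w3 - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")
    return w1 * rob + w2 * tre + w3 * mon

def select_features(metrics, threshold=0.7, weights=(0.3, 0.4, 0.3)):
    """Rank features by Q (descending) and keep those with Q > threshold.

    metrics maps a feature name to its (Rob, Tre, Mon) tuple.
    """
    scores = {name: comprehensive_index(*m, weights=weights)
              for name, m in metrics.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > threshold]
```

Giving the trend the largest weight favors features that track the degradation over time, at the cost of tolerating somewhat noisier features.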
To fully explore the temporal correlation characteristics of vibration signals throughout the entire lifespan of rolling bearings, we propose a rolling bearing RUL prediction method based on degradation detection and a deep BiLSTM model. Initially, the common time domain, frequency domain, and time-frequency domain features of the signals are extensively extracted, and three feature evaluation metrics are employed to construct a weighted comprehensive index for optimal feature selection. A degradation detection method for bearings is proposed to determine the beginning time of degradation and define linear RUL prediction labels. Subsequently, a deep BiLSTM model is established, incorporating Dropout mechanisms and segmented learning rates to optimize the model for rolling bearing RUL prediction. The main process of the method is shown in Figure 1, including three key components: feature selection, label processing, and model construction for RUL prediction. The details are as follows.
Vibration signal analysis has the advantages of convenient collection and abundant information. It has become a common method for bearing fault diagnosis and RUL prediction. To enhance the accuracy of RUL prediction, it is necessary to extract and select the original vibration data appropriately. The specific steps are as follows.
To comprehensively capture the degradation information of rolling bearings, the common time domain, frequency domain and time-frequency domain features are extracted from the full lifespan vibration data of the bearings in this paper.
The time domain characteristics have the advantages of intuition, simple calculation and obvious trend, and can reflect the relevant information of bearing degradation, but the fault defect or degradation degree characterized by different parameters are different to some extent. The mean value reflects the change of vibration intensity. The root mean square value reflects the change of wear degree. The peak value reflects the change of shock vibration caused by the fault. The skewness reflects that the bearing has local pitting or spalling, but it is not sensitive to early failure. Pulse factor and margin factor are sensitive to surface damage and early failure. Therefore, it is necessary to use the feature parameters comprehensively, so that the extracted degradation feature set can reflect the performance change of the whole life cycle. Eleven time domain features are extracted in this paper, including mean, standard deviation, root mean square, root mean square amplitude, peak value, skewness, kurtosis, peak factor, margin factor, waveform factor, and pulse factor.
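The eleven time domain features can be computed from one vibration segment as sketched below. The paper does not list the formulas, so the definitions used here are the commonly used ones and should be read as assumptions: the root mean square amplitude is taken as the squared mean of the square-rooted absolute values, and the peak, margin, waveform, and pulse factors are the peak divided by the RMS, the peak divided by the root mean square amplitude, the RMS divided by the mean absolute value, and the peak divided by the mean absolute value, respectively.

```python
import math

def time_domain_features(x):
    """Eleven common time domain statistics of one vibration segment.

    Assumes the segment is neither constant nor identically zero.
    """
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    sra = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2  # root mean square amplitude
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "rms_amplitude": sra,
        "peak": peak,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "peak_factor": peak / rms,
        "margin_factor": peak / sra,
        "waveform_factor": rms / abs_mean,
        "pulse_factor": peak / abs_mean,
    }
```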
Bearing degradation will cause the change of vibration amplitude, which is manifested in time domain characteristics, but the essence is the change of vibration frequency component. The frequency domain analysis of vibration signal describes the distribution of different frequency components in the spectrum, reflecting the fault type and degree of rolling bearing. Similar to the time domain features, the bearing fault related information reflected by the frequency domain features also has some differences. The mean frequency reflects the change of vibration energy; the frequency standard deviation reflects the variation of the dispersion degree of vibration energy. The centroid frequency reflects the change of the spectrum distribution. Different parameters have different sensitivity to the type and degree of fault. We should make comprehensive use of the frequency domain features, so that the extracted features can reflect the performance changes of the whole life cycle. Six frequency domain features are extracted in this paper, including mean frequency, centroid frequency, root mean square frequency, frequency standard deviation, frequency kurtosis, and frequency skewness.
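Three of the frequency domain features can be sketched with NumPy as follows. The paper does not give the formulas, so this sketch assumes the common amplitude-weighted definitions of the centroid frequency, the root mean square frequency, and the frequency standard deviation; the function name and the returned keys are hypothetical.

```python
import numpy as np

def frequency_domain_features(x, fs):
    """Amplitude-weighted spectral statistics of a real signal sampled at fs Hz."""
    x = np.asarray(x, dtype=float)
    amplitude = np.abs(np.fft.rfft(x))            # one-sided amplitude spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)   # matching frequency axis in Hz
    weight = amplitude / amplitude.sum()          # normalized spectral weights
    centroid = float(np.sum(freqs * weight))
    rms_freq = float(np.sqrt(np.sum(freqs ** 2 * weight)))
    freq_std = float(np.sqrt(np.sum((freqs - centroid) ** 2 * weight)))
    return {
        "centroid_frequency": centroid,
        "rms_frequency": rms_freq,
        "frequency_std": freq_std,
    }
```

For a pure tone, the centroid and RMS frequencies both sit at the tone frequency and the spread is close to zero; as a fault adds energy at other frequencies, the three values move apart.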
Time domain and frequency domain analysis, which have low time-frequency resolution, can neither reflect how the frequency content of a non-stationary vibration signal changes with time nor perform local analysis of the signal. Time-frequency domain analysis can reflect the change of different frequencies over time. Based on the wavelet transform, wavelet packet decomposition can not only effectively represent the low-frequency signal characteristics of the bearing, but also repartition the high-frequency components of the signal to characterize the degradation information contained in the high-frequency signal, so that the method can reflect the degradation information of the rolling bearing more comprehensively. Therefore, a four-layer wavelet packet decomposition is conducted to perform time-frequency domain analysis in this study, and the normalized energies of the first 8 nodes in the time-frequency domain are selected as eight feature parameters.
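The node-energy idea can be illustrated with a small wavelet packet sketch. To keep the example self-contained, it substitutes the Haar wavelet for the Daubechies basis used in this paper, returns the nodes in natural (filter-bank) order, and requires the signal length to be divisible by 2 to the power of the number of levels; all three are assumptions of the sketch, not the authors' settings.

```python
import math

def haar_split(signal):
    """One Haar analysis step: (approximation, detail), each half the length."""
    scale = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) * scale)
        detail.append((a - b) * scale)
    return approx, detail

def wavelet_packet_energies(signal, levels=4):
    """Normalized energy of every terminal node after `levels` full splits."""
    nodes = [list(signal)]
    for _ in range(levels):
        # A wavelet packet splits every node, not only the approximation.
        nodes = [half for node in nodes for half in haar_split(node)]
    energies = [sum(v * v for v in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]
```

With four levels there are 16 terminal nodes, and `wavelet_packet_energies(x)[:8]` would give eight energy features in the spirit of the description above.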
Data standardization is a commonly used data preprocessing technique employed to transform data into a uniform form with similar scales and distributions, so as to facilitate improved performance within certain machine learning algorithms. Data standardization helps ensure that variations in the ranges of different feature values do not adversely affect the model's training and performance. In this study, we applied the Z-score standardization method to process the data, with the specific formula as follows:
$$z'_j=\frac{z_j-\mu}{\sigma} \tag{6}$$
where $z_j$ represents the $j$th feature value in the feature sequence, $z'_j$ represents the $j$th standardized feature value in the feature sequence, $\sigma$ represents the standard deviation of all feature values and $\mu$ represents the mean of all feature values.
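Eq (6) can be sketched as below. The use of the population standard deviation (dividing by the number of values rather than one less) is an assumption, since the paper does not specify which estimator was used.

```python
import math

def z_score(values):
    """Standardize a feature sequence with Eq (6); assumes it is not constant."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]
```

After standardization, every feature has zero mean and unit variance, so features with large raw ranges no longer dominate the network input.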
The principles in Section 2 are utilized for the optimal feature selection. The robustness, trend and monotonicity measurement indexes of each feature are calculated using Eqs (1)–(4).
Considering the robustness, trend, and monotonicity of the features, the weighted comprehensive index Q of each feature is calculated using Eq (5).
A degradation detection strategy is used to determine the degradation starting time. First, the mean value $\mu_{\mathrm{nor}}$ and standard deviation $\sigma_{\mathrm{nor}}$ of the absolute value of the bearing vibration acceleration under normal operating conditions are calculated. When the absolute value of the vibration acceleration exceeds $\mu_{\mathrm{nor}}+3\sigma_{\mathrm{nor}}$ many consecutive times starting from a certain point, this point is regarded as the beginning time of bearing degradation; it is also the demarcation point between the bearing's normal stage and degradation stage. When the bearing is in the degradation stage, RUL prediction can help to find the bearing failure in time, so that effective measures can be taken to ensure the safe and reliable operation of the equipment.
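The detection rule can be sketched as follows. For simplicity the sketch takes one amplitude value per sampling instant and estimates the normal-condition statistics from the first few values; the defaults of 10 baseline values and 14 consecutive exceedances mirror the settings reported in the experimental section of this paper, while the function name and the per-instant simplification are assumptions.

```python
import math

def degradation_start(amplitudes, n_baseline=10, n_consecutive=14):
    """Index where a run of n_consecutive threshold exceedances begins, or None."""
    baseline = [abs(a) for a in amplitudes[:n_baseline]]
    mu = sum(baseline) / len(baseline)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in baseline) / len(baseline))
    threshold = mu + 3.0 * sigma
    run = 0
    for i, a in enumerate(amplitudes):
        # A single value below the threshold resets the run counter.
        run = run + 1 if abs(a) > threshold else 0
        if run == n_consecutive:
            return i - n_consecutive + 1
    return None
```

Requiring a run of exceedances, rather than a single one, keeps isolated shocks during normal operation from being mistaken for the onset of degradation.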
Considering that there is no obvious failure when the bearing is in the normal stage, the network labels before the beginning time of bearing degradation are set to 0. Subsequently, the network labels increase linearly, with the label set to 1 at the moment of bearing failure. These obtained network labels are utilized as training labels for the predictive model, as illustrated in Figure 2.
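The piecewise linear label described above can be generated with a short function. The sketch assumes that the last sample corresponds to the moment of failure and that the label is still 0 at the degradation starting index; the function name is hypothetical.

```python
def rul_labels(n_samples, start_index):
    """Labels that are 0 up to start_index, then rise linearly to 1 at the last sample."""
    span = (n_samples - 1) - start_index
    if span <= 0:
        raise ValueError("degradation must start before the last sample")
    return [0.0 if i <= start_index else (i - start_index) / span
            for i in range(n_samples)]
```

For example, `rul_labels(5, 2)` returns `[0.0, 0.0, 0.0, 0.5, 1.0]`.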
The time domain, frequency domain, and time-frequency domain feature sequences of vibration signals for each bearing's entire lifespan are calculated individually. The optimal degradation features are selected as the model input through the characteristic measurement indexes presented in Section 3.1. The RUL label is established as the model output through the method described in Section 3.2.
LSTM finds extensive application in the field of time series forecasting, with a pronounced advantage in modeling long-term dependencies. An LSTM cell consists of an input gate, a forget gate, and an output gate. The forget gate $f_t$ in Eq (7) determines what information the cell state discards.
$$f_t=\sigma(W_f\cdot[h_{t-1},x_t]+b_f) \tag{7}$$
where $h_t$ represents the hidden layer state, $x_t$ is the input at time $t$, $\sigma(\cdot)$ is the sigmoid activation function, $W$ is the weight matrix, and $b$ is the bias.
The input gate $i_t$ in Eq (8) controls the flow of new information into the cell.
$$i_t=\sigma(W_i\cdot[h_{t-1},x_t]+b_i) \tag{8}$$
The cell state is updated by combining the information from the forget gate and the input gate through Eqs (9) and (10).
$$\bar{c}_t=\tanh(W_c\cdot[h_{t-1},x_t]+b_c) \tag{9}$$
$$c_t=f_t\times c_{t-1}+i_t\times\bar{c}_t \tag{10}$$
The final output state $h_t$ in Eq (12) is calculated based on the state $c_t$ and the output gate $o_t$ in Eq (11).
$$o_t=\sigma(W_o\cdot[h_{t-1},x_t]+b_o) \tag{11}$$
$$h_t=o_t\times\tanh(c_t) \tag{12}$$
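Eqs (7)–(12) can be traced step by step in a small NumPy sketch of a single LSTM time step. The dictionary layout for the weights and biases and the function names are assumptions made for readability; the products in Eqs (10) and (12) are taken elementwise.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs (7)-(12).

    W maps a gate name ("f", "i", "c", "o") to a (hidden, hidden + input)
    matrix and b maps it to a (hidden,) bias vector.
    """
    concat = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ concat + b["f"])       # forget gate, Eq (7)
    i_t = sigmoid(W["i"] @ concat + b["i"])       # input gate, Eq (8)
    c_bar = np.tanh(W["c"] @ concat + b["c"])     # candidate state, Eq (9)
    c_t = f_t * c_prev + i_t * c_bar              # cell state update, Eq (10)
    o_t = sigmoid(W["o"] @ concat + b["o"])       # output gate, Eq (11)
    h_t = o_t * np.tanh(c_t)                      # hidden state, Eq (12)
    return h_t, c_t
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved at each step, which is a convenient sanity check.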
BiLSTM, an enhanced LSTM model, offers significant improvements over traditional LSTM. LSTM can process only input sequences in the forward direction, while BiLSTM has the capability to handle both forward and backward input sequences simultaneously. This enables BiLSTM to capture the relationships more effectively between the current time step and both past and future information, thus enhancing the feature extraction performance of the model.
In BiLSTM, the input of each time step is passed to two different LSTM layers (a forward LSTM layer and a backward LSTM layer), whose purpose is to ensure that the feature data obtained at a specific time $t$ simultaneously encompasses information from both preceding and subsequent time steps. The specific structure of BiLSTM is depicted in Figure 3. Here, $X_t$ represents the input at time $t$, $h_t$ corresponds to the memory output of the forward LSTM unit at time $t$, $h'_t$ denotes the memory output of the backward LSTM unit at time $t$, and $Y_t$ is the output at time $t$.
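The bidirectional wiring itself is independent of the recurrent cell, and can be sketched with any step function. In the sketch below, the step functions stand in for the forward and backward LSTM units, and pairing the two states at each time step is an assumption about how the output is formed; all names are hypothetical.

```python
def bidirectional(sequence, step_forward, step_backward, init_forward, init_backward):
    """Run one recurrence left-to-right and another right-to-left, then pair them."""
    forward, state = [], init_forward
    for x in sequence:
        state = step_forward(x, state)
        forward.append(state)
    backward, state = [], init_backward
    for x in reversed(sequence):
        state = step_backward(x, state)
        backward.append(state)
    backward.reverse()  # realign the backward states with the time axis
    return list(zip(forward, backward))
```

Using a running sum as a stand-in for the recurrent cell, `bidirectional([1, 2, 3], add, add, 0, 0)` with `add = lambda x, s: s + x` gives `[(1, 6), (3, 5), (6, 3)]`: at every time step the first entry summarizes the past and the second summarizes the future.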
To enhance the predictive performance of the model, a deep BiLSTM model is constructed as depicted in Figure 4. The input layer is the selected optimal feature set from Section 3.1. There are two BiLSTM layers in the model, and there is a Dropout layer after each BiLSTM layer. The fully connected layer is after the BiLSTM layers, followed by the regression layer. The RUL label described in Section 3.2 is the model output.
The Dropout mechanism is implemented to prevent overfitting, enabling the random dropout of a certain proportion of neurons within the BiLSTM layer, thereby improving the model's generalization capability. Segmented learning rates are set to optimize the model, facilitating rapid convergence during the initial training stage and ensuring more stable convergence in the later stage.
After dividing the dataset into a training set and a test set, the deep BiLSTM model is trained using the training set and verified using the test set. The accuracy of the model is assessed using the root mean square error (RMSE), which is defined as follows:
$$e_{\mathrm{rms}}=\sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(l_j-\hat{l}_j\right)^{2}} \tag{13}$$
where $l_j$ represents the actual values of the RUL labels, and $\hat{l}_j$ represents the predicted values of the RUL labels.
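Eq (13) translates directly into code. The sketch below is a plain Python version; the function name is an assumption.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between actual and predicted RUL labels, Eq (13)."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the labels lie between 0 and 1, the RMSE can be read as a typical error expressed as a fraction of the label range.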
The life-cycle experimental data of rolling bearings used in this paper is the XJTU-SY bearing data set [16] released by Professor Lei Yaguo's team of Xi'an Jiaotong University. In order to obtain the vibration signal of the bearing throughout its life cycle, two unidirectional acceleration sensors, fixed by magnetic seats, are used to collect the bearing vibration signals in the horizontal and vertical directions, respectively. The sampling frequency is 25.6 kHz, the sampling interval is 1 min, each continuous sampling lasts 1.28 s, and the data are recorded in Excel files. The XJTU-SY bearing data set contains a total of 15 groups of rolling bearing life-cycle data under 3 working conditions, with 5 groups of data under each working condition. The radial force and the rotational speed of each condition are listed in Table 1.
Working condition | 1 | 2 | 3 |
Radial force/kN | 12 | 11 | 10 |
Rotational speed/(r/min) | 2100 | 2250 | 2400 |
Number of groups | 5 | 5 | 5 |
In order to avoid the influence of different working conditions on the RUL prediction results, we take the five groups of bearing horizontal vibration data in working condition 1 as an example (the number of samples in each group is listed in Table 2) and calculate the time domain, frequency domain, and time-frequency domain features of each group of vibration data, for a total of 25 feature parameters. In order to avoid the influence of the calculation results of different bearings on feature selection, the Bearing1-1, Bearing1-3, and Bearing1-4 samples are selected here to calculate the robustness, trend and monotonicity measurements of the features of each sample according to Eqs (1)–(4), and the average value over the three samples is taken as the measurement index value of each feature. Figures 5–7 show the results.
Bearing | Bearing1-1 | Bearing1-2 | Bearing1-3 | Bearing1-4 | Bearing1-5 |
Number of samples | 123 | 161 | 158 | 122 | 52 |
Figure 5 lists the values of each metric index of six frequency domain features, namely mean frequency (Z1), centroid frequency (Z2), root mean square frequency (Z3), frequency standard deviation (Z4), frequency kurtosis (Z5), and frequency skewness (Z6).
Wavelet packet decomposition is used to analyze vibration signals in the time-frequency domain. "db44" is selected as the wavelet packet base function to perform 4-layer wavelet packet decomposition. The normalized energy of the first 8 nodes is taken as the 8 feature parameters in the time-frequency domain, denoted as Z7 to Z14 respectively, and their metric values are shown in Figure 6.
Figure 7 lists the value of each metric index of 11 time domain features: mean value (Z15), standard deviation (Z16), root mean square (Z17), root mean square amplitude (Z18), peak value (Z19), skewness (Z20), kurtosis (Z21), peak factor (Z22), margin factor (Z23), waveform factor (Z24), and pulse factor (Z25).
The weighted comprehensive index Q value is calculated based on Eq (5). In this paper, let ω1=0.3, ω2=0.4, ω3=0.3, and 13 features with Q > 0.7 are taken as model inputs, including 4 frequency domain features, 4 time-frequency domain features and 5 time domain features as shown in Figure 8 and Table 3. Each input feature is standardized by Eq (6).
Selected feature | Description | Domain |
Z1 | Mean frequency | Frequency |
Z5 | Frequency kurtosis | Frequency |
Z18 | Root mean square amplitude | Time |
Z16 | Standard deviation | Time |
Z17 | Root mean square | Time |
Z4 | Frequency standard deviation | Frequency |
Z24 | Waveform factor | Time |
Z10 | The normalized energy of node 4 of 4-layer wavelet packet decomposition | Time-frequency |
Z11 | The normalized energy of node 5 of 4-layer wavelet packet decomposition | Time-frequency |
Z3 | Root mean square frequency | Frequency |
Z19 | Peak value | Time |
Z14 | The normalized energy of node 8 of 4-layer wavelet packet decomposition | Time-frequency |
Z9 | The normalized energy of node 3 of 4-layer wavelet packet decomposition | Time-frequency |
$\mu_{\mathrm{nor}}$ and $\sigma_{\mathrm{nor}}$ of the absolute value of the vibration acceleration in the first 10 Excel files of each bearing (under normal working conditions) are calculated, and when the absolute value of the vibration acceleration exceeds $\mu_{\mathrm{nor}}+3\sigma_{\mathrm{nor}}$ for 14 consecutive times from a certain point, this point is regarded as the starting moment of bearing degradation. The RUL label is set as the model output using the method proposed in Section 3.2.
Figure 4 shows the establishment of a deep BiLSTM model based on Dropout mechanism and piecewise learning rate. The specific network parameter settings are shown in Table 4.
Network parameters | Specific setting |
Optimization algorithm | Adam |
Number of input layer units | 13 |
Number of output layer units | 1 |
Number of BiLSTM1 layer units | 100 |
Discard rate of Dropout layer | 0.5 |
Number of BiLSTM2 layer units | 50 |
Maximum number of epoch | 60 |
Initial learning rate | 0.008 |
Learning rate decline period | 20 |
Learning rate decline coefficient | 0.5 |
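The piecewise learning rate implied by Table 4 (initial rate 0.008, halved every 20 epochs, 60 epochs in total) can be sketched as a step schedule. Counting epochs from zero and applying the drop at the start of each new period are assumptions of this sketch; the function name is hypothetical.

```python
def piecewise_learning_rate(epoch, initial=0.008, drop_period=20, drop_factor=0.5):
    """Learning rate for a 0-based epoch under a step decay schedule."""
    return initial * drop_factor ** (epoch // drop_period)
```

Under these assumptions the rate is 0.008 for epochs 0 to 19, 0.004 for epochs 20 to 39, and 0.002 for epochs 40 to 59, giving larger steps early in training and finer steps later.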
Furthermore, the support vector machine (SVM), the traditional recurrent neural network (RNN), the single-layer BiLSTM, and LSTM prediction models without Dropout mechanism are established for comparison. The Bearing1-1, Bearing1-2, Bearing1-3, and Bearing1-4 samples are selected as the training set to train the models, and Bearing1-5 is used as the test set. The prediction results of the methods are shown in Figure 9. The RMSE of each method is calculated according to Eq (13), as shown in Table 5.
Method | $e_{\mathrm{rms}}$ |
Proposed method | 0.0281 |
LSTM | 0.0623 |
BiLSTM | 0.0825 |
RNN | 0.1090 |
SVM | 0.1170 |
From Figure 9 and Table 5, it can be seen that the prediction results obtained by the proposed method are the closest to the real RUL label with less fluctuation, and the trend fitting effect is the best and the RMSE is the smallest. The SVM model has the biggest RMSE, and its prediction result deviates from the true RUL curve to a higher degree. The prediction result of the RNN model has greater fluctuation. When the bearing tends to fail, the LSTM prediction value deviates more from the real label.
In order to fully explore the time-series correlation features of bearing vibration signals over the life span, a total of 25 commonly used signal features in time domain, frequency domain and time-frequency domain were extracted. By calculating the monotonicity, trend, and robustness measurement of each feature, the weighted com-prehensive index values were sorted, and a total of 13 optimal features that could fully reflect the bearing degradation information were selected.
A new bearing degradation detection method was proposed to determine the starting point of degradation, and piecewise linear network labels were constructed as the RUL prediction labels. A deep BiLSTM model was built for predicting the rolling bearing RUL by adding the Dropout mechanism and setting a piecewise learning rate. Compared with the SVM, the RNN, the single-layer BiLSTM, and the LSTM without the Dropout mechanism, the proposed method had the best trend fitting effect and the smallest RMSE (0.0281). The method can therefore predict the rolling bearing RUL accurately, which helps ensure the safe and reliable operation of rotating machinery.
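A piecewise linear RUL label of the kind described above can be sketched as follows. The sketch assumes a normalized label that stays at 1 up to the detected degradation starting point and then decreases linearly to 0 at the last sample; the function name and the normalization are assumptions, and the degradation detection step itself is taken as given through `start_index`.

```python
def piecewise_linear_rul(n_samples, start_index):
    """Normalized RUL labels: flat at 1.0, then linear decay to 0.0.

    n_samples   -- number of samples in the full-life record
    start_index -- 0-based index of the detected degradation starting point
    """
    if not 0 <= start_index < n_samples - 1:
        raise ValueError("start_index must lie before the last sample")
    last = n_samples - 1
    span = last - start_index  # number of steps in the degradation stage
    labels = []
    for i in range(n_samples):
        if i <= start_index:
            labels.append(1.0)  # healthy stage: constant label
        else:
            labels.append((last - i) / span)  # degradation stage: linear decay
    return labels


# Toy example: 5 samples with degradation detected at index 2.
labels = piecewise_linear_rul(5, 2)
```

Keeping the label constant before the starting point avoids asking the network to predict a decline during the healthy stage, when the features carry little degradation information.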
In the next step, the proposed method will be combined with transfer learning and extended to the RUL prediction of rolling bearings under multiple working conditions.
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
This research was funded by Science and Technology Project of Inner Mongolia (No. 2022YFSH0019), Hulunbuir Science and Technology Project (No. SF2023019), Hulunbuir University Doctoral Foundation Project (No. 2020BS03), National Natural Science Foundation of China (No. 52304274), Science and Technology Project of Inner Mongolia (No. 2021GG0296).
The authors declare that there are no conflicts of interest.
[1] |
Z. Y. Fan, W. R. Li, K. C. Chang, A bidirectional long short-term memory autoencoder transformer for remaining useful life estimation, Mathematics, 11 (2023), 4972. https://doi.org/10.3390/math11244972 doi: 10.3390/math11244972
![]() |
[2] |
Y. G. Lei, N. P. Li, L. Guo, N. B. Li, T. Yan, J. Lin, Machinery health prognostics: A systematic review from data acquisition to RUL prediction, Mech. Syst. Signal Process., 104 (2018), 799–834. https://doi.org/10.1016/j.ymssp.2017.11.01 doi: 10.1016/j.ymssp.2017.11.01
![]() |
[3] |
F. Q. Zhao, Z. G. Tian, Y. Zeng, Uncertainty quantification in gear remaining useful life prediction through an integrated prognostics method, IEEE Trans. Reliab., 62 (2013), 146–159. https://doi.org/10.1109/TR.2013.2241216 doi: 10.1109/TR.2013.2241216
![]() |
[4] | S. C. Deng, Z. Q. Chen, Z. Chen, Auxiliary particle filter-based remaining useful life prediction of rolling bearing, in 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), (2017), 15–19. |
[5] |
Y. X. Li, X. Z. Huang, T. H. Gao, C. Y. Zhao, S. J. Li, A wiener-based remaining useful life prediction method with multiple degradation patterns, Adv. Eng. Inf., 57 (2023), 102066. https://doi.org/10.1016/j.aei.2023.102066 doi: 10.1016/j.aei.2023.102066
![]() |
[6] |
N. P. Li, Y. G. Lei, J. Lin, S. X. Ding, An improved exponential model for predicting remaining useful life of rolling element bearings, IEEE Trans. Ind. Electron., 62 (2015), 7762–7773. https://doi.org/10.1109/TIE.2015.2455055 doi: 10.1109/TIE.2015.2455055
![]() |
[7] |
A. Rai, S. H. Upadhyay, The use of MD-CUMSUM and NARX neural network for anticipating the remaining useful life of bearings, Measurement, 111 (2017), 397–410. https://doi.org/10.1016/j.measurement.2017.07.030 doi: 10.1016/j.measurement.2017.07.030
![]() |
[8] |
Z. Liu, M. J. Zuo, Y. Qin, Remaining useful life prediction of rolling element bearings based on health state assessment, Proc. Inst. Mech. Eng., Part C: J. Mech., 230 (2016), 314–330. https://doi.org/10.1177/0954406215590167 doi: 10.1177/0954406215590167
![]() |
[9] |
F. Deng, Y. Bi, Y. Liu, S. Yang, Deep-learning-based remaining useful life prediction based on a multi-scale dilated convolution network, Mathematics, 9 (2021), 3035. https://doi.org/10.3390/math9233035 doi: 10.3390/math9233035
![]() |
[10] |
B. Rezaeianjouybari, Y. Shang, Deep learning for prognostics and health management: State of the art, challenges, and opportunities, Measurement, 163 (2020), 107929. https://doi.org/10.1016/j.measurement.2020.107929 doi: 10.1016/j.measurement.2020.107929
![]() |
[11] |
B. Zhang, S. H. Zhang, W. H. Li, Bearing performance degradation assessment using long short-term memory recurrent network, Comput. Ind., 106 (2019), 14–29. https://doi.org/10.1016/j.compind.2018.12.016 doi: 10.1016/j.compind.2018.12.016
![]() |
[12] |
Z. H. Chang, W. Yuan, K. Huang, Remaining useful life prediction for rolling bearings using multi-layer grid search and LSTM, Comput. Electr. Eng., 101 (2022), 108083. https://doi.org/10.1016/j.compeleceng.2022.108083 doi: 10.1016/j.compeleceng.2022.108083
![]() |
[13] |
J. Y. Guo, J. Wang, Z. Y. Wang, Y. Gong, J. L. Qi, G. Y. Wang, et al., A CNN‐BiLSTM‐Bootstrap integrated method for remaining useful life prediction of rolling bearings, Qual. Reliab. Eng. Int., 39 (2023), 1796–1813. https://doi.org/10.1002/qre.3314 doi: 10.1002/qre.3314
![]() |
[14] |
B. Zhang, L. J. Zhang, J. W. Xu, Degradation feature selection for remaining useful life prediction of rolling element bearings, Qual. Reliab. Eng. Int., 32 (2016), 547–554. https://doi.org/10.1002/qre.1771 doi: 10.1002/qre.1771
![]() |
[15] |
Y. G. Lei, N. P. Li, S. Gontarz, J. Lin, S. Radkowski, J. Dybala, A model-based method for remaining useful life prediction of machinery, IEEE Trans. Reliab., 65 (2016), 1314–1326. https://doi.org/10.1109/TR.2016.2570568 doi: 10.1109/TR.2016.2570568
![]() |
[16] |
B. Wang, Y. G. Lei, N. P. Li, N. B. Li, A hybrid prognostics approach for estimating remaining useful life of rolling element bearings, IEEE Trans. Reliab., 69 (2020), 401–412. https://doi.org/10.1109/TR.2018.2882682 doi: 10.1109/TR.2018.2882682
![]() |