1. Introduction
C4 olefin is an important industrial feedstock that is widely used in the production of chemical products and pharmaceuticals. Many methods are currently available for producing C4 olefin [1,2,3]. For example, its preparation from carbon oxides and hydrogen is a well-known route [4,5]. Among the available feedstocks, ethanol is particularly important as a base material for C4 olefin production because it is widely available and causes little pollution [6,7]. In the production of C4 olefin from ethanol, the catalyst combination and the reaction temperature play an important role. Although important results have been achieved in this area [8,9], the chain growth reaction mechanism means that ketones and aldehydes are inevitably generated during the reaction, which results in low selectivity toward the target products and poor economics.
Lu [10] studied the C4 olefin production conditions using the controlled variable method and, after several manual experiments, concluded that the maximum yield of C4 olefin was obtained when the Co loading was 1 wt%, the Co/SiO2 and HAP mass ratio was 1, and the reaction temperature was 400 ℃. However, repeated manual experiments consume considerable human, material, and financial resources, whereas applying artificial intelligence techniques to find the optimal ethanol reaction conditions can effectively reduce this consumption. Experts and scholars have increasingly applied swarm intelligence algorithms to optimize various chemical problems [11,12,13]. Compared with traditional optimization algorithms, swarm intelligence algorithms have the advantages of simplicity, parallelism, and applicability [14,15], and they are particularly suitable for solving optimization problems in complex systems [16,17]. Wang and Zhang [18] used a response surface method and a quadratic regression model to simulate the synthesis of methyl chloroacetate in a reactive distillation dividing-wall column and obtained the operating parameters that give the highest product purity. Gao et al. [19] used BP neural networks and genetic algorithms to optimize the S-Zorb device, which effectively reduced the octane loss of gasoline in the cracking process. Zeng et al. [20] used genetic and sequential algorithms to optimize the biogas decarbonization process and obtained the CO2 values at which the LNG yield meets the standard. However, the optimization of C4 olefin production conditions has not yet been investigated with such methods.
Therefore, this paper proposes, for the first time, a hybrid model (GXGB-SSA) that combines the advantages of the Gaussian-noise-based extreme gradient boosting tree (GXGB) and the sparrow search algorithm (SSA). By mining the experimental data of C4 olefin production, the model is used to obtain the ethanol reaction conditions that give the highest possible C4 olefin yield while reducing the resource consumption of repeated manual experiments.
The first task in optimizing the C4 olefin production conditions is to establish the objective function that relates the variables to be optimized to the C4 olefin yield. However, the chemical reaction mechanism of C4 olefin production is nonlinear and complex, which makes it difficult to establish this objective function with conventional fitting models. In addition, the experimental data used in this paper were obtained from the China 2021 National Student Mathematical Modeling Competition and comprise only 109 groups, so an artificial neural network model would be prone to overfitting on such a small sample. As an ensemble learning algorithm based on the idea of boosting, the extreme gradient boosting tree (XGB) [21,22] can effectively reduce the bias of the model. Additionally, its regularization term effectively reduces the probability of overfitting [23,24]. Therefore, it is well suited to establishing this objective function [25]. To further improve the fitting effect of XGB, this paper proposes a sample-increment extreme gradient boosting tree based on Gaussian noise, namely GXGB. The experimental results show that it obtains a better fit than the unmodified XGB.
When optimizing the C4 olefin production conditions, a traditional grid search would consume a great deal of time because each decision variable has a large range of values, whereas swarm intelligence algorithms, as an emerging evolutionary computing technology, are considerably more efficient. Applying a swarm intelligence algorithm is therefore a good choice for this optimization problem. The sparrow search algorithm (SSA) [26,27,28] is a new swarm intelligence optimization algorithm based on the foraging behavior of sparrows. Its main idea is to perform local and global search by imitating the foraging and anti-predation behavior of sparrows, so that the foraging process of the sparrows is the optimum-seeking process of the algorithm [29,30]. It also has a clear advantage in search efficiency [31,32]. Therefore, in this paper, SSA is combined with GXGB into a hybrid model, GXGB-SSA, for solving complex optimization problems. The model is applied to the optimization of C4 olefin production conditions to obtain the combination of ethanol reaction conditions that gives the highest C4 olefin yield.
In addition, since GXGB is a typical black-box model, it is less interpretable than conventional fitting models [33,34,35]. Although the degree of influence of each ethanol reaction condition on the C4 olefin yield can be obtained, neither the direction (positive or negative) of the influence nor its dynamic process is available. Therefore, in this paper, the SHAP value [36,37,38] is combined with GXGB to investigate the effect of each ethanol reaction condition on the C4 olefin yield, and the constraints of each decision variable involved in the optimization search are adjusted according to the analysis results [39,40].
The innovations of this paper are as follows: (1) Gaussian noise is combined with XGB for the first time, yielding a GXGB model that can handle modeling problems with small samples and complex mechanisms; (2) a GXGB-SSA hybrid model is proposed for the first time and applied to the optimization of C4 olefin production conditions, improving the C4 olefin yield by nearly 25.46% compared with the manual experimental data; and (3) the SHAP value is combined with GXGB to compensate for its limited ability to explain data as a black-box model, to analyze the effect of each ethanol reaction condition on the C4 olefin yield, and to limit the search range of the optimal solution.
The structure of this paper is as follows: (1) the second part introduces the chemical reaction principles of C4 olefin production and quantifies the ethanol reaction conditions to obtain 109 sets of experimental data for modeling; (2) the third part describes the process of establishing the proposed hybrid model GXGB-SSA; (3) the fourth part describes the process of using GXGB-SSA to find the optimal C4 olefin reaction conditions and the combination of ethanol reaction conditions that maximizes the yield of C4 olefin; and (4) the fifth part concludes the paper and provides an outlook for future work.
2. Materials and methods
2.1. Preliminaries
The chemical principle applied in the C4 olefin production experiment studied in this paper is the ethanol dehydration reaction, the reaction process of which is divided into three main stages as follows.
The dehydration reaction of ethanol can produce C4 olefin, ethylene, and ether. The main reaction (a) is strongly endothermic; reaction (b) is weakly exothermic; and reaction (c) is strongly exothermic, and its change with temperature is not obvious. Therefore, a moderate increase in temperature favors the formation of C4 olefin.
In addition, during the dehydration reaction of ethanol, decreasing the ethanol concentration lowers the partial pressure of the ethanol involved in the reaction, which favors reactions in which the number of moles increases. Therefore, appropriately reducing the ethanol concentration also favors the formation of C4 olefin.
The catalysts used for the ethanol dehydration reaction studied in this paper are Co and SiO2-HAP, where Co is a metal with dehydrogenation activity that catalyzes the dehydration reaction of ethanol and SiO2-HAP is a catalyst with both acid and base active sites.
In this paper, the ethanol reaction conditions were quantified as five independent variables: Co loading (X1), the Co/SiO2 and HAP mass ratio (X2), ethanol concentration (X3), total catalyst mass (X4), and reaction temperature (X5). Here, Co loading is the ratio of Co to SiO2 by weight, and X2 is the ratio of the mass of Co/SiO2 to the mass of HAP. By adjusting the Co loading and the Co/SiO2 and HAP mass ratio, the acidity and alkalinity of the catalyst surface can be adjusted.
Furthermore, in this paper, the efficiency of the ethanol dehydration reaction is expressed by the C4 olefin yield (dependent variable, Y), which equals the ethanol conversion multiplied by the C4 olefin selectivity, where the ethanol conversion is the one-way conversion of ethanol per unit time and the C4 olefin selectivity is the percentage of C4 olefin in all products.
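As a small worked illustration of this definition, the sketch below computes the yield from a conversion and a selectivity. It assumes that both quantities are expressed in percent, which is consistent with the magnitude of the yields reported later in the paper (for example, 4472.81) but is not stated explicitly. The function name and the input values are hypothetical.

```python
def c4_olefin_yield(conversion_pct: float, selectivity_pct: float) -> float:
    """Yield Y = ethanol conversion x C4 olefin selectivity.

    Both inputs are assumed to be percentages in [0, 100], so Y lies in
    [0, 10000]. This scaling is an assumption made for the sketch.
    """
    for name, value in (("conversion", conversion_pct),
                        ("selectivity", selectivity_pct)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be within [0, 100], got {value}")
    return conversion_pct * selectivity_pct


# Hypothetical example values, not taken from the experimental data.
example_yield = c4_olefin_yield(80.0, 50.0)
print(example_yield)
```

Under this scaling a yield can only increase if the conversion, the selectivity, or both increase, which is why the two quantities are combined into a single target variable Y.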
The quantitative results of some experimental data for the ethanol dehydration reaction are shown in Table 1.
Therefore, the tasks of this paper are described in two parts: (1) to establish the objective function f(X) of the variables to be optimized (i.e., the objective function with C4 olefin yield as the dependent variable and Co loading, the Co/SiO2 and HAP mass ratio, ethanol concentration, total catalyst mass, and reaction temperature as the independent variables); and (2) to find the values of the independent variables that make the objective function f(X) the largest (i.e., to obtain the combination of ethanol reaction conditions with the highest C4 olefin yield).
2.2. Algorithms
2.2.1. Modeling ideas of GXGB-SSA
In this paper, the advantages of GXGB and SSA are combined into a model, GXGB-SSA, that can efficiently solve complex optimization problems, and the model is applied to the optimization of the reaction conditions for C4 olefin production. The basic idea of the model is to first establish the objective function f(X) that relates the variables to be optimized to the C4 olefin yield using GXGB, and then to apply SSA to find the optimum of this objective function, which gives the values of the decision variables that maximize the C4 olefin yield. The flow chart of the model is shown in Figure 1.
As can be seen in Figure 1, the establishment process of GXGB-SSA is divided into two main modules, namely, the GXGB module (the objective function establishment part) and the SSA module (the decision variable seeking part).
For the GXGB module, its main purpose is to establish the objective function of the variables to be optimized for C4 olefin yield, which is established as follows:
(1) Import the dataset and split it into a training set for model training and a test set for model testing, where the input variables are Co loading, the Co/SiO2 and HAP mass ratio, ethanol concentration, total catalyst mass, and reaction temperature, and the output variable is the C4 olefin yield.
(2) Use the grid search method to find the optimal hyperparameters (i.e., the hyperparameters that minimize the error on the test set). The grid search range for each hyperparameter is shown in Table 2.
In Table 2, noise level is the magnitude of the Gaussian noise and increment size is the multiple by which the sample size is increased.
(3) Build the model with the optimal hyperparameters obtained in step (2) and output the objective function of the variables to be optimized for the C4 olefin yield.
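The grid search in step (2) can be sketched as an exhaustive loop over all hyperparameter combinations. The snippet below is a minimal illustration, not the authors' implementation: the grid values are hypothetical because Table 2 is not reproduced here, and `toy_error` is a stand-in for "train GXGB with these hyperparameters and measure the error on the test set".

```python
import itertools
from typing import Callable, Dict, Iterable, Tuple


def grid_search(
    grid: Dict[str, Iterable],
    test_error: Callable[..., float],
) -> Tuple[dict, float]:
    """Return the hyperparameter combination with the smallest test error."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        error = test_error(**params)  # e.g. MSE of a model trained with params
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Toy stand-in for the real train-and-evaluate step (hypothetical).
def toy_error(noise_level: float, increment_size: int) -> float:
    return (noise_level - 0.001) ** 2 + (increment_size - 5) ** 2


# Hypothetical search ranges; the actual ranges are those of Table 2.
hypothetical_grid = {
    "noise_level": [0.0001, 0.001, 0.01, 0.1],
    "increment_size": [2, 3, 4, 5, 6],
}
best, err = grid_search(hypothetical_grid, toy_error)
print(best, err)
```

The cost grows with the product of the grid sizes, which is acceptable for a handful of hyperparameters but is also the reason a grid search is not used for the decision variables themselves.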
In addition, GXGB is a typical black-box model and is less interpretable than traditional fitting models. Therefore, in this paper, the SHAP value is combined with GXGB to investigate the effect of each reaction condition of ethanol on the yield of C4 olefin, and the constraints of each decision variable involved in the optimization search are adjusted according to the analysis results.
For the SSA module, the main objective is to find the value of each decision variable that gives the highest yield of C4 olefin, and the module is established as follows:
(1) Set the values of the optimization parameters (i.e., the number of populations (pop) and the maximum number of iterations (maxiter));
(2) Determine the constraints for each decision variable (i.e., the range of values available for Co loading, the Co/SiO2 and HAP mass ratio, ethanol concentration, total catalyst mass, and reaction temperature);
(3) Determine the objective function of the optimization (i.e., the objective function of the variable to be optimized for the C4 olefin yield obtained by the GXGB module);
(4) Iterate the model until the maximum number of iterations is reached and output the highest value of the C4 olefin yield and the value of each decision variable that makes the C4 olefin yield reach the maximum.
2.2.2. The establishment process of GXGB-SSA
For the GXGB module, the basic idea is to obtain a strong learner by continuously integrating weak learners; the learning result is the weighted mean value of the weak learners. Its flow chart is shown in Figure 2.
In the GXGB module, the added Gaussian noise samples follow a normal distribution, whose expression is shown in Equation (1):
where z is a Gaussian random variable, $\bar{z}$ is the mean of z, and σ is the standard deviation of z. Gaussian noise samples can be added to the data samples by setting the Gaussian noise size and the sample increment multiplier.
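The sample increment can be sketched as follows. The paper does not spell out how the noise is applied, so this sketch makes three assumptions that are labeled in the code: the enlarged dataset keeps the original samples and appends noisy copies until it is `increment_size` times the original size, the noise is zero-mean with a standard deviation proportional to the feature value, and the targets are left unchanged. The function and variable names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (features, target)


def augment_with_gaussian_noise(
    samples: Sequence[Sample],
    noise_level: float,
    increment_size: int,
    seed: int = 0,
) -> List[Sample]:
    """Enlarge a dataset to increment_size times its original size.

    Assumptions of this sketch: the originals are kept, and
    (increment_size - 1) noisy copies of every sample are appended; each
    feature x becomes x + N(0, (noise_level * |x|)^2); targets are unchanged.
    """
    if increment_size < 1:
        raise ValueError("increment_size must be at least 1")
    rng = random.Random(seed)
    augmented = [(list(x), y) for x, y in samples]
    for _ in range(increment_size - 1):
        for x, y in samples:
            noisy = [v + rng.gauss(0.0, noise_level * abs(v)) for v in x]
            augmented.append((noisy, y))
    return augmented


# Hypothetical data: 109 identical placeholder samples with five features.
data = [([1.0, 1.0, 0.9, 400.0, 350.0], 1000.0) for _ in range(109)]
enlarged = augment_with_gaussian_noise(data, noise_level=0.001, increment_size=5)
print(len(enlarged))
```

With 109 original samples and a multiplier of 5, the enlarged dataset contains 545 samples, matching the sample count reported in the Results section.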
The objective function established using the GXGB module consists of two main components, namely, the loss function and the regularization term. The introduction of the loss function can reduce the bias of the model, the introduction of the regularization term can reduce the probability of overfitting the model, and the expression of this objective function is shown in Equation (2):
where $Obj^{(t)}$ is the objective function when integrating the t-th weak learner; $\hat{y}_j^{(t-1)}$ is the value calculated for the j-th sample by integrating the previous t−1 weak learners; loss() is the loss function; c is the constant term; and $\Omega(f_t)$ is the regularization term, whose expression is shown in Equation (3):
where $\Omega(f_t)$ is the regularization term of the t-th weak learner; γ and λ are the regularization coefficients; T is the number of nodes of the weak learner; and $w_o$ is the weight of the o-th node of the weak learner.
A Taylor expansion of the objective function results in Equations (4) and (5) as follows:
Since constant terms do not affect the model solution, the constant term c and the fixed value $loss(y_j, \hat{y}_j^{(t-1)})$ in the above equation are removed, and the result is shown in Equation (6):
The objective function is then rearranged, and the result is shown in Equation (7):
Let $G_o = \sum_{j \in J_o} g_j$ and $H_o = \sum_{j \in J_o} h_j$; then the objective function is as shown in Equation (8):
The final objective function is obtained by setting the partial derivative with respect to $w_o$ to zero, and its expression is shown in Equation (9):
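The derivation described above matches the standard XGBoost objective. Assuming that formulation, the steps can be sketched as below. Here $g_j$ and $h_j$ are taken to be the first and second derivatives of the loss with respect to the previous prediction, and $J_o$ the set of samples assigned to node o; these are the usual definitions and are an assumption, since the text does not define them. Equation numbers are attached only where the description identifies the step unambiguously.

```latex
\begin{align}
Obj^{(t)} &= \sum_{j} loss\!\left(y_j,\ \hat{y}_j^{(t-1)} + f_t(x_j)\right) + \Omega(f_t) + c \tag{2}\\
\Omega(f_t) &= \gamma T + \frac{1}{2}\lambda \sum_{o=1}^{T} w_o^{2} \tag{3}\\
% second-order Taylor expansion around the previous prediction
Obj^{(t)} &\approx \sum_{j} \left[ loss\!\left(y_j, \hat{y}_j^{(t-1)}\right) + g_j f_t(x_j) + \frac{1}{2} h_j f_t^{2}(x_j) \right] + \Omega(f_t) + c \notag\\
g_j &= \frac{\partial\, loss\!\left(y_j, \hat{y}_j^{(t-1)}\right)}{\partial \hat{y}_j^{(t-1)}}, \qquad
h_j = \frac{\partial^{2} loss\!\left(y_j, \hat{y}_j^{(t-1)}\right)}{\partial \left(\hat{y}_j^{(t-1)}\right)^{2}} \notag\\
Obj^{(t)} &\approx \sum_{j} \left[ g_j f_t(x_j) + \frac{1}{2} h_j f_t^{2}(x_j) \right] + \Omega(f_t) \tag{6}\\
Obj^{(t)} &\approx \sum_{o=1}^{T} \left[ \Big(\sum_{j \in J_o} g_j\Big) w_o + \frac{1}{2}\Big(\sum_{j \in J_o} h_j + \lambda\Big) w_o^{2} \right] + \gamma T \tag{7}\\
Obj^{(t)} &\approx \sum_{o=1}^{T} \left[ G_o w_o + \frac{1}{2}\left(H_o + \lambda\right) w_o^{2} \right] + \gamma T \tag{8}\\
w_o^{*} &= -\frac{G_o}{H_o + \lambda}, \qquad
Obj^{*} = -\frac{1}{2} \sum_{o=1}^{T} \frac{G_o^{2}}{H_o + \lambda} + \gamma T \tag{9}
\end{align}
```

The step from (6) to (7) uses the fact that every sample in $J_o$ receives the same prediction $w_o$ from the t-th weak learner, so the sum over samples can be regrouped as a sum over nodes.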
In the GXGB module, let the ith sample be $x_i$, the jth feature of the ith sample be $x_{ij}$, the marginal contribution of the feature be $mc_{ij}$, and the weight be $w_i$; then the expression for the SHAP value of the jth feature of the ith sample is shown in Equation (10).
Assuming that the predicted value of GXGB for this sample is $y_i$ and the baseline of the entire model (i.e., the mean of the target variable over all samples) is $y_{base}$, the expression for the SHAP value is shown in Equation (11):
$f(x_{ij})$ is the contribution of the jth feature of the ith sample to the final prediction $y_i$, and the SHAP value of each feature indicates the change in the model prediction when conditioned on that feature. When $f(x_{ij}) > 0$, the feature raises the predicted value; when $f(x_{ij}) < 0$, the feature lowers the predicted value.
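The idea of a weighted marginal contribution, and the property that the baseline plus all SHAP values reproduces the prediction, can be illustrated with an exact Shapley computation on a toy model. This is a sketch under stated assumptions rather than the procedure used in the paper: features that are "absent" from a coalition are replaced by a fixed background value, the baseline is the model output at that background point (not the mean of the targets), and `toy_model` is a hypothetical stand-in for GXGB.

```python
import itertools
import math
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    background: Sequence[float],
) -> List[float]:
    """Exact Shapley value of every feature of x for one prediction.

    A feature outside the coalition takes its value from `background`
    (a simplification assumed for this sketch).
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        z = [x[j] if j in coalition else background[j] for j in range(n)]
        return model(z)

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {j}) - value(s))  # marginal contribution
        phi.append(total)
    return phi


# Hypothetical toy model with an interaction between its two inputs.
def toy_model(z):
    temperature, catalyst_mass = z
    return 2.0 * temperature + 0.5 * catalyst_mass + 0.01 * temperature * catalyst_mass


x = [400.0, 200.0]
background = [300.0, 100.0]
phi = shapley_values(toy_model, x, background)
y_base = toy_model(background)
# Additivity: baseline + sum of SHAP values equals the prediction for x.
print(phi, y_base + sum(phi), toy_model(x))
```

The enumeration is exponential in the number of features, so it is only practical for a handful of inputs such as the five decision variables here; tree-specific SHAP algorithms avoid this cost.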
For the SSA module, the basic idea is mainly inspired by the foraging behavior of sparrows. During foraging, explorers provide the search direction and area for the population, followers search under the guidance of the explorers, and vigilantes rely on anti-predation strategies to keep the population from falling into a local optimum.
During the iterative search, the expression for the explorer position update is shown in Equation (12):
where $x_{i,j}^{t}$ is the position of the ith sparrow in the jth dimension in the t-th iteration, and $x_{i,j}^{t+1}$ is its position in the (t+1)-th iteration; $r_2$ is the warning value, which lies in (0,1]; ST is the safety value, which lies in [0.5,1]; α is a uniformly distributed random number in (0,1]; $iter_{max}$ is the maximum number of iterations; Q is a random number obeying a normal distribution; and L is a 1×d matrix whose elements are all 1.
When the warning value $r_2$ is less than the safety value ST, the explorer performs a wide-range jump search; when the warning value $r_2$ is greater than the safety value ST, the explorer moves to other locations to search. The expression for the follower position update is shown in Equation (13):
where $x_{i,j}^{t}$ and $x_{i,j}^{t+1}$ are defined as in Equation (12); $x_{worst}$ is the global worst position found during the search; and A is a 1×d matrix in which each element is randomly assigned 1 or −1, with $A^{+} = A^{T}(AA^{T})^{-1}$.
If a follower with i>n/2 has a low fitness value and has not obtained food, it needs to jump and continue the search in the direction of the minimum value, while the other followers move towards the optimal position found by the explorers. Vigilantes are sparrows randomly selected from among the explorers and followers; they use the anti-predation strategy to keep the search from becoming trapped in a local optimum.
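The three roles can be sketched in code as follows. This is a simplified sparrow search under several assumptions, not the authors' implementation: the explorer and follower updates follow the commonly published form of Equations (12) and (13), the vigilante move is taken from the standard SSA formulation because it is not given in the text, the shares of explorers and vigilantes are hypothetical defaults, positions are clipped to the bounds, and the exponent in the follower update is capped to avoid overflow.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def ssa_maximize(
    objective: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop: int = 20,
    maxiter: int = 200,
    explorer_frac: float = 0.2,   # assumed share of explorers
    vigilante_frac: float = 0.1,  # assumed share of vigilantes
    st: float = 0.8,              # safety value ST in [0.5, 1]
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Simplified sparrow search; returns (best position, best objective)."""
    rng = random.Random(seed)
    d = len(bounds)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def cost(x):
        return -objective(x)  # minimise the negated objective

    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    F = [cost(x) for x in X]
    n_exp = max(1, int(explorer_frac * pop))
    n_vig = max(1, int(vigilante_frac * pop))
    i_best = min(range(pop), key=F.__getitem__)
    best_x, best_f = list(X[i_best]), F[i_best]

    for _ in range(maxiter):
        order = sorted(range(pop), key=F.__getitem__)  # best sparrow first
        X, F = [X[i] for i in order], [F[i] for i in order]
        worst_x = list(X[-1])
        # Explorers, Equation (12)
        r2 = rng.random()
        for i in range(n_exp):
            if r2 < st:
                alpha = 1.0 - rng.random()  # uniform in (0, 1]
                shrink = math.exp(-(i + 1) / (alpha * maxiter))
                X[i] = clip([v * shrink for v in X[i]])
            else:
                q = rng.gauss(0.0, 1.0)
                X[i] = clip([v + q for v in X[i]])
            F[i] = cost(X[i])
        x_p = list(X[min(range(n_exp), key=F.__getitem__)])  # best explorer
        # Followers, Equation (13)
        for i in range(n_exp, pop):
            rank = i + 1
            if rank > pop / 2:
                q = rng.gauss(0.0, 1.0)
                X[i] = [q * math.exp(min((w - v) / rank**2, 50.0))
                        for v, w in zip(X[i], worst_x)]
            else:
                # |x - x_p| . A+ . L, with A a random row vector of +/-1
                step = sum(abs(v - p) * rng.choice((-1.0, 1.0))
                           for v, p in zip(X[i], x_p)) / d
                X[i] = [p + step for p in x_p]
            X[i] = clip(X[i])
            F[i] = cost(X[i])
        # Vigilantes: anti-predation move (standard SSA form, assumed here)
        i_g = min(range(pop), key=F.__getitem__)
        i_w = max(range(pop), key=F.__getitem__)
        g_x, g_f, w_x, w_f = list(X[i_g]), F[i_g], list(X[i_w]), F[i_w]
        for k in rng.sample(range(pop), n_vig):
            if F[k] > g_f:
                X[k] = [g + rng.gauss(0.0, 1.0) * abs(v - g)
                        for v, g in zip(X[k], g_x)]
            else:
                kk = rng.uniform(-1.0, 1.0)
                X[k] = [v + kk * abs(v - w) / (abs(F[k] - w_f) + 1e-12)
                        for v, w in zip(X[k], w_x)]
            X[k] = clip(X[k])
            F[k] = cost(X[k])
        i_best = min(range(pop), key=F.__getitem__)
        if F[i_best] < best_f:  # keep the best solution seen so far
            best_x, best_f = list(X[i_best]), F[i_best]
    return best_x, -best_f


# Hypothetical smooth stand-in for the GXGB objective f(X) with two variables.
def toy_yield(x):
    return -((x[0] - 1.0) ** 2) - ((x[1] - 400.0) / 100.0) ** 2


best_x, best_y = ssa_maximize(toy_yield, bounds=[(0.5, 5.0), (250.0, 450.0)])
print(best_x, best_y)
```

In the hybrid model, `objective` would be the prediction function of the trained GXGB model and `bounds` the constraints of the five decision variables; the defaults `pop=20` and `maxiter=200` mirror the settings reported in the Results section.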
3. Results
3.1. Optimization of the objective function based on GXGB-SSA for C4 olefin variables
In this paper, with C4 olefin yield (Y) as the output variable and each reaction condition of ethanol as the input variable (i.e., Co loading (X1), the Co/SiO2 and HAP mass ratio (X2), ethanol concentration (X3), total catalyst mass (X4), and reaction temperature (X5)), the GXGB module was used to establish the objective function of the variables to be optimized for the C4 olefin yield.
The C4 olefin dataset was split in a ratio of 4:1 into a training set for model training and a test set for model testing. A grid search was then applied to find the hyperparameters of GXGB that minimize the fitting error on the test set, and the results are shown in Table 3.
As seen from Table 3, the optimal Gaussian noise level and sample increment multiple obtained by the grid search are 0.001 and 5, respectively. Since the experimental data for C4 olefin production studied in this paper contain 109 samples, the total sample size after the sample increment is 109 × 5 = 545. With the training set and the test set divided in a ratio of 4:1, 436 groups are used for training and 109 groups for testing. The fit of GXGB on the test set is shown in Figure 3.
Figure 3 shows the fitting effect of GXGB on the test set. The red circular markers represent the target values and the ×-shaped markers on the blue line represent the predicted values; the two essentially overlap, which shows that GXGB fits the test set very well.
To further validate the performance of GXGB, 10-fold cross-validation was conducted in this paper (i.e., the model was trained and tested 10 times on 10 different splits of the data into training and test sets). The means of three metrics obtained from the 10-fold cross-validation, namely the goodness of fit (R2), the mean square error (MSE), and the mean absolute error (MAE), were used to evaluate the fitting accuracy of GXGB, and the variances (S2) of the MSE and MAE across the folds were used to evaluate its stability. The expressions for R2, MSE, and MAE are shown in Equations (14), (15), and (16):
where $y_i$ is the target value of C4 olefin yield, $\hat{y}_i$ is the fitted value of the model, and m is the sample size.
For the goodness of fit (R2), a larger value indicates a better model fit. For the mean square error (MSE) and the mean absolute error (MAE), a smaller value indicates a better model fit. For the variance (S2), a smaller value indicates better model stability.
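The three metrics and the stability measure can be computed as in the sketch below, which uses the standard definitions of R2, MSE, and MAE. Whether S2 is the population or the sample variance is not stated in the paper; the population form is assumed here. All numbers in the example are hypothetical.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean square error: (1/m) * sum((y_i - yhat_i)^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: (1/m) * sum(|y_i - yhat_i|)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Goodness of fit: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def variance(values: Sequence[float]) -> float:
    """Population variance of fold-wise errors (assumed form of S2)."""
    mean_v = sum(values) / len(values)
    return sum((v - mean_v) ** 2 for v in values) / len(values)


# Hypothetical targets and predictions for one fold.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mse(y_true, y_pred), mae(y_true, y_pred), r2_score(y_true, y_pred))

# Hypothetical MSE values from several folds; a small variance means that
# the model performs similarly on every fold.
fold_mse = [0.10, 0.12, 0.11, 0.09]
print(variance(fold_mse))
```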
In this paper, GXGB is compared with XGB without Gaussian noise, random forest (RF), and support vector machine (SVM), and the results are shown in Tables 4 and 5.
As shown in Tables 4 and 5, GXGB outperforms XGB without Gaussian noise in both fitting effect and stability. The goodness of fit (R2) of GXGB is very close to the ideal value of 1, and its mean square error (MSE) and mean absolute error (MAE) are very small, indicating that the fitting error on this C4 olefin yield dataset is very small. GXGB also has the advantage in the variance (S2) used to assess model stability.
In addition, this paper compares GXGB with RF and SVM, both of whose hyperparameters were tuned using the grid search method. Tables 4 and 5 show that GXGB also outperforms RF and SVM in terms of fitting effect and stability, which again verifies the good performance of GXGB.
3.2. Optimization of C4 olefin production conditions based on GXGB-SSA
In this paper, the objective function established by the GXGB module above is used as the fitness function of the SSA module, and the search yields the highest fitness value (i.e., the highest C4 olefin yield) together with the corresponding optimal value of each decision variable. The constraints of each decision variable are shown in Table 6.
In this paper, the population size (pop) and the maximum number of iterations (maxiter) are set to 20 and 200, respectively, for all algorithms, and the optimization results of GXGB-SSA are compared with those of GXGB-GWO and GXGB-PSO, which use the Gray Wolf Optimization (GWO) and particle swarm optimization (PSO) algorithms, respectively. The iterative process of the three algorithms is shown in Figure 4.
As can be seen in Figure 4, all three algorithms have converged by 200 iterations, and GXGB-SSA converges faster and reaches a better result than GXGB-GWO and GXGB-PSO for the optimization of C4 olefin production conditions. The optimization results of the three algorithms and their comparison with the optimal results obtained from manual experiments (ME) are shown in Table 7.
As can be seen from Table 7, GXGB-SSA obtains a higher C4 olefin yield than GXGB-GWO and GXGB-PSO. When the Co loading (X1) is 1.2508, the Co/SiO2 and HAP mass ratio (X2) is 1.4273, the ethanol concentration (X3) is 0.9026, the total catalyst mass (X4) is 399.88, and the reaction temperature (X5) is 430.36, the highest C4 olefin yield (Y) of 5547.3526 is obtained, which is nearly 24.02% higher than the highest C4 olefin yield of 4472.81 obtained from the 109 manual experiments.
4. Discussion
4.1. Investigation of the effect of each reaction condition on C4 olefin based on SHAP
The SHAP value was combined with GXGB to investigate the effect of ethanol reaction conditions on the yield of C4 olefin, and the constraints of the optimized decision variables were adjusted according to the results of the analysis. The SHAP feature variable importance diagram is shown in Figure 5, and the summary diagram of the SHAP feature analysis is shown in Figure 6.
In Figure 5, the X-axis represents the SHAP value of each feature variable; the larger the value, the greater the influence of the feature variable. In descending order of influence on the C4 olefin yield, the decision variables are the reaction temperature (X5), total catalyst mass (X4), ethanol concentration (X3), the Co/SiO2 and HAP mass ratio (X2), and Co loading (X1).
In Figure 6, the points of each feature represent the feature samples in the corresponding dataset, with the color change from blue to red indicating the value of the sample feature from small to large, and the positive or negative SHAP value indicating the positive or negative correlation of the feature with the target feature, respectively.
For the reaction temperature (X5), which has the greatest influence, the samples with larger feature values have positive SHAP values, indicating that a high temperature contributes positively to the target and increases the C4 olefin yield. The points for samples with smaller feature values are mainly concentrated in the region with negative SHAP values, indicating that when the reaction temperature (X5) decreases, the C4 olefin yield also decreases. Similarly, the total catalyst mass (X4), which is the second most influential variable, also contributes positively to the C4 olefin yield. For the ethanol concentration (X3) and the Co/SiO2 and HAP mass ratio (X2), which are the third and fourth most influential variables, the SHAP values of the samples with larger feature values are negative, indicating that large values of these variables have an inhibitory effect and reduce the C4 olefin yield. For the least influential variable, Co loading (X1), neither a positive nor a negative effect on the C4 olefin yield is evident.
The magnitude and direction of the effect of each decision variable on the C4 olefin yield can be obtained from the SHAP feature analysis summary diagram; however, to explore the dynamic process of the effect of each decision variable on the C4 olefin yield, it is necessary to examine the SHAP feature analysis dependence diagrams shown in Figures 7, 8, 9, 10, and 11.
In Figure 7, the Y-axis represents the SHAP value of the feature variable, and each point represents a sample in the dataset. Additionally, the samples with larger feature values have positive SHAP values, indicating that they play a positive contributing role to the target feature. From Figure 7, it can be seen that when the Co loading (X1) takes the value of 1, its SHAP value is positive (i.e., the contribution to the C4 olefin yield is positive).
From Figure 8, it can be seen that when the Co/SiO2 and HAP mass ratio (X2) takes the value of either 1 or 2, its SHAP value is positive (i.e., the contribution to the C4 olefin yield is positive).
From Figure 9, it can be seen that when the ethanol concentration (X3) takes the value of 0.3, 0.9, or 1.68, its SHAP value is positive (i.e., the contribution to the C4 olefin yield is positive).
From Figure 10, it can be seen that when the total catalyst mass (X4) takes the value of 400, its SHAP value is positive (i.e., the contribution to the C4 olefin yield is positive).
From Figure 11, it can be seen that when the reaction temperature (X5) takes the value of either 400 or 450, its SHAP value is positive (i.e., the contribution to the C4 olefin yield is positive).
Therefore, based on the SHAP feature analysis dependence diagrams, the range of values of each decision variable for which the contribution to the C4 olefin yield is positive can be derived. The adjusted ranges of the decision variables are shown in Table 8.
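One simple way to turn a dependence diagram into a constraint is to keep the span of feature values whose SHAP value is positive. The sketch below illustrates this idea only; the rule used to build Table 8 is not described in detail, so taking the minimum and maximum of the positive-SHAP values is an assumption, and the function name and data points are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def positive_shap_range(
    feature_values: Sequence[float],
    shap_values: Sequence[float],
) -> Optional[Tuple[float, float]]:
    """Return (min, max) of the feature values whose SHAP value is positive.

    Returns None when no sample has a positive SHAP value.
    """
    positive = [v for v, s in zip(feature_values, shap_values) if s > 0]
    if not positive:
        return None
    return min(positive), max(positive)


# Hypothetical dependence data for the reaction temperature (X5).
temperatures = [250, 300, 350, 400, 450]
shap_for_temperature = [-300.0, -150.0, -20.0, 120.0, 260.0]
print(positive_shap_range(temperatures, shap_for_temperature))
```

Narrowing each decision variable to such a range shrinks the search space that SSA has to explore, which is the motivation for adjusting the constraints before the second optimization run.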
The adjusted ranges in Table 8 are then used as the constraints of GXGB-SSA for a second optimization search, and the iterative process is shown in Figure 12.
As can be seen in Figure 12, with the adjusted constraints, GXGB-SSA has converged by 200 iterations, and both its convergence speed and its optimization result for the C4 olefin production conditions are improved compared with GXGB-SSA without adjusted constraints. The results of the two optimizations are compared in Table 9.
As can be seen from Table 9, the convergence rate and optimization result of GXGB-SSA with the adjusted constraints are better than those without the adjustment. When the Co loading (X1) is 1.1248, the Co/SiO2 and HAP mass ratio (X2) is 1.8402, the ethanol concentration (X3) is 0.8992, the total catalyst mass (X4) is 400.00, and the reaction temperature (X5) is 420.37, a higher C4 olefin yield of 5611.46 is obtained, which is nearly 25.46% higher than the highest C4 olefin yield of 4472.81 in the experimental data.
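The percentage improvements quoted in the paper can be checked directly from the reported yields. The short sketch below uses only the yield values given in the text; the function name is hypothetical.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


best_manual = 4472.81          # highest yield among the 109 manual experiments
ssa_original = 5547.3526       # GXGB-SSA with the original constraints
ssa_adjusted = 5611.46         # GXGB-SSA with the SHAP-adjusted constraints

print(round(relative_improvement(ssa_original, best_manual), 2))
print(round(relative_improvement(ssa_adjusted, best_manual), 2))
```

Both values agree with the 24.02% and 25.46% improvements stated for the original and adjusted constraints, respectively.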
5. Conclusions
In order to improve the efficiency of the ethanol dehydration reaction for producing C4 olefin, a GXGB-SSA hybrid model was developed in this paper to obtain the combination of ethanol reaction conditions that gives the highest C4 olefin yield. First, the objective function to be optimized was established using the GXGB module, with the C4 olefin yield as the output variable and the ethanol reaction conditions as the input variables. Second, the SSA module was used to optimize the objective function to obtain the combination of ethanol reaction conditions that makes the C4 olefin yield as high as possible. Finally, the SHAP value was used to investigate the effect of each ethanol reaction condition on the C4 olefin yield, and the constraints of the decision variables involved in the optimization were adjusted according to the analysis results. In future work, the team will continue to investigate machine learning algorithms in order to build better optimization models and solve more complex problems.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Conflict of interest
The authors declare there is no conflict of interest.