Research article

Catalytic upgrading of palm oil derived bio-crude oil for bio-hydrocarbon enrichment using protonated zeolite-Y as catalyst

  • This research was conducted to study the upgrading of bio-crude oil (BCO), produced by pyrolysis of palm oil without the use of a catalyst, using protonated zeolite-Y, designated as H-Y. H-Y was prepared by subjecting zeolite-Y synthesized from rice husk silica (RHS) and food grade aluminium foil (FGAF) to a cation exchange process using ammonium nitrate solutions with concentrations of 2.0, 2.5, 3.0, and 3.5 M to obtain H-Y samples with different protonation extents. To confirm the formation of Na-Y, the sample was characterized using XRD and SEM, and to evaluate the protonation, the Na-Y and H-Y samples were analyzed using XRF. Characterization using XRD showed that the Na-Y sample is faujasite, which is the characteristic phase of zeolite-Y, a finding supported by the existence of particles with octahedral structure as seen by SEM. Successful protonation resulted in a reduction of Na content of up to 89.948% relative to that of the Na-Y, as demonstrated by the XRF results. Catalytic upgrading experiments demonstrated that H-Y zeolites increased the bio-hydrocarbon content from 80.23% in the BCO to practically 100% in the upgraded oil. In addition, no acids were identified in the upgraded fuels, implying that H-Y zeolite is a promising catalyst for BCO upgrading for bio-hydrocarbon enrichment of the oil.

    Citation: Wasinton Simanjuntak, Kamisah Delilawati Pandiangan, Tika Dwi Febriyanti, Aryani Putri Islami, Sutopo Hadi, Ilim Ilim. Catalytic upgrading of palm oil derived bio-crude oil for bio-hydrocarbon enrichment using protonated zeolite-Y as catalyst[J]. AIMS Energy, 2024, 12(3): 600-616. doi: 10.3934/energy.2024028




    Neural networks (NNs) are non-parametric machine learning models inspired by the structure and operation of biological brain cells. They consist of interconnected computational units, called perceptrons, organized into layers [1]. In recent decades, NNs have been used in various classification, regression, and forecasting tasks [2,3,4,5]. Their architecture may vary depending on the specific tasks they are designed to solve, and their efficiency is highly affected by their learning process. Multi-layer perceptrons (MLPs) are feedforward NNs consisting of sequential layers of neurons, also called perceptrons, that are interconnected through weighted synapses [1]. A simple MLP consists of three layers of perceptrons that are arranged as follows. The input layer receives the input signals, whereas the middle and output layers perform complex calculations and make the MLP capable of approximating any continuous function. MLPs are often trained on a set of labeled training instances that are prepared as input-target pairs to learn the association between inputs and outputs of the MLP model. The learning process of an MLP, which belongs to the group of NP-hard problems, involves adjusting the hyperparameters and weights of that MLP model by using one of two main categories of optimization techniques: gradient descent backpropagation algorithms and heuristic methods. The gradient-based algorithms, such as the gradient descent, conjugate gradient, and Levenberg-Marquardt algorithms, are conventional backpropagation supervised learning algorithms [6,7,8,9]. Most gradient descent-based learning algorithms face a few significant challenges, namely, (1) the tendency to get stuck in local minima or saddle points in the loss landscape of the MLP; (2) the slow convergence of the learning process, especially when vanishing or exploding gradients occur [10]; and (3) the strong dependency on initial weights and learning rates [11]. All of those limitations, and others, make it difficult for the MLP to learn effectively.

    Recently, heuristic optimization algorithms have shown several advantages over conventional gradient-descent methods in the training of MLPs. Many evolutionary and swarm-based optimization algorithms are extensively used to solve complex optimization problems and have been suggested in the literature as competitive alternatives to optimize the weights and hyperparameters of MLPs [12,13,14]. In contrast to gradient-based learning, heuristic optimization methods are less likely to become trapped in local minima since they are designed to search for global optima across complex search spaces of weights. This is particularly convenient in complex MLPs with many parameters, where finding a global optimum is essential to improve the overall performance [15].

    Besides, heuristic optimization algorithms can efficiently optimize MLPs with non-differentiable or discontinuous loss functions since they do not rely on gradient information. Therefore, those optimization algorithms are suitable for problems with non-smooth objective functions [16,17].

    Although many evolutionary approaches have been suggested to optimize MLP models, stagnation in local optima remains an open problem because of the likely weak exploration and/or exploitation of some of those heuristic approaches. Furthermore, evolutionary optimization approaches are generally slow since they require computing an objective function value for each candidate solution in the evolving populations. Motivated by these reasons, in this research, a new framework based on the recently introduced dynamic group-based cooperative optimizer (DGCO) algorithm is proposed and validated for the training of MLPs with a single hidden layer. The DGCO, a meta-heuristic evolutionary computing algorithm for optimization, is inspired by the dynamic collaboration of teamwork agents. To solve complex optimization problems, the DGCO dynamically manages the exploration and exploitation tasks over all iterations until the final one is reached [18]. We examined the proposed approach for the optimization of MLP models by using five classification benchmarking datasets for engineering applications. The simulation results are compared to those obtained by commonly used optimizers in the literature, including cooperative and competitive evolutionary algorithms for optimization, namely the genetic algorithm (GA), the grey wolf optimization (GWO) algorithm [19], and the particle swarm optimization (PSO) algorithm, as well as those obtained by the conventional gradient descent-based algorithm.

    In general, the evolutionary computation (EC) algorithms deployed for the automatic learning of MLP models mainly aim to optimize either the structure of those models or the weights of the connections among their perceptrons. In this research, we validate our approach to optimizing the weights and biases of MLP models for classification. We introduce the DGCO as a competitive optimizer of the parameters of MLP models for applications with limited training datasets, as shown in the simulation results in Section 6.2.

    The main contributions of this paper are as follows:

    1) We introduce a novel framework to optimize feedforward MLP models by using a cooperative EC algorithm. The framework validates the ability of the DGCO algorithm to optimize a machine learning model. The suggested optimizer showed competitive performance in its ability to avoid local optima as well as in its convergence speed and optimization time.

    2) We validate the proposed framework's ability to optimize the weights of MLP models even in cases of reduced or limited training datasets.

    3) We compare the performance of the proposed framework with other competitive evolutionary optimizers in terms of accuracy and run time.

    The remainder of this paper is organized as follows. Section 2 shows a background and introduces related work, and Section 3 briefly describes MLP models and their basic concepts. Section 4 overviews the DGCO evolutionary algorithm and its equations and main features. In Section 5, we detail the proposed DGCO-based framework. The simulation results are depicted and discussed in Section 6. Finally, we conclude our work and summarize its limitations and how they can affect our future research perspectives in Section 7.

    Swarm-based algorithms are the most investigated heuristic methods among the EC algorithms for the optimization of MLP models. Those population-based algorithms are nature-inspired, where a population consists of agents representing solution candidates for the optimization problem. The initially randomized solutions in the early populations are evolved and updated until a satisfactory solution is found or a stopping criterion is reached. The incorporated randomness in the EC algorithms permits the exploration of the search space and the move from a local search to a global one, which makes those algorithms suitable for global optimization [15].

    In general, EC optimization algorithms are considered either competitive, like the GA, or cooperative, like the PSO algorithm [20,21]. GAs, introduced in [22,23,24], are among the most investigated evolutionary heuristic algorithms for the optimization of MLP models [25,26,27,28]. Several swarm-based optimization algorithms are reported in the literature as NN optimizers, namely, the ant colony optimization algorithm [29,30] and the artificial bee colony (ABC) algorithm [31]. Those algorithms, with their variations, were investigated for their ability to optimize MLP models [32,33,34,35,36]. Besides, other EC algorithms, like the differential evolution (DE) algorithm [37] and the brain storm optimization algorithm [38], which simulates the brainstorming process, are also employed to optimize the weights and biases of MLPs, effectively tuning an MLP model to improve its performance [39,40]. Recently, the optimization of MLP models by using heuristic evolutionary optimizers for medical, agriculture, food quality, and mineral grading applications has been reported in the literature [41,42,43]. In [43], the authors compared the performance of several EC heuristic optimizers, like ABC, PSO, and the GA, in estimating gold grade from highly sparse data. On the other hand, several papers have suggested combined approaches that hybridize gradient-based learning with EC for global optimization [44,45], while in others, hybridized heuristics are proposed to optimize MLP models [46]. In [44], the authors introduced and validated a framework that combines the DE optimizer with the local conjugate gradient optimization algorithm. The authors claimed that the DE-GC approach outperformed other optimizers, including two variants of DE. In a forecasting application, the authors of [45] proposed a hybrid PSO and conjugate gradient learning of MLP models to predict air pollution parameter data, namely the suspended particulate matter. The authors claimed that the hybrid approach provided better predictions than the conjugate gradient and PSO algorithms separately. In [46], the authors proposed a framework that hybridizes the GA and the ABC algorithm to optimize MLP models for medical applications to detect diabetes and breast cancer cases.

    MLPs are artificial NNs of the feedforward type where sequential layers of processing units called perceptrons are interconnected through weighted connections [1]. An MLP unit is a cell of computation called a perceptron. Each perceptron has an activation function that decides the final output of that perceptron. A simple MLP may consist of three layers of perceptrons that are arranged as follows: the input layer receives the input signals to map them to the next MLP layer, whereas the middle and output layers make the decisions about the input signals. An MLP may include an arbitrary number of hidden layers that perform complex calculations and make the MLP capable of approximating any continuous function for regression or classification. MLP layers are, in general, interconnected in a one-directional fashion. The connections are represented by real numbers that hold the knowledge of the MLP.

    The MLP models are trained on labeled input-target pairs to learn how to model the associations between input vectors and output targets. The learning process involves adjusting their parameters and weights through a supervised learning process. The supervised learning of MLP models is completed by using the classical gradient descent backpropagation algorithm or one of its variants or, more recently, by an EC-based learner. Figure 1 depicts the general structure of an MLP model with one hidden layer. The outcome of each perceptron is computed in two steps. First, the weighted sum of the inputs is calculated by using Eq (1).

    Figure 1.  An MLP NN with a single hidden layer.
    $U_j = \sum_{i=1}^{n} W_{ij} x_i + b_j$ (1)

    where $x_i$ is the $i$th input to the $j$th perceptron, $n$ is the number of inputs, and $W_{ij}$ is the weight of the connection between input $i$ and perceptron $j$ in the next layer. The bias $b_j$ of perceptron $j$ shifts the weighted sum $\sum_{i} W_{ij} x_i$ and thereby influences the output of that perceptron. In the second step, the final output of neuron $j$, denoted by $\varphi(U_j)$, is computed by applying an activation function $\varphi(\cdot)$ to the summation value $U_j$ computed in Eq (1). Several types of continuous activation functions are commonly used in MLP models [1]. Equation (2) shows the computation of the output of neuron $j$ by using the sigmoid function, which is widely used in the literature:

    $\varphi(U_j) = \frac{1}{1 + e^{-a U_j}}$ (2)

    where $U_j$ is the weighted summation of inputs to perceptron $j$, and $a$ is the slope parameter of the sigmoid function.
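    The two-step computation in Eqs (1) and (2) can be illustrated with a minimal Python sketch. The function name `perceptron_output` and the example values are illustrative assumptions and are not taken from the paper.

```python
import math


def perceptron_output(inputs, weights, bias, a=1.0):
    """Return phi(U_j) for a single perceptron (illustrative helper)."""
    # Eq (1): weighted sum of the inputs plus the bias.
    u = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Eq (2): sigmoid activation with slope parameter a.
    return 1.0 / (1.0 + math.exp(-a * u))


# Example with two inputs; the numbers are arbitrary.
print(perceptron_output([0.5, -1.0], [0.8, 0.2], bias=0.1))
```

    A zero weighted sum maps to 0.5, and a larger slope parameter `a` makes the transition around zero steeper.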

    The DGCO [18] is influenced by the cooperative behavior adopted by swarm individuals to achieve their global goals. The optimization process handled by the DGCO starts with an initial random population of individuals representing candidate solutions to the problem being solved. Then, the fitness of each individual of the population is calculated. After that, the DGCO divides the population into two groups: the exploration group and the exploitation group (Figure 2). The responsibility of the exploration group is to search for promising areas in the search space. To do that, individuals in the exploration group apply one of two techniques. The first one searches the area around a solution. This strategy is mathematically modeled by using the following equations:

    Figure 2.  Exploration and exploitation groups of the DGCO.
    $D = r_1 \cdot (S(t) - 1)$ (3)
    $S(t+1) = S(t) + D \cdot (2 r_2 - 1)$ (4)

    where $r_1$ and $r_2$ are coefficient vectors with values in the intervals [0, 2] and [0, 1], respectively, $t$ refers to the current iteration, $S$ is the current solution vector, and $D$ is the distance between the current agent and the solution agent.
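    As a rough illustration of Eqs (3) and (4), the sketch below applies one exploration move to a solution vector. It assumes that the coefficients are drawn uniformly and independently for each dimension, which the text does not state explicitly; the name `explore_around` is hypothetical.

```python
import random


def explore_around(solution, rng=random):
    """One exploration move around the current solution (sketch of Eqs (3)-(4))."""
    new_solution = []
    for s in solution:
        r1 = rng.uniform(0.0, 2.0)  # assumed uniform draw in [0, 2]
        r2 = rng.uniform(0.0, 1.0)  # assumed uniform draw in [0, 1]
        d = r1 * (s - 1.0)                              # Eq (3)
        new_solution.append(s + d * (2.0 * r2 - 1.0))   # Eq (4)
    return new_solution


print(explore_around([3.0, -2.0, 0.5]))
```

    Because the factor (2 r2 - 1) lies in [-1, 1], each coordinate moves by at most |D| in either direction.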

    The second strategy adopted by individuals of the exploration group is mutation, which improves the diversity of the population and therefore helps to prevent stagnation in local optima. Together, these two strategies increase the DGCO's exploration capabilities.

    Besides, individuals in the exploitation group are responsible for finding better positions in the search space. To achieve that, individuals in the exploitation group apply two different techniques. The first one allows the search agents to move toward the best solution found so far by using random walks. This task is mathematically modeled by using the following equations:

    $D = r_3 \cdot (L(t) - S(t))$ (5)
    $S(t+1) = S(t) + D$ (6)

    where $r_3$ is a random vector of values in the interval [0, 2] that controls the step size toward the best solution, also referred to as the leader, $t$ refers to the current iteration, $S$ is the vector of the current solution, $L$ is the vector of the best solution, and $D$ indicates the distance vector.
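    The move toward the best solution in Eqs (5) and (6) can be sketched as follows. The per-dimension uniform draw of r3 and the name `move_toward_leader` are assumptions made for illustration.

```python
import random


def move_toward_leader(solution, leader, rng=random):
    """Random-walk step toward the best solution (sketch of Eqs (5)-(6))."""
    new_solution = []
    for s, best in zip(solution, leader):
        r3 = rng.uniform(0.0, 2.0)  # assumed uniform draw in [0, 2]
        d = r3 * (best - s)         # Eq (5)
        new_solution.append(s + d)  # Eq (6)
    return new_solution


print(move_toward_leader([0.0, 4.0], [1.0, 2.0]))
```

    With r3 in [0, 2], the updated coordinate never ends up farther from the leader than it started: it either approaches the leader or overshoots it by no more than the original distance.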

    The second technique used by individuals in the exploitation group lets them search randomly around the best solution. This strategy is modeled by using the following equations:

    $D = L(t) \cdot (k - r_4)$ (7)
    $S(t+1) = S(t) + D \cdot (2 r_5 - 1)$ (8)
    $k = 2 - \frac{2 \times t^2}{\text{iters\_count}^2}$ (9)

    where $r_4$ and $r_5$ are random vectors of numbers in the interval [–1, 1], $k$ decreases nonlinearly from 2 to 0 throughout the iterations, $L$ is the vector of the best solution, $S$ is the current solution vector, and $D$ indicates the diameter of the circle in which the solution looks for better solutions.
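    A minimal sketch of Eqs (7) to (9) is given below. It assumes the quadratic schedule of Eq (9) and independent uniform draws of r4 and r5 per dimension; the function names are hypothetical.

```python
import random


def k_schedule(t, iters_count):
    """Eq (9): k goes from 2 at t = 0 to 0 at t = iters_count."""
    return 2.0 - 2.0 * t ** 2 / iters_count ** 2


def search_around_leader(solution, leader, t, iters_count, rng=random):
    """Random search around the best solution (sketch of Eqs (7)-(8))."""
    k = k_schedule(t, iters_count)
    new_solution = []
    for s, best in zip(solution, leader):
        r4 = rng.uniform(-1.0, 1.0)  # assumed uniform draw in [-1, 1]
        r5 = rng.uniform(-1.0, 1.0)  # assumed uniform draw in [-1, 1]
        d = best * (k - r4)                             # Eq (7)
        new_solution.append(s + d * (2.0 * r5 - 1.0))   # Eq (8)
    return new_solution


print([round(k_schedule(t, 350), 3) for t in (0, 175, 350)])
print(search_around_leader([0.2, -0.4], [0.5, 0.1], t=10, iters_count=350))
```

    Because k shrinks over the iterations, the search radius around the leader is wide early on and narrows toward the end of the run.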

    One of the advantages of the DGCO algorithm is its ability to achieve exploration and exploitation balance up to the last iterations of the algorithm. The DGCO dynamically changes the number of individuals in each sub-group during each iteration based on its recorded convergence history. Initially, the DGCO starts with a 70/30 schema, which assigns 70% of the population to the exploration group and the remaining 30% to the exploitation group. Starting with more individuals in the exploration group helps the model to find more promising areas in the search space.

    Throughout the iterations, the DGCO decreases the number of individuals in the exploration group and increases the number of individuals in the exploitation group. The DGCO also applies an elitism operation by carrying the best solution of each iteration over to the next population, which improves the convergence behavior of the algorithm.

    At any iteration, the DGCO may increase the number of exploring individuals and decrease the number of individuals in the exploitation group if the fitness of the leader solution does not improve significantly for three consecutive iterations. Finally, the DGCO may randomly interchange the roles of individuals of each group in each iteration to guarantee the diversity of the population. The pseudo-code of the DGCO algorithm is presented in Algorithm 1.

    Algorithm 1. Pseudo-code of the DGCO optimization algorithm
    Initialize the population S = {S_1, S_2, ..., S_d}
    L = the best solution (leader)
    while t < iters_count
      calculate the fitness of each solution and determine the best solution
      k = 2 - 2 * t / iters_count
      update the number of solutions in each group
      if the best fitness did not change over the previous 2 iterations
        increase the number of solutions in the exploration group
      end if

      for each solution in the exploration group
        update r1, r2, and p
        elitism of the best solution
        if p >= 0.5
          mutate the solution
        else
          search around the current solution (Eq (4))
        end if
      end for

      for each solution in the exploitation group
        elitism of the best solution
        update r3, r4, r5, and p
        if p >= 0.5
          move towards the best solution (Eq (6))
        else
          search around the best solution (Eq (8))
        end if
      end for
      amend solutions that go beyond the search space
      update prev_fitness1, prev_fitness2
    end while
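
    To make the control flow of Algorithm 1 concrete, the following Python sketch runs a simplified DGCO-style loop on a toy objective. It is not the reference implementation of [18]: the mutation operator, the group-resizing rule, the shuffling used to exchange roles, and the clamping of out-of-range values are all assumptions chosen for illustration, and `sphere` and `dgco_minimize` are hypothetical names.

```python
import random


def sphere(x):
    """Toy objective used only for this demonstration."""
    return sum(v * v for v in x)


def dgco_minimize(fitness, dim, pop_size=20, iters_count=50,
                  bounds=(-1.0, 1.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    leader = min(pop, key=fitness)[:]
    history = [fitness(leader)]  # best fitness after each iteration
    explore_frac = 0.7           # initial 70/30 scheme

    for t in range(iters_count):
        k = 2.0 - 2.0 * t / iters_count  # as written in Algorithm 1
        # Assumed rule: shift agents linearly toward exploitation.
        explore_frac = max(0.1, explore_frac - 0.6 / iters_count)
        # Assumed rule: enlarge the exploration group on stagnation.
        if len(history) >= 3 and history[-1] == history[-2] == history[-3]:
            explore_frac = min(0.9, explore_frac + 0.1)
        n_explore = round(explore_frac * pop_size)

        rng.shuffle(pop)       # assumed way of exchanging group roles
        new_pop = [leader[:]]  # elitism: the leader always survives
        for idx, s in enumerate(pop[1:], start=1):
            p = rng.random()
            if idx < n_explore:
                if p >= 0.5:
                    # Assumed mutation: re-sample one random coordinate.
                    cand = s[:]
                    cand[rng.randrange(dim)] = rng.uniform(lo, hi)
                else:
                    # Eqs (3)-(4): search around the current solution.
                    cand = [x + rng.uniform(0, 2) * (x - 1.0)
                            * (2 * rng.random() - 1) for x in s]
            elif p >= 0.5:
                # Eqs (5)-(6): move toward the leader.
                cand = [x + rng.uniform(0, 2) * (b - x)
                        for x, b in zip(s, leader)]
            else:
                # Eqs (7)-(8): search around the leader.
                cand = [x + b * (k - rng.uniform(-1, 1))
                        * (2 * rng.uniform(-1, 1) - 1)
                        for x, b in zip(s, leader)]
            # Amend solutions that go beyond the search space.
            new_pop.append([min(hi, max(lo, x)) for x in cand])

        pop = new_pop
        best = min(pop, key=fitness)
        if fitness(best) < history[-1]:
            leader = best[:]
        history.append(fitness(leader))
    return leader, history


leader, history = dgco_minimize(sphere, dim=5)
print(history[0], history[-1])
```

    Because the leader is copied into every new population, the recorded best fitness can never get worse from one iteration to the next.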

    The proposed framework is based on the DGCO algorithm for training MLP models with a single hidden layer. We will refer to this framework by the abbreviation DGCO-MLP. Two main factors need to be carefully selected when a heuristic optimization algorithm is used to train an MLP model, namely, the representation of the search agents and the selection of the fitness function, also named the objective function. In the DGCO-MLP, each search agent is represented by a one-dimensional vector that holds the parameters (weights and biases) of a candidate MLP model. A group of search agents, also referred to as the population, evolves from one training epoch to the next one. The dimension of each vector is equal to the total number of connections among the perceptrons in the sequentially connected layers of the MLP model, plus the number of biases of those perceptrons, as shown in Eq (10).

    $\text{Dimension of agent's vector} = (n \times h) + (h \times 2) + 1$ (10)

    where n is the dimension of the input layer and h is the dimension of the hidden layer. Equation (10) includes the number of connections among input units and hidden perceptrons (n×h), the number of connections among the hidden perceptrons and the output perceptron (h×1), and the number of biases of the hidden perceptrons (h×1) and the output perceptron (+1). Figure 3 illustrates the compositions of a search agent's vector in the DGCO-MLP framework.

    Figure 3.  The compositions of a search agent's vector in the DGCO-MLP framework.
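
    The dimension count in Eq (10) and one possible way to unpack an agent's vector are sketched below. The ordering of the segments (input-to-hidden weights, hidden-to-output weights, hidden biases, output bias) is an assumption, since the exact layout is defined by Figure 3; the function names are hypothetical.

```python
def agent_dimension(n, h):
    """Eq (10): number of weights and biases for an n-h-1 MLP."""
    return (n * h) + (h * 2) + 1


def decode_agent(vector, n, h):
    """Split a flat agent vector into MLP parameters (assumed layout)."""
    if len(vector) != agent_dimension(n, h):
        raise ValueError("vector length does not match Eq (10)")
    # n*h input-to-hidden weights, stored one hidden perceptron at a time.
    w_input_hidden = [vector[j * n:(j + 1) * n] for j in range(h)]
    offset = n * h
    w_hidden_output = vector[offset:offset + h]   # h hidden-to-output weights
    b_hidden = vector[offset + h:offset + 2 * h]  # h hidden biases
    b_output = vector[-1]                         # single output bias
    return w_input_hidden, w_hidden_output, b_hidden, b_output


# The 22-12-1 structure used for the Parkinson's dataset in Table 2.
print(agent_dimension(22, 12))
```

    For the 6-4-1 structure, for example, the vector holds 24 + 8 + 1 = 33 values.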

    We adopted the mean square error (MSE) as the fitness value of each search agent. The MSE is calculated as the mean of the squared differences between the actual output of the MLP, $\tilde{y}_i$, and the target value $y_i$ that is available for each instance $i$ in the training dataset. Equation (11) shows the calculation of the MSE for one epoch of training.

    $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{y}_i)^2$ (11)

    where $n$ is the number of training instances in the training dataset.
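
    Eq (11) translates directly into a few lines of Python. The function name `mean_squared_error` and the sample values are illustrative only.

```python
def mean_squared_error(targets, outputs):
    """Eq (11): mean of squared differences between targets and MLP outputs."""
    if not targets or len(targets) != len(outputs):
        raise ValueError("targets and outputs must be non-empty and equal in length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(targets, outputs)) / len(targets)


# Arbitrary example: binary targets against sigmoid outputs.
print(mean_squared_error([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1]))
```

    A lower value means the candidate MLP reproduces the training targets more closely, so the optimizer treats the MSE as a cost to be minimized.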

    The main steps of the DGCO adopted in the DGCO-MLP framework are as follows:

    Step 1. Initialization: The first generation of the population of search agents is randomly initialized. Each search agent designates a potential MLP candidate.

    Step 2. Fitness estimation: The potential of each search agent in the pool of candidates is estimated by using a designated fitness function, also named "cost function". This is realized by assigning the weights and biases stored in each vector that represents a search agent to an MLP. Then, the obtained MLP is evaluated by using training and validation datasets. The heuristic-based training aims to find an MLP candidate that produces the minimal MSE during the training phase.

    Step 3. Updating the population: The positions of the search agents are dynamically updated by using the equations described in Section 4.

    Steps 2 and 3 are repeated until an adopted stopping criterion is reached. The MLP with the optimal fitness score is returned as the solution for the optimization problem. That MLP is then validated on an unseen testing subset.

    Figure 4 shows the main steps of the DGCO-MLP approach for MLP learning.

    Figure 4.  Main chart of the DGCO-MLP framework.

    We evaluated the proposed DGCO-based approach for training MLPs by using five standard benchmarking datasets for classification. We selected those datasets from the University of California at Irvine machine learning repository [47] and the Kaggle repository [48]. Table 1 shows some parameters of these datasets in terms of the number of training and testing samples, classes, and features. The datasets differ in these characteristics; therefore, we created several MLP models with different numbers of input units to validate the DGCO-MLP against each dataset. Table 2 depicts the MLP structure for each examined dataset. In the literature, several approaches are used to select the number of perceptrons in the hidden layer, with no standard that guarantees superiority. In our simulations, we adopted the formula in Eq (12) to decide the number of hidden units in the hidden layer of the examined MLP:

    Table 1.  Datasets for classification to validate the evolutionary optimization approach.
    Dataset | # Classes | # Features | Training Samples | Testing Samples
    Parkinson's | 2 | 22 | 130 | 65
    Ionosphere | 2 | 34 | 234 | 127
    Hepatitis | 2 | 10 | 104 | 51
    Vertebral | 2 | 6 | 208 | 103
    Water Potability | 2 | 9 | 1529 | 764

    Table 2.  MLP structure for each examined dataset.
    Dataset | # Features | MLP Structure
    Parkinson's | 22 | 22-12-1
    Ionosphere | 34 | 34-18-1
    Hepatitis | 10 | 10-6-1
    Vertebral | 6 | 6-4-1
    Water Potability | 9 | 9-5-1

    $h = \frac{n + m}{2}$ (12)

    where n is the number of input units, which, in our case, is equal to the number of features in the dataset, and m is the number of output units.
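
    The hidden-layer sizes in Table 2 can be reproduced from Eq (12) with the short sketch below. Rounding up to the next integer is an assumption inferred from the table values (for example, 22 features give 12 hidden units); the text does not state the rounding rule, and `hidden_units` is a hypothetical name.

```python
import math


def hidden_units(n, m=1):
    """Eq (12) with assumed rounding up to the next integer."""
    return math.ceil((n + m) / 2)


# Feature counts from Table 2, each with a single output unit.
for name, n in [("Parkinson's", 22), ("Ionosphere", 34), ("Hepatitis", 10),
                ("Vertebral", 6), ("Water Potability", 9)]:
    print(f"{name}: {n}-{hidden_units(n)}-1")
```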

    We validated and compared the accuracy of the proposed DGCO-MLP framework against the classical gradient descent backpropagation algorithm and other commonly used heuristic evolutionary optimizers, namely GWO, PSO, and the GA. Table 3 lists the controlling parameters of the examined evolutionary optimizers with their related initial values.

    Table 3.  Controlling parameters of the examined evolutionary optimizers.
    Algorithm | Parameter | Value
    GWO | population size | 200
    GWO | maximum number of iterations | 350
    GWO | randomization interval | [–1, 1]
    PSO | population size | 200
    PSO | maximum number of iterations | 350
    PSO | acceleration constants c1, c2 | {2.1, 2.1}
    PSO | inertia weight values | [0.6, 0.9]
    GA | population size | 200
    GA | maximum number of iterations | 350
    GA | crossover probability | 0.9
    GA | mutation rate | 0.1
    GA | selection mechanism | roulette wheel
    DGCO | population size | 200
    DGCO | maximum number of iterations | 350
    DGCO | randomization interval | [–1, 1]
    DGCO | initial exploration rate | 70%

    In our simulations, we used Python as the primary programming language to implement our version of the DGCO and all of the examined optimization algorithms. Training and validation routines were also implemented and validated with Python frameworks and libraries. We benefited from extensive Python libraries and tools such as the Scikit-learn 0.24.2 packages, which require Python 3.6 or newer [49]. Scikit-learn provides many machine learning algorithms, preprocessing techniques, and evaluation tools. All simulation-related tasks, including the training and testing of the examined optimizers and models, were run on a machine with an Intel® Core™ i7-2720QM CPU at 2.20 GHz and 4.00 GB of RAM.

    All examined datasets were divided into 70% for training and 30% for testing. In a preprocessing step, we normalized each dataset's feature values to eliminate the effect of features with different scales in their values. We adopted the min-max normalization described by Eq (13).

    $p_{scaled} = \frac{p - \min(P)}{\max(P) - \min(P)}$ (13)

    where $p$ is the original value of feature $P$, and $\max(P)$ and $\min(P)$ are the maximum and minimum values of $P$, respectively.
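
    The min-max normalization of Eq (13) can be sketched for a single feature column as follows. Mapping a constant column to zeros is an assumption added here to avoid division by zero, and `min_max_scale` is a hypothetical name.

```python
def min_max_scale(column):
    """Eq (13): rescale one feature column to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(p - lo) / (hi - lo) for p in column]


print(min_max_scale([2, 4, 6]))
```

    In practice, the minimum and maximum are usually taken from the training split and reused for the testing split, although the text does not say which convention was followed.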

    All optimization experiments were executed for thirty runs, each with 350 iterations (generations). The simulations were run with the standard DGCO parameter values described in [18].

    Table 4 shows the statistical scores obtained over the 30 runs of each optimizer on each benchmarking dataset, namely the average (AVG), the standard deviation (STD), the best accuracy (BEST), the worst score (WORST), and the median (MEDIAN).
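
    The five statistics reported per optimizer and dataset can be computed from a list of per-run accuracies as sketched below. The use of the population standard deviation is an assumption, since the text does not say whether the sample or population form was used; the function name and the sample accuracies are illustrative only.

```python
import statistics


def summarize_runs(accuracies):
    """Summarize the test accuracies collected over repeated runs."""
    return {
        "AVG": statistics.mean(accuracies),
        "STD": statistics.pstdev(accuracies),  # assumed population form
        "BEST": max(accuracies),
        "WORST": min(accuracies),
        "MEDIAN": statistics.median(accuracies),
    }


# Made-up accuracies for three runs, for demonstration only.
print(summarize_runs([0.6, 0.7, 0.8]))
```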

    Table 4.  Performance scores of tested trainers. Scores in bold rank first, and the underlined rank second.

    | Dataset | Score | BP | GWO | PSO | GA | DGCO |
    |---|---|---|---|---|---|---|
    | Parkinson's | AVG | 0.65448 | 0.71791 | 0.72089 | 0.71791 | 0.74039 |
    | | STD | 0.01626 | 0.01990 | 0.02227 | 0.02213 | 0.02582 |
    | | BEST | 0.68656 | 0.77611 | 0.77611 | 0.76119 | 0.80597 |
    | | WORST | 0.64179 | 0.68656 | 0.68656 | 0.68656 | 0.70149 |
    | | MEDIAN | 0.64179 | 0.71641 | 0.71641 | 0.71641 | 0.73380 |
    | Ionosphere | AVG | 0.86416 | 0.95416 | 0.93241 | 0.88875 | 0.86061 |
    | | STD | 0.11939 | 0.01365 | 0.04871 | 0.06663 | 0.07192 |
    | | BEST | 0.95000 | 0.98333 | 0.96666 | 0.96666 | 0.97800 |
    | | WORST | 0.55 | 0.93333 | 0.75 | 0.73333 | 0.67333 |
    | | MEDIAN | 0.9125 | 0.95416 | 0.94166 | 0.90416 | 0.89133 |
    | Hepatitis | AVG | 0.66666 | 0.66132 | 0.66320 | 0.66698 | 0.69918 |
    | | STD | 0.00000 | 0.02966 | 0.04297 | 0.04468 | 0.03631 |
    | | BEST | 0.66667 | 0.73585 | 0.75472 | 0.75662 | 0.78872 |
    | | WORST | 0.66666 | 0.603773585 | 0.58490 | 0.56603 | 0.67490 |
    | | MEDIAN | 0.66666 | 0.66037 | 0.66981 | 0.66981 | 0.69811 |
    | Vertebral | AVG | 0.70755 | 0.817925 | 0.816038 | 0.83396 | 0.82316 |
    | | STD | 0.00000 | 0.026525 | 0.024003 | 0.026928 | 0.021512 |
    | | BEST | 0.70755 | 0.84905 | 0.84905 | 0.88679 | 0.86449 |
    | | WORST | 0.70755 | 0.754717 | 0.745283 | 0.792453 | 0.79266 |
    | | MEDIAN | 0.71 | 0.82 | 0.82 | 0.83 | 0.82 |
    | Water Potability | AVG | 0.60124 | 0.61333 | 0.60000 | 0.60729 | 0.64476 |
    | | STD | 0.00877 | 0.02818 | 0.02572 | 0.02961 | 0.04716 |
    | | BEST | 0.60320 | 0.66667 | 0.63333 | 0.66250 | 0.83167 |
    | | WORST | 0.56395 | 0.53333 | 0.52916 | 0.55 | 0.61233 |
    | | MEDIAN | 0.60319 | 0.61666 | 0.60416 | 0.6125 | 0.63916 |


    Table 5 shows each examined trainer's average run time (in seconds), including the heuristic optimizers and the backpropagation learning algorithm. The run time scores in each row were computed as the average elapsed time of an optimizer over thirty runs on each dataset. For heuristic optimizers, each run consists of 350 iterations (generations) of evolution. For the gradient-based trainer, each run consisted of 350 epochs of backpropagation training.

    Table 5.  Average elapsed time (in seconds) of optimizers against each examined dataset over the simulation runs. Scores in bold indicate the fastest, and the underlined rank second.

    | Dataset | BP | GWO | PSO | GA | DGCO |
    |---|---|---|---|---|---|
    | Parkinson's | 0.369 | 369.006 | 294.528 | 236.692 | 178.086 |
    | Ionosphere | 0.508 | 518.267 | 401.460 | 230.194 | 229.363 |
    | Hepatitis | 0.193 | 113.812 | 111.629 | 114.144 | 93.058 |
    | Vertebral | 0.152 | 103.754 | 97.330 | 108.404 | 89.642 |
    | Water Potability | 0.065 | 156.262 | 145.451 | 151.856 | 131.714 |


    The scores in Table 4 show that, on three of the five datasets (Parkinson's, Hepatitis, and Water Potability), the DGCO-based trainer outperforms the other heuristic and gradient-based trainers, producing the best performing MLP both in the average of correct classifications over the runs (AVG) and in the best single run (BEST). On the remaining datasets, the DGCO trainer remained competitive, ranking second (underlined scores) in AVG and BEST for the Vertebral dataset and in BEST for the Ionosphere dataset.

    For the Parkinson's, Hepatitis, and Water Potability datasets, the DGCO-based framework improved the average accuracy by 2.7%, 4.83%, and 5.13%, respectively, relative to the second best performing optimizer. The improvement for Water Potability is given by Eq (14):

    $$5.13\% = \frac{0.64476 - 0.61333}{0.61333} \times 100 \qquad (14)$$
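    The same calculation can be repeated for all three datasets with the short sketch below, which uses the AVG values from Table 4. The function name `relative_improvement` is hypothetical, and depending on rounding the printed values may differ in the last digit from those quoted in the text.

```python
def relative_improvement(best, second_best):
    """Relative gain of `best` over `second_best`, in percent (form of Eq (14))."""
    return (best - second_best) / second_best * 100.0


# (DGCO AVG, second best AVG) taken from Table 4.
avg_scores = {
    "Parkinson's": (0.74039, 0.72089),       # second best: PSO
    "Hepatitis": (0.69918, 0.66698),         # second best: GA
    "Water Potability": (0.64476, 0.61333),  # second best: GWO
}

improvements = {
    name: relative_improvement(dgco, runner_up)
    for name, (dgco, runner_up) in avg_scores.items()
}

for name, value in improvements.items():
    print(f"{name}: {value:.2f}%")
```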

    To quantify the statistical significance of the observed results, we applied the Student's t-test [50]. This test checks whether the differences between the mean values in Table 4 are significant in the cases where the DGCO-MLP optimizer performs best. We used a significance level of 0.05, which corresponds to a 5% risk of concluding that a difference exists when there is no actual difference between the examined means. Table 6 shows the results of the Student's t-test for the Parkinson's, Hepatitis, and Water Potability datasets, on which the DGCO-MLP ranks first.

    Table 6.  Student's t-test as a statistical significance test.

    | Dataset | Quantity | Variable 1 | Variable 2 |
    |---|---|---|---|
    | Parkinson's | Optimizer | PSO | DGCO |
    | | Observations | 30 | 30 |
    | | Mean | 0.72089 | 0.74039 |
    | | Variance | 0.000495 | 0.000666 |
    | | P(T ≤ t) one-tail | 0.01003 < 0.05 | |
    | | P(T ≤ t) two-tail | 0.02006 < 0.05 | |
    | Hepatitis | Optimizer | GA | DGCO |
    | | Observations | 30 | 30 |
    | | Mean | 0.66698 | 0.69918 |
    | | Variance | 0.001996 | 0.001555 |
    | | P(T ≤ t) one-tail | 0.01030 < 0.05 | |
    | | P(T ≤ t) two-tail | 0.02060 < 0.05 | |
    | Water Potability | Optimizer | GWO | DGCO |
    | | Observations | 30 | 30 |
    | | Mean | 0.61333 | 0.64476 |
    | | Variance | 0.000794 | 0.002224 |
    | | P(T ≤ t) one-tail | 0.00732 < 0.05 | |
    | | P(T ≤ t) two-tail | 0.01464 < 0.05 | |
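
    As a rough illustration of the test, the sketch below computes a pooled-variance two-sample t statistic from the means, variances, and observation counts listed in Table 6 and compares it with a tabulated critical value. The function name `two_sample_t` is hypothetical, the pooled-variance form is an assumption about which variant of the test was applied, and the critical value 2.0017 (two-tailed, significance level 0.05, 58 degrees of freedom) is a standard table value rather than a figure from this article. The sketch does not compute p-values, so it is not expected to reproduce the P(T ≤ t) entries of the table.

```python
import math


def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Pooled-variance Student's t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / std_err, df


# Two-tailed critical value for alpha = 0.05 and df = 58 (standard table value;
# an assumption of this sketch, not a number reported in the article).
T_CRITICAL = 2.0017

# (mean, variance) of the runner-up and of the DGCO, as listed in Table 6.
cases = {
    "Parkinson's": ((0.72089, 0.000495), (0.74039, 0.000666)),
    "Hepatitis": ((0.66698, 0.001996), (0.69918, 0.001555)),
    "Water Potability": ((0.61333, 0.000794), (0.64476, 0.002224)),
}

t_stats = {}
for name, ((m1, v1), (m2, v2)) in cases.items():
    t_value, df = two_sample_t(m1, v1, 30, m2, v2, 30)
    t_stats[name] = t_value
    print(f"{name}: t = {t_value:.3f}, df = {df}, significant = {abs(t_value) > T_CRITICAL}")
```

    Under these assumptions, the statistic exceeds the critical value on all three datasets, which agrees with the conclusion that the differences are significant at the 0.05 level.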


    The comparative classification scores provided by the DGCO trainer indicate that the DGCO-MLP framework is a competitive optimizer that can prevent premature convergence toward local minima in the search for the best parameters of an MLP model, namely the synaptic weights and biases. We attribute this to the exploration capabilities of the DGCO in search spaces with either moderate or high numbers of dimensions. Furthermore, the DGCO-MLP trainer remains competitive on datasets with limited training data records, as illustrated in Tables 1 and 4. Small training datasets are an open challenge in machine learning and data science projects, particularly when researchers apply gradient-based trainers.
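    To make the search space concrete, the sketch below shows one common way to evaluate a candidate solution in this kind of framework: a flat vector of weights and biases is unpacked into a single-hidden-layer MLP, and the MSE of its responses is returned as the value to minimize. The vector layout, the sigmoid activations, the single output unit, and the function name `mlp_mse` are assumptions made for illustration and are not taken from the article's implementation.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_mse(vector, samples, targets, n_inputs, n_hidden):
    """MSE of a one-hidden-layer MLP whose parameters are packed in `vector`.

    Assumed layout: for each hidden unit, n_inputs weights followed by one
    bias; then n_hidden output weights followed by one output bias.
    """
    expected = n_hidden * (n_inputs + 1) + (n_hidden + 1)
    if len(vector) != expected:
        raise ValueError(f"expected {expected} parameters, got {len(vector)}")

    pos = 0
    hidden_params = []
    for _ in range(n_hidden):
        weights = vector[pos:pos + n_inputs]
        bias = vector[pos + n_inputs]
        hidden_params.append((weights, bias))
        pos += n_inputs + 1
    out_weights = vector[pos:pos + n_hidden]
    out_bias = vector[pos + n_hidden]

    total = 0.0
    for x, y in zip(samples, targets):
        hidden = [
            sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
            for ws, b in hidden_params
        ]
        output = sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)
        total += (output - y) ** 2
    return total / len(samples)


# Hypothetical toy problem: 2 inputs, 2 hidden units, hence 9 parameters.
X = [[0.0, 1.0], [1.0, 0.0]]
y = [0.0, 1.0]
zero_vector = [0.0] * 9
baseline = mlp_mse(zero_vector, X, y, n_inputs=2, n_hidden=2)
```

    A population-based optimizer only needs this scalar value for each candidate vector, which is why no gradient of the error function is required.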

    The average run times in Table 5 show that the gradient-based backpropagation learner is much faster than the competing heuristic optimizers. This is expected, since global optimizers must evaluate the objective function for every solution in the evolving population in each iteration (generation), whereas the gradient descent learner only computes the gradients of an error function. Among the heuristic optimizers, however, the DGCO-MLP optimizer has the shortest average run time on every dataset in Table 5, although the margin over the GA on the Ionosphere dataset is small. This may be explained by the dynamic mechanism that controls exploration in the DGCO throughout the evolution of the population, up to the final steps of the optimizer.

    Figure 5 shows the convergence curves for the GWO, GA, PSO, and DGCO optimizers on each of the five benchmarking datasets. The curves show the MSEs produced by the MLPs optimized by the four optimizers for a randomly selected run among the thirty independent runs. For clarity, Figure 5 shows the MSE for the first 30 training iterations on each benchmark dataset. The charts indicate that the DGCO converges quickly in the first iterations for two datasets. For the other classification datasets, the DGCO shows competitive performance compared to the best method in each case.

    Figure 5.  MSE convergence curves for evolutionary optimizers on different classification datasets.

    Figure 6 shows the boxplots for each of the benchmarking classification datasets. The boxplots are depicted for 30 MSEs that were obtained by each optimizer at the beginning of the training phase. Each box spans the interquartile range, the whiskers depict the minimum and maximum MSEs, and the bar inside each box marks the median. Small circles represent outliers, where they exist.

    Figure 6.  Boxplot charts for different classification datasets.

    The dynamic change in the number of individuals in each of the two sub-groups, exploration and exploitation, at each iteration is the main characteristic that empowers the DGCO algorithm. This feature ensures a progressive exploration of the promising areas in the search space, which helps the model avoid local minima. The algorithm uses the convergence history of the individuals to adjust the number of individuals in each sub-group, which balances exploration and exploitation. In addition, the exploitation formula permits a rigorous investigation of the local neighbors of the current best agent in the exploitation team. The convergence of the DGCO-MLP trainer is ensured by the elitism mechanism, which holds the current best agent B* and transfers it to the next population. Moreover, the search agents in the DGCO tend to search locally around the promising candidate agents. The cooperative behavior of the exploration agents in each generation gives the evolutionary algorithm a strong exploration capability, which improves its local minimum avoidance.
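    The two ideas in this paragraph, elitism and a dynamically sized exploration group, can be illustrated with the toy search loop below. It is deliberately not the DGCO: the Gaussian move rules, the step sizes, the sphere objective, and the rule that shrinks the exploration group after an improvement and enlarges it after stagnation are all assumptions chosen for brevity, and the published update equations in [18] should be consulted for the actual algorithm.

```python
import random


def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


def toy_two_group_search(objective, dim, pop_size=20, iterations=50, seed=0):
    """Toy population search with elitism and a resizable exploration group."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    best = min(population, key=objective)
    best_fit = objective(best)
    n_explore = pop_size // 2
    history = [best_fit]

    for _ in range(iterations):
        # Elitism: the current best agent is copied unchanged into the next population.
        new_population = [best[:]]
        for i in range(1, pop_size):
            if i <= n_explore:
                # Exploration group: wide moves from randomly chosen agents.
                base, step = rng.choice(population), 1.0
            else:
                # Exploitation group: small moves around the current best agent.
                base, step = best, 0.1
            new_population.append([v + rng.gauss(0.0, step) for v in base])
        population = new_population

        candidate = min(population, key=objective)
        candidate_fit = objective(candidate)
        if candidate_fit < best_fit:
            best, best_fit = candidate, candidate_fit
            # Progress was made: shift one agent toward exploitation (assumed rule).
            n_explore = max(1, n_explore - 1)
        else:
            # Stagnation: shift one agent toward exploration (assumed rule).
            n_explore = min(pop_size - 2, n_explore + 1)
        history.append(best_fit)

    return best, best_fit, history


best, best_fit, history = toy_two_group_search(sphere, dim=5)
```

    Because the best agent always survives, the recorded best objective value can never increase from one generation to the next, which is the property the elitism mechanism is meant to guarantee.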

    The results indicate that most of the benchmarked evolutionary algorithms, including the DGCO, achieve competitive scores on datasets with a small number of training data records, as is the case for most of the examined datasets. Training NNs on small datasets remains a significant challenge in machine learning.

    In this research, we proposed and validated an EC framework to train MLP models. The consistent convergence and the high local optima avoidance were the essential motivations for adopting the recently proposed evolutionary cooperative optimization algorithm, the DGCO. The objective of the optimization algorithm was to minimize the MSE of MLP responses by optimizing the synaptic weights and biases of those MLPs through EC. Owing to the different numbers of features in each of the five examined datasets, we built an MLP model with a particular structure for each dataset to validate our optimization framework against that dataset. Therefore, each MLP had its particular number of input and hidden perceptrons to suit the classification problem described in each dataset. We compared the performance of the proposed DGCO-based trainer against the conventional backpropagation algorithm and several commonly used evolutionary optimization algorithms, namely the GA, GWO, and PSO. The simulation results showed the competitive performance of the proposed optimizer against the other examined ones on most of the benchmarked datasets in terms of overall performance and convergence. The DGCO showed its ability to avoid local minima in the loss landscape of the objective function, which is, in our case, the MSE of an MLP response. This is due to the ability of the DGCO to balance exploration and exploitation by applying its dynamic mechanism to manage the number of agents in the two cooperative teams. The simulation results indicate that the DGCO-MLP trainer is reliable for training MLPs to classify datasets with different difficulty levels, including different numbers of features and available training data records.

    The DGCO-MLP performed well on the task of optimizing MLP models with a competitive run time on all examined datasets. Moreover, it outperformed the benchmarked optimizers on the task of training MLP models on very limited datasets that include fewer than 135 data records for training. Optimizing MLPs with limited data is still an open challenge.

    A minor limitation of our current work is that it studies the optimization of only one type of artificial NN, namely the MLP. Another limitation is that we did not explore the capacity of our framework in engineering applications of classification, and that we compared it against commonly used heuristic optimizers from the literature but not against hybrid optimization approaches. Hence, extending the current simulations to different datasets with more features and many data records is one of the next steps for future work. Another perspective involves applying the proposed framework to optimize MLPs for regression and forecasting tasks. Combining the DGCO evolutionary algorithm with other types of NNs can also be a valuable contribution.

    The authors declare that they have not used artificial intelligence tools in the creation of this article.

    The authors declare that there is no conflict of interest.



    [1] Hasanudin H, Asri WR, Zulaikha IS, et al. (2022) Hydrocracking of crude palm oil to a biofuel using zirconium nitride and zirconium phosphide-modified bentonite. RSC Adv 12: 21916–21925. https://doi.org/10.1039/d2ra03941a
    [2] Moreira JdeBD, Rezende DBde, Pasa VMD (2020) Deoxygenation of Macauba acid oil over Co-based catalyst supported on activated biochar from Macauba endocarp: A potential and sustainable route for green diesel and biokerosene production. Fuel 269: 1–12. https://doi.org/10.1016/j.fuel.2020.117253
    [3] Cheng S, Wei L, Zhao X, et al. (2015) Directly catalytic upgrading bio-oil vapor produced by prairie cordgrass pyrolysis over Ni/HZSM-5 using a two stage reactor. AIMS Energy 3: 227–240. https://doi.org/10.3934/energy.2015.2.227
    [4] Simanjuntak W, Pandiangan KD, Sembiring Z, et al. (2021) The effect of crystallization time on structure, microstructure, and catalytic activity of zeolite-A synthesized from rice husk silica and food grade aluminum foil. Biomass Bioenergy 148: 106050–106056. https://doi.org/10.1016/j.biombioe.2021.106050
    [5] Abatyough MT, Ajibola VO, Agbaji EB, et al. (2022) Properties of upgraded bio-oil from pyrolysis of waste corn cobs. J Sustainability Environ Manage 1: 120–128. https://doi.org/10.3126/josem.v1i2.45348
    [6] Phromphithak S, Onsree T, Saengsuriwong R, et al. (2021) Compositional analysis of bio-oils from hydrothermal liquefaction of tobacco residues using two-dimensional gas chromatography and time-of-flight mass spectrometry. Sci Prog 104: 1–12. https://doi.org/10.1177/00368504211064486
    [7] Pandiangan KD, Simanjuntak W, Avista D, et al. (2022) Synthesis of hydroxy-sodalite from rice husk silica and food-grade aluminum foil as a catalyst for biomass pyrolysis. Trends Sci 19: 1–11. https://doi.org/10.48048/tis.2022.6252
    [8] Attia M, Farag S, Chaouki J (2020) Upgrading of oils from biomass and waste: Catalytic hydrodeoxygenation. Catalysts 10: 1–30. https://doi.org/10.3390/catal10121381
    [9] Chen LH, Yoshikawa K (2018) Bio-oil upgrading by cracking in two-stage heated reactors. AIMS Energy 6: 203–315. https://doi.org/10.3934/energy.2018.1.203
    [10] Garba MU, Musa U, Olugbenga AG, et al. (2018) Catalytic upgrading of bio-oil from bagasse: Thermogravimetric analysis and fixed bed pyrolysis. Beni-Suef Univ J Basic Appl Sci 7: 776–781. https://doi.org/10.1016/j.bjbas.2018.11.004
    [11] Efeovbokhan VE, Ayeni AO, Eduvie OP, et al. (2020) Classification and characterization of bio-oil obtained from catalytic and non-catalytic pyrolysis of desludging sewage sample. AIMS Energy 8: 1088–1107. https://doi.org/10.3934/energy.2020.6.1088
    [12] Yan P, Azreena IN, Peng H, et al. (2023) Catalytic hydropyrolysis of biomass using natural zeolite-based catalysts. Chem Eng J 476: 146630–146642. https://doi.org/10.1016/j.cej.2023.146630
    [13] Liu X, Mäki-Arvela P, Aho A, et al. (2018) Zeta potential of beta zeolites: Influence of structure, acidity, pH, temperature and concentration. Molecules 23: 1–14. https://doi.org/10.3390/molecules23040946
    [14] Hasanudin H, Asri WR, Permatahati U, et al. (2023) Conversion of crude palm oil to biofuels via catalytic hydrocracking over NiN-supported natural bentonite. AIMS Energy 11: 197–212. https://doi.org/10.3934/energy.2023011
    [15] Verdoliva V, Saviano M, Luca SDe (2019) Zeolites as acid/basic solid catalysts: Recent synthetic developments. Catalysts 9: 1–21. https://doi.org/10.3390/catal9030248
    [16] Bahgaat AK, Hassan HE, Melegy AA, et al. (2020) Synthesis and characterization of zeolite-Y from natural clay of Wadi Hagul, Egypt. Egypt J Chem 63: 3791–3800. https://doi.org/10.21608/ejchem.2020.23195.2378
    [17] Xu Y, Cai L, Shao L, et al. (2012) Preparation and characterization of NaY zeolite in a rotating packed bed. Pet Sci 9: 106–109. https://doi.org/10.1007/s12182-012-0190-0
    [18] Costa PA, Barreiros MA, Mouquinho AI, et al. (2022) Slow pyrolysis of cork granules under nitrogen atmosphere: by-products characterization and their potential valorization. Biofuel Res J 9: 1562–1572. https://doi.org/10.18331/BRJ2022.9.1.3
    [19] Yoo ML, Park YH, Park YK, et al. (2016) Catalytic pyrolysis of wild reed over a zeolite-based waste catalyst. Energies 9: 1–9. https://doi.org/10.3390/en9030201
    [20] Mohammed IY, Kazi FK, Yusup S, et al. (2016) Catalytic intermediate pyrolysis of Napier grass in a fixed bed reactor with ZSM-5, HZSM-5 and zinc-exchanged zeolite-a as the catalyst. Energies 9: 1–17. https://doi.org/10.3390/en9040246
    [21] Mortensen PM, Grunwaldt JD, Jensen PA, et al. (2011) A review of catalytic upgrading of bio-oil to engine fuels. Appl Catal A: Gen 407: 1–19. https://doi.org/10.1016/j.apcata.2011.08.046
    [22] Pattiya A, Titiloye JO, Bridgwater AV (2008) Fast pyrolysis of cassava rhizome in the presence of catalysts. J Anal Appl Pyrol 81: 72–79. http://dx.doi.org/10.1016/j.jaap.2007.09.002
    [23] Wei B, Jin L, Wang D, et al. (2020) Catalytic upgrading of lignite pyrolysis volatiles over modified HY zeolites. Fuel 259: 116234. https://doi.org/10.1016/j.fuel.2019.116234
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)