1. Introduction
SVM [1] is a classical machine learning method. Because of its excellent performance, SVM is widely used in many application scenarios, such as text classification [2,3,4], facial recognition [5], and pedestrian detection [6,7]. However, the performance of SVM depends heavily on the selection of the kernel function and the choice of its parameters. Regarding the former, the commonly used kernel functions are the linear kernel, polynomial kernel, sigmoid kernel, Gaussian kernel (also called the radial basis function kernel), and Laplace kernel. Many studies have shown that kernel function selection is closely related to the characteristics of the data, so the appropriate kernel function is chosen according to the data features. Among these five kernel functions, the Gaussian kernel is the most popular. As for kernel parameter choice, there are four main approaches: cross-validation, minimizing an upper bound on the error rate of the algorithm, optimizing kernel function metrics, and optimization algorithms (such as swarm intelligence optimization algorithms). Among them, using swarm intelligence optimization algorithms to find the best kernel parameters is an effective solution.
Swarm intelligence optimization algorithms are a class of simple, flexible and adaptive meta-heuristic algorithms, which were inspired by the social behavior of biological individuals. Generally speaking, there is no standard categorization of swarm intelligence algorithms. For convenience, we classify them into four categories: evolution-based algorithms, physical & mathematical-based algorithms, human-based algorithms, and animal & plant-based algorithms.
Evolution-based algorithms randomly update and replace individuals by modeling the selection, crossover, and mutation rules of biological genetics. The primary algorithms are the genetic algorithm (GA) [8], differential evolution algorithm (DE) [9], grey prediction evolution algorithm (GPE) [10], and geometric probabilistic evolutionary algorithm (GPEA) [11].
Physical & mathematical-based algorithms are constructed from real-life physical phenomena or mathematical principles. The primary algorithms are simulated annealing algorithm (SA) [12], sine cosine algorithm (SCA) [13], gravitational search algorithm (GSA) [14], atomic search optimization algorithm (ASO) [15], artificial electric field algorithm (AEFA) [16], and optical microscope algorithm (OMA) [17].
Human-based algorithms are based on human mental activity or social behavior. The primary algorithms are teaching-learning based optimization (TLBO) [18], socio evolution and learning optimization (SELO) [19], human learning optimization algorithm (HLO) [20], and student psychology based optimization algorithm (SPBO) [21].
Animal & plant-based algorithms form the most numerous class of swarm intelligence algorithms. These algorithms are founded on the behavior of biological populations in nature, such as predation, reproduction, and competition for territory. The primary algorithms are the gray wolf optimization algorithm (GWO) [22], tree seed algorithm (TSA) [23], whale optimization algorithm (WOA) [24], Harris hawks optimization (HHO) [25], golden jackal optimization (GJO) [26], and northern goshawk optimization (NGO) [27]. To date, these algorithms have been improved by many scholars to solve various problems. For instance, Ma et al. [28] proposed a gray wolf optimizer based on the Aquila exploration method (AGWO), which expands the search range to improve global search ability and reduce the possibility of falling into local optima. Kang et al. [29] combined HHO with a Brownian motion-based mutation strategy to generate a new optimization algorithm (HHOBM), which helps HHO avoid the local optimum trap when optimizing non-convex functions. Lou et al. [30] introduced a hybrid strategy-based golden jackal optimization algorithm (HGJO) to balance global and local search capabilities, and applied it successfully to robot path planning. Li et al. [31] came up with a multi-strategy enhanced northern goshawk optimization algorithm (MENGO) to address NGO's slow convergence rate and tendency to fall into local optima in some cases. Lin et al. [32] suggested a niching hybrid heuristic whale optimization algorithm (NHWOA) to enhance convergence speed and search coverage; experimental results showed that it performs well on global optimization problems.
In 2022, Hashim and Hussien proposed the snake optimizer (SO) [33], which has good optimization capability, fast convergence speed, and a wide search range [34,35]; many improved SO algorithms have since been developed [36,37,38,39,40]. SO is used in a wide range of applications. For instance, Li et al. [41] proposed a snake optimization-based variable-step multi-scale single threshold slope entropy for classifying different categories of real-world signals. Nevertheless, SO also has some disadvantages, such as slow convergence in the early stage and a tendency to fall into local optima. Some scholars have improved SO in view of these problems. For example, Hu et al. [42] proposed a multi-strategy boosted snake-inspired optimizer (BEESO) built on three improvement methods: bidirectional search, modified evolutionary population dynamics, and elite opposition-based learning. However, it still tends to fall into local optima on some test functions. Yao et al. [43] proposed an enhanced snake optimizer (ESO) that utilizes four strategies: a mirror imaging strategy based on convex lens imaging, a parameter dynamic update strategy, sine-cosine composite perturbation factors, and tent-chaos & Cauchy mutation. Its experimental results are much better than those of BEESO; however, the early-stage convergence speed of ESO could still be improved.
In this paper, we combine the advantages of BEESO and ESO to propose an improved snake optimizer (ISO) based on a mirror opposition-based learning mechanism (MOBL), a novel evolutionary population dynamics model (NEPD), and a differential evolution strategy (DES). To validate the effectiveness of ISO, a classical benchmark function test experiment, a CEC2022 test experiment, an ablation experiment, and an SVM parameter optimization experiment are conducted. The main contributions of this paper are as follows:
• Based on ESO, BEESO, and DE algorithms, we propose an ISO, which mainly focuses on the population initialization before the exploration phase of SO, egg hatching mode during the exploitation phase, and the final elimination process.
• To illustrate the effectiveness of ISO, we test it against 12 other algorithms for comparison on 23 classical benchmark functions and CEC2022. We also perform the ablation experiment to discuss the impact of three improvement strategies and their combinations on SO.
• We apply ISO to the problem of parameter optimization for SVM and compare it with optimization methods based on other animal & plant-based algorithms.
The remainder of this article is organized as follows. Section 2 briefly introduces the SVM and SO algorithm. Section 3 describes the improved SO algorithm, which includes the mentioned three improvement strategies above. Section 4 summarizes the performances of ISO and other swarm intelligence optimization algorithms. The conclusion is shown in Section 5.
2. Preliminaries
2.1. SVM
SVM is a machine learning method based on statistical learning theory, mainly aimed at classification and regression problems. It has attracted increasing attention from scholars and has become a mainstream technology and standard tool in the field of machine learning. Formally, for a set of training samples (xi,yi), xi∈Rn, yi∈{+1,−1}, i=1,…,m, if the classification surface ωTx+b=0 (where ω is the normal vector and b is the bias term) can correctly classify the training samples into two categories, then the sum of the minimum distances from the two categories to the optimal classification surface should be maximized. The optimal classification surface can be obtained by solving the following optimization problem:
where C is the penalty factor, the role of which is to strike a balance between model complexity and learning capacity, and ξi is the error term.
Using the Lagrange multiplier method, we can solve the above quadratic programming problem with linear constraints and obtain the Wolfe dual problem:
where αi is the Lagrange multiplier.
After solving the above optimization problem, we obtain the decision function:
For nonlinear problems, assume that there is a nonlinear mapping φ:Χ→F, which maps samples from the input space to a high-dimensional feature space F and is implicitly defined by a kernel function. In this article, we choose to use the Gaussian kernel function because of its better generalization ability. The definition equation takes the following form:
where σ is the width of Gaussian kernel.
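As a sketch, the Gaussian kernel between two samples might be computed as follows; note that the 2σ² parameterization below is one common convention, and some papers instead place σ² alone in the denominator:

```python
import numpy as np

def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    return np.exp(-np.linalg.norm(x - z) ** 2 / (2.0 * sigma ** 2))
```

A larger σ makes the kernel flatter (samples look more similar), while a smaller σ makes the decision boundary more local, which is exactly why σ must be tuned together with C.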
After selecting the type of kernel function, the dual problem becomes:
and the corresponding decision function is as follows:
In order to obtain a better generalization performance of SVM, we need to optimize the penalty factor C and the width of Gaussian kernel σ.
2.2. Snake optimizer
SO is inspired by the hunting and mating behavior of snakes, and its search process can be divided into two phases: exploration and exploitation. The exploration phase is governed by environmental factors, i.e., temperature and food; in this phase, snakes search for food across the whole space rather than only in their immediate surroundings, which ensures that SO searches as widely as possible. The exploitation phase consists of two transitional modes, i.e., fight mode and mate mode, which are used to improve the search efficiency of SO. In the fight mode, the male snakes battle among themselves to win the best female snake, and each female snake selects the best male snake. In the mate mode, the occurrence of mating behavior depends on the amount of food and the temperature. If mating occurs, the worst positions of the snakes are updated for the next round of iteration.
The following mathematical model represents the basic process of SO.
(Ⅰ) Population initialization
Randomly initialize the population to get the initial position:
where Ui is the position of the ith snake, rand is a random number between 0 and 1, and Umax and Umin are the upper and lower bounds of the search space.
For the population, we first divide them into two groups: male snake group and female snake group, then calculate the fitness of two groups and find the best individual in each group. The best individual in the male snake group is Ubest,m, and the best individual in the female snake group is Ubest,f. Lastly, choose the global best individual Ubest between Ubest,m and Ubest,f.
(Ⅱ) Exploration phase
The exploration and exploitation phases are determined by the food quantity Q and temperature Temp, which are defined by the following formulas:
where c1=0.5, t is the current number of iterations, and T is the maximum number of iterations.
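As a minimal sketch, Q and Temp can be computed as below; the exponential forms follow our reading of the original SO paper [33] (only c1=0.5, t, and T are stated above, so treat the exact expressions as an assumption):

```python
import math

def food_quantity(t, T, c1=0.5):
    """Food quantity Q; grows from c1/e toward c1 as iterations proceed
    (form taken from the original SO paper, with c1 = 0.5 as in the text)."""
    return c1 * math.exp((t - T) / T)

def temperature(t, T):
    """Temperature Temp; decays from 1 toward 1/e over the run
    (form taken from the original SO paper)."""
    return math.exp(-t / T)
```

With these forms, Q < 0.25 holds in the early iterations, which matches the text: exploration happens first, then the algorithm transitions to exploitation.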
When Q<0.25, snakes start to randomly update their positions to find food.
where Ui,m and Ui,f are the updated positions in the male snake group and female snake group, respectively, Urand,m and Urand,f are the random positions in the male snake group and female snake group, respectively, c2=0.05, the sign ± can help the SO explore all possible directions and ensure a certain traversal, and Am and Af are the ability to find food of male snakes and female snakes, respectively.
where frand,m is the fitness of Urand,m and fi,m is the fitness of the ith snake in the male snake group, and frand,f is the fitness of Urand,f and fi,f is the fitness of the ith snake in the female snake group.
(Ⅲ) Exploitation phase
Under the condition of Q>0.25, if Temp>0.6, snakes do not reproduce; instead, they continue to search for food.
where Ui,m and Ui,f are the positions of male snakes and female snakes, respectively, and c3=2. If Temp<0.6, the male and female snakes begin to select each other and mate. Two situations exist during this period.
(ⅰ) Fight mode
The male snakes will compete with each other, as will the female snakes.
where Fm and Ff are the fighting ability of male snakes and female snakes, respectively.
where fbest,f is the fitness of Ubest,f and fbest,m is the fitness of Ubest,m.
(ⅱ) Mate mode
Mating behavior begins between selected male and female snakes.
where Mm and Mf are the mating ability of male snakes and female snakes, respectively.
After the mate mode is completed, SO enters the egg-laying period with a certain probability. This stage allows the worst male snake and the worst female snake to update their positions again.
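The phase logic described above can be sketched as a simple dispatcher. The thresholds 0.25 (food) and 0.6 (temperature) come from the text; the random split between fight and mate is an illustrative assumption, since the text only says both situations exist:

```python
import random

def so_phase(Q, Temp, q_threshold=0.25, temp_threshold=0.6):
    """Decide which SO behavior applies at the current iteration."""
    if Q < q_threshold:
        return "exploration"   # random position updates to find food
    if Temp > temp_threshold:
        return "food_search"   # hot: no reproduction, keep searching for food
    # cold enough and food available: fight or mate
    # (the 50/50 random choice here is an illustrative assumption)
    return "fight" if random.random() < 0.5 else "mate"
```

The pseudo code of SO in Algorithm 1 follows this same branching structure, with position-update equations filled in for each branch.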
The pseudo code of SO is given in Algorithm 1.
3. Proposed method
Compared with other swarm intelligence algorithms, SO has good optimization ability and fast convergence speed. However, it still has some limitations. For instance, its convergence speed is a little slow in the early stage, and it has the tendency to fall into local optimization. To address these shortcomings, this section presents the improved SO algorithm based on multiple improvement strategies.
3.1. MOBL
In many cases, the problem-solving process starts from zero or a random value and approaches the optimal solution. Examples include the weights of a neural network, the population parameters of a swarm intelligence algorithm, and the kernel parameters of SVM. If the random value is near the optimal solution at the beginning, the problem can be solved quickly. However, in the worst case the random value lies opposite to the optimal solution, and the solution process takes a long time.
Generally, it is impossible to obtain a good random value initially without prior knowledge. Logically, the solutions of a problem can be searched from all directions. If the solution produced during the search process and its opposite solution are introduced together as feasible solutions to the problem, the efficiency of searching for the optimal solution will be higher. This is the core idea of opposition-based learning [44], and it can be defined as:
where RUi is the opposite value of Ui.
Based on this theory, Yao et al. [43] proposed a new opposition-based learning, i.e., mirror imaging strategy based on convex lens imaging. The strategy not only improves the optimization accuracy, but also ensures the convergence speed. The definition equation is given as follows:
where q is the mirroring factor, and its definition formula is q=10×(1−2×(t/T)2).
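Both forms can be sketched per dimension as follows. The basic reflection follows the definition above; for MOBL, the mirroring factor q uses the formula given in the text, while the opposite-point expression is the commonly published convex-lens-imaging form, which we assume ESO [43] also uses:

```python
def opposite(u, u_min, u_max):
    """Basic opposition-based learning: reflect u within [u_min, u_max]."""
    return u_max + u_min - u

def mirror_opposite(u, u_min, u_max, t, T):
    """Mirror imaging strategy based on convex lens imaging (MOBL sketch).
    q follows the definition in the text; the opposite-point formula is the
    common convex-lens form (an assumption about the exact variant used)."""
    q = 10.0 * (1.0 - 2.0 * (t / T) ** 2)
    return (u_max + u_min) / 2.0 + (u_max + u_min) / (2.0 * q) - u / q
```

Note that with a large q the mirrored point stays close to the midpoint of the bounds, while q near 1 recovers the basic reflection, which is how the strategy trades exploration against convergence over the iterations.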
3.2. NEPD
By repeatedly testing the performance of SO, we found that the egg-laying period affects the optimization accuracy and sometimes causes SO to fall into a local optimum. Hu et al. [42] offered a solution that modifies evolutionary population dynamics. The method provides two different optimization schemes for the population.
After sorting the population from best to worst, the top half of the individuals are treated as the better individuals and the bottom half as the worse individuals.
(Ⅰ) For the better individuals, we perform evolutionary operation.
where NEUi is the evolutionary value of Ui, Ur1 and Ur2 are different individuals in the current population excluding Ui, E is the scaling factor with formula E=(sin(2π×freq×t)×(t+T)+1)/2, and freq is the vibration frequency of the sinusoidal function, defined as freq=1/Dim.
(Ⅱ) As for the worse individuals, we also evolve them, or else eliminate them.
where r is a random value ranging from 0 to 1.
where fEUi is the fitness of EUi and fUi is the fitness of Ui.
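A minimal sketch of one NEPD pass follows. E and freq use the formulas given above; the difference-vector update itself is an assumed DE-style form, since the text specifies only the scaling factor and the keep-or-eliminate rule for the worse half:

```python
import math
import random

def scale_factor(t, T, dim):
    """Scaling factor E with vibration frequency freq = 1/Dim (formulas from the text)."""
    freq = 1.0 / dim
    return (math.sin(2.0 * math.pi * freq * t) * (t + T) + 1.0) / 2.0

def nepd_evolve(U, fitness, t, T):
    """One NEPD pass over population U (list of lists), minimizing `fitness`.
    Better half: always evolved. Worse half: evolved, kept only on improvement.
    The mutation NEUi = Ui + E*(Ur1 - Ur2) is an assumed form."""
    n, dim = len(U), len(U[0])
    order = sorted(range(n), key=lambda i: fitness(U[i]))  # best first
    E = scale_factor(t, T, dim)
    new_U = [list(u) for u in U]
    for rank, i in enumerate(order):
        r1, r2 = random.sample([j for j in range(n) if j != i], 2)
        trial = [U[i][d] + E * (U[r1][d] - U[r2][d]) for d in range(dim)]
        if rank < n // 2:
            new_U[i] = trial                    # better individuals: evolve
        elif fitness(trial) < fitness(U[i]):
            new_U[i] = trial                    # worse individuals: keep only if improved
    return new_U
```

The greedy acceptance on the worse half is what prevents the elimination step from degrading the population, matching the fEUi-versus-fUi comparison described above.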
3.3. DES
DES is derived from the differential evolution algorithm. In brief, through continuous evolution, superior individuals are retained to guide the search toward the optimal solution. The specific steps are as follows: first, we randomly select two different individuals (Ua and Ub) from the population and subtract one from the other to produce the difference individual. Second, the difference individual is multiplied by a weight and added to a third individual to produce the variant individual. If the fitness value of the variant individual is better than that of the parent individual, the variant individual enters the next iteration; otherwise, the parent individual is retained. The defining equation is given as follows:
where beta is the scale factor and its formula is beta=betamax−t×(betamax−betamin)/T, betamax=0.8, and betamin=0.2.
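The steps above can be sketched as follows, assuming minimization and using the parent as the base individual to which the weighted difference is added (the choice of base is our assumption; the beta schedule is as given in the text):

```python
import random

def des_step(U, fitness, t, T, beta_max=0.8, beta_min=0.2):
    """One DES pass: DE-style mutation plus greedy selection against the parent."""
    beta = beta_max - t * (beta_max - beta_min) / T   # linearly decreasing scale factor
    n, dim = len(U), len(U[0])
    new_U = []
    for i in range(n):
        a, b = random.sample([j for j in range(n) if j != i], 2)
        # difference individual (Ua - Ub), weighted by beta, added to the parent
        variant = [U[i][d] + beta * (U[a][d] - U[b][d]) for d in range(dim)]
        # greedy selection: keep whichever of variant and parent is fitter
        new_U.append(variant if fitness(variant) < fitness(U[i]) else U[i])
    return new_U
```

Because selection is greedy against the parent, a DES pass can never worsen any individual's fitness, which is what lets it safely guide the final elimination process of ISO.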
3.4. ISO
By introducing the above three strategies, we propose the ISO. The pseudo code of ISO is given in Algorithm 2.
The computational time complexity of SO is jointly determined by the population size N, dimension Dim, and maximum number of iterations T, so it is O(N×Dim×T). As for ISO, we first analyze the time complexity of the three strategies. The MOBL mechanism requires O(2×Dim) time. Since the NEPD model applies different evolutionary operations to the first and second halves of the population, its time complexity is O(N×Dim). The computational complexity of DES is also O(N×Dim). Therefore, the time complexity of ISO remains O(N×Dim×T), and ISO requires no increase in complexity compared with SO.
4. Experiments
In this section, we verify the feasibility and effectiveness of ISO through four experiments: experiments on the classical benchmark functions, experiments on CEC2022, an ablation experiment, and an SVM parameter optimization experiment. ISO is compared with 12 other algorithms, including SO [33], GWO [22], GJO [26], HHO [25], NGO [27], WOA [24], ESO [43], AGWO [28], HGJO [30], HHOBM [29], MENGO [31], and NHWOA [32]. The first six original algorithms are selected because they have strong optimization capabilities and are widely used; the last six are their respective improved versions. For a fair comparison, all algorithms use the parameter values reported in their respective papers. The best results of the test experiments are shown in bold.
All experiments are conducted on a laptop with an Intel Core i5-6200U 2.4 GHz CPU and 8 GB RAM, using MATLAB R2021b under the Windows 10 operating system.
4.1. Function test sets and UCI datasets
Table 1 shows the information of the 23 classical benchmark functions. According to their characteristics and information, we divide them into unimodal test functions (F1–F7), multimodal test functions (F8–F13), and multimodal test functions with fixed dimension (F14–F23).
Table 2 lists the basic information of CEC2022. According to the characteristics and information of the CEC2022, we divide them into unimodal test function (F1), multimodal test functions (F2–F5), hybrid test functions (F6–F8), and composition test functions (F9–F12).
We select five datasets from UCI machine learning repository. Table 3 shows the information of datasets.
4.2. SVM optimization process
The flow chart of the SVM optimization process is shown in Figure 1. Taking ISO and the Iris dataset as an example, 50% of the data samples from the three classes (Iris setosa, Iris versicolour, and Iris virginica) are drawn as the training set, and the remaining samples are used as the test set. Next, the data in the training and test sets are normalized to the interval [0, 1] to facilitate SVM model training. Then, the best combination of SVM parameters is selected by ISO, and the optimal parameters are used to train and test the SVM model.
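The fitness evaluation inside this process might look like the sketch below, assuming scikit-learn is available; the (C, σ) values shown are arbitrary placeholders rather than the optimized ones, and sklearn's `gamma` must be converted from σ:

```python
# Sketch of the per-candidate fitness used when optimizing (C, sigma) for an RBF-SVM.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def svm_fitness(C, sigma, X_train, y_train, X_test, y_test):
    """Test-set accuracy of an RBF-SVM for one (C, sigma) candidate.
    sklearn parameterizes the kernel by gamma = 1 / (2 * sigma**2)."""
    model = SVC(C=C, gamma=1.0 / (2.0 * sigma ** 2))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)

# 50/50 split and [0, 1] normalization, as described above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
scaler = MinMaxScaler().fit(X_train)          # fit the scaler on training data only
acc = svm_fitness(10.0, 0.5,
                  scaler.transform(X_train), y_train,
                  scaler.transform(X_test), y_test)
```

In the full pipeline, ISO would call `svm_fitness` once per snake per iteration and keep the (C, σ) pair with the highest accuracy.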
4.3. Results and analysis
4.3.1. Experiment on classical benchmark function
To exclude the influence of other factors, all algorithms use uniform common parameter settings. We uniformly initialize the population, setting the population size N=50 and the maximum number of iterations T=500. Each algorithm is run 30 times individually, and the best result of each run is recorded. We calculate the average (Ave) and standard deviation (Std) of these optimal values as the measure criteria. Eq (22) gives the definitions.
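As a minimal sketch, the two criteria over the 30 recorded best values can be computed as below; whether the paper uses the sample or population standard deviation is not stated, so the sample form (ddof=1) is shown as an assumption:

```python
import numpy as np

def ave_std(best_values):
    """Average (Ave) and sample standard deviation (Std) of the recorded best values."""
    v = np.asarray(best_values, dtype=float)
    return v.mean(), v.std(ddof=1)
```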
The comparison results are shown in Table 4, where "−/=/+" indicates that the algorithm performs worse than, equal to, or better than ISO, respectively. Meanwhile, we use statistical methods such as the Wilcoxon rank sum test to evaluate the rank of the 13 algorithms.
We first discuss the performances of ISO, ESO, and SO. From Table 4, we can see that on the unimodal test functions, both ISO and ESO obtain the best value on functions F1–F4. On functions F5–F7, ISO performs better than ESO and SO. On the multimodal test functions, SO performs the worst of the three algorithms. Although ISO and ESO obtain the same average value on function F8, ISO has a lower standard deviation. Both ISO and ESO achieve the same best values on functions F9–F11, while SO obtains worse values. On functions F12 and F13, ISO has a significant advantage in optimization results. On the multimodal test functions with fixed dimension, ISO has the best results on functions F14, F15, F18, and F20–F23 compared with ESO and SO. On functions F16, F17, F19, and F20, all three algorithms achieve the optimal mean but with different standard deviations: on functions F16 and F19, SO obtains the optimal value more stably than ISO and ESO, while on function F17 the three achieve identical results.
Compared with the other 10 algorithms, ISO is more robust on the unimodal test functions. It achieves the best optimization results on six functions, while ESO and MENGO achieve the best results on only four functions each, and HHOBM achieves the best result on function F5. On the multimodal test functions, the optimization ability of ISO is more significant: it achieves the best results on five functions. On function F8, although the standard deviation of ISO is smaller than that of ESO, HHOBM is better suited to this function. On functions F9–F11, half or more of the algorithms can find the optimum, indicating that these three functions are easier to optimize. On the multimodal test functions with fixed dimension, ISO takes a greater advantage because it achieves the best results on seven functions, followed by NGO. On functions F14 and F20, ISO and NGO achieve the best results. Almost all algorithms can find the best values on functions F16–F19, which indicates that the optimization search on these four benchmark functions is extremely simple. On functions F21–F23, NGO still possesses a strong capability to find the best value, as do ISO and ESO.
Figure 2 plots the convergence curves of the 13 algorithms on the 23 classical benchmark functions. From Figure 2, the convergence rate of ISO is clearly better than those of ESO and SO. Only on function F14 does ISO fail to converge quickly; it is trapped in a local optimum for a longer period, but eventually it is able to find the optimal value.
For the other comparison algorithms, MENGO converges nearly as fast as ISO on the unimodal test functions. In particular, on functions F1–F4, MENGO converges slightly faster than ISO. However, on function F5, MENGO falls into a local optimum at a later stage and fails to reach a value near the optimum. HHOBM, on the contrary, finds a better optimum on function F5 and maintains a higher convergence rate. On functions F6 and F7, ISO is more competitive. On the multimodal test functions, most of the comparison algorithms converge well; only SO lags a little behind. On function F8, HHOBM converges better in the early stage. Although HHO also performs better than ISO at first, it falls into a local optimum. On the other five functions, the slow convergence of SO becomes apparent. On the multimodal test functions with fixed dimension, ISO has the fastest convergence rate, except on function F14, where it falls into a local optimum and NGO converges fastest.
4.3.2. Experiment on CEC2022
CEC2022 is a newer test set for evaluating swarm intelligence algorithms. In this experiment, we again set the population size (N=50) and the maximum number of iterations (T=500), run each algorithm 30 times individually, and record the corresponding optimal values. Tables 5 and 6 show the experimental results of the 13 algorithms on these functions with different dimensions, where "−/=/+" indicates that the algorithm performs worse than, equal to, or better than ISO, respectively. Likewise, we use the Wilcoxon rank sum test to evaluate the rank of these algorithms.
According to Table 5, ISO ranks second among all algorithms with an average rank of 2.5, while NGO ranks first with an average rank of 2.33. The main reason ISO lags behind NGO may be that ISO ranks 10th on function F4, far behind the other algorithms due to its poor optimization on this function. The second reason is that ISO ranks fourth on functions F2 and F10. Nevertheless, ISO achieves the best results on the other nine functions.
Figure 3 plots the convergence curves of the 13 algorithms on CEC2022 when the dimension is set to 10. From Figure 3, we can see that ISO converges at a faster rate than other algorithms, except for the poor performance on functions F4 and F10. In particular, ESO has the best optimization capability on function F4, and NGO achieves the best results on function F10. On function F2, although ISO converges quickly in the early stage, the optimization ability decreases in the late stage.
From Table 6, when the dimension of CEC2022 functions is set to 20, NGO shows weakness and ISO gradually gets the upper hand. Generally, ISO obtains the first rank with an average rank of 1.83, followed by ESO, and NGO ranks third. Among the 12 CEC2022 functions, ISO ranks fifth on functions F4 and F5, while ranking in the top two on the other 10 functions. It can be seen that as the dimension increases, the performance of ISO improves.
Figure 4 plots the convergence curves of the 13 algorithms on CEC2022, when the dimension is set to 20. As can be seen from Figure 4, ISO falls into a local optimum on functions F4 and F5, and SO converges faster than ISO on both functions. On function F7, ESO performs the best and ISO is slightly weaker in the later iterations. All in all, ISO keeps a faster rate to seek out the optimal value.
4.3.3. Ablation experiment
The ablation experiment is performed on the 23 classical benchmark functions. There are some differences between the results of this experiment and those in Subsection 4.3.1 because here we mainly investigate the influence of the three strategies and their combinations on SO. Table 7 lists all possible SO variants under the three strategies (seven in total), where 1 or 0 indicates whether a strategy is included or not. For example, MSO only has the mirror opposition-based learning mechanism, while NDSO contains the novel evolutionary population dynamics model and the differential evolution strategy. Table 8 shows the experimental results, where "−/=/+" denotes that the improved algorithm performs worse than, equal to, or better than SO, respectively. The convergence curves are given in Figure 5.
From Table 8, it can be found that ISO has the best optimization capability among the seven SO variants, and MDSO and MNSO rank the second and third, respectively. This indicates that the mirror opposition-based learning mechanism is the most effective factor in improving SO, and the other two strategies can also play a supportive role.
It is worth noting that NSO and DSO, which contain only one strategy each, rank lower than SO. This shows that improving the original algorithm does not always yield better results: a strategy may be very effective for one problem or function but perform poorly on others. For example, DSO achieves the best optimization results, ranking first on function F18, but it ranks last on function F21.
On the unimodal test functions, the MSO, MNSO, MDSO, and ISO algorithms, which contain MOBL, achieve higher convergence accuracy with fewer iterations. While the optimization accuracy of DSO and NDSO is reduced, they converge a little faster than SO. The result is slightly worse for NSO, which converges neither as fast nor as accurately as SO. On the multimodal test functions, the situation is largely unchanged. The only difference is that on function F8, the convergence profiles of the various SO variants become easily distinguishable; the early-stage convergence speed of MSO, NSO, and MNSO is not effectively improved. On the multimodal test functions with fixed dimension, the convergence behavior of the eight algorithms is more complex than on the previous 13 test functions. On function F14, ISO, MSO, MNSO, and MDSO are slower than SO in the early stages and all fall into local optima, but ISO and MNSO keep converging and approach the optimal value in the later iterations. Meanwhile, although DSO converges faster on function F14, it weakens in the late search process and is caught up by the other SO variants. On functions F15–F20, ISO and MDSO still hold the lead, followed by DSO and NDSO. MSO and MNSO are more unstable; for example, they perform better than SO on function F15 but worse on function F17. NSO is the worst, barely beating SO. On functions F21–F23, MNSO takes first place on function F21, while ISO and MDSO rank first on the other two functions.
From the convergence curves in Figure 5, we notice that the convergence processes of ISO and MDSO have no significant difference on most of the classical test set, and they both possess excellent optimization capabilities. Only on functions F5, F14, and F21, MDSO performs worse than ISO. Other SO variants have a mediocre convergence effect, and may even converge worse than SO on some functions. Therefore, we conclude that ISO exhibits better convergence behavior.
4.3.4. SVM parameter optimization experiment
SVM integrates several standard machine learning techniques to overcome problems such as local minima and the curse of dimensionality, and it provides better generalization capabilities than traditional neural network learning algorithms. However, the parameter selection problem that affects its performance has not been solved.
The optimal selection of the SVM parameters is a key step that directly affects the performance of SVM, which has become a major research field in the area of kernel methods. For instance, Kalita et al. [45] proposed a dynamic framework based on moth flame optimization (MFO) and knowledge-based-search (KBS) to optimize the penalty factor C and Gaussian kernel parameter σ. Their experiments showed that KBS helps in controlling the exponential growth of time complexity when MFO is used to optimize these two parameters. Huang et al. [46] introduced an improved black widow optimization (IBWO) algorithm and constructed the IBWO-SVM to select better parameters. From these two articles, it can be seen that optimizing model parameters by swarm intelligence algorithms can effectively improve the prediction accuracy. Literature [47,48,49] also utilized this type of method and achieved good results.
The parameter setting of the SVM classification experiment is shown in Table 9. The classification accuracy in the test set is used as the measure criterion. In order to reduce the random error of the experiment results, we conduct 10 rounds of experiments and the average values are shown in Table 10.
Analyzing the data in Table 10, it can be seen that the classification accuracy based on ISO is higher than that based on SO on all five datasets. Compared with the other algorithms, ISO achieves the same classification result as HHO on the Iris dataset, and for the remaining four datasets, ISO achieves the highest accuracy. It can be concluded that ISO is more effective and stronger than SO, and when applied to the optimization of SVM parameters, it achieves significant results.
4.4. Further discussions
In the function comparison experiments with the above 12 algorithms, we can see that ISO has better optimization performance and convergence speed. However, it has some weaknesses compared with other evolutionary algorithms. First, ISO may not be as effective as some evolutionary algorithms in global search, causing the search to hover around a local optimum. Second, because the performance of ISO is sensitive to parameter settings, different problems may require different parameter configurations, which adds a certain degree of complexity.
When dealing with nonlinear optimization problems, the advantage of ISO lies in its flexibility and adaptability (i.e., robustness), which allow it to adapt to complex changes. Non-convex optimization problems usually involve complex objective functions that may contain multiple local minima and intricate structures [50]; however, the three improvement strategies of ISO greatly improve the algorithm's ability to jump out of local optima. In practice, noise is a common problem; it involves uncertain variables, both multivariate and multi-objective. The key challenge for ISO in optimization problems involving uncertain variables is how to deal with these uncertainties, and as it stands, ISO has no clear superiority here. We have two ideas for addressing this problem. The first is the decomposition mechanism proposed by Deng et al. [51], which can resolve such uncertain variables separately to further improve optimization performance. The second is to combine ISO with deep learning to create a new hybrid model; literature [52,53,54] describes the advantages of deep learning-based approaches. ISO has strong potential in multivariate and multi-objective optimization problems: in the CEC2022 function test experiment, ISO ranks second among all 13 algorithms with an average rank of 2.5 when the dimension is set to 10, whereas when the dimension is increased to 20, ISO takes first place with an average rank of 1.83. Overall, ISO has certain advantages in dealing with complex optimization problems, although it also faces many challenges.
5. Conclusions
In this article, ISO is presented with three improvement strategies and combined with SVM for the first time to provide an effective solution to the SVM parameter optimization problem. The experimental results show that the proposed ISO is effective, and the ISO-optimized SVM parameters improve the classification accuracy by 0.2–0.5% over the SO-optimized SVM parameters. However, the proposed method still has room for improvement. For example, although ISO and SO have the same time complexity, ISO may take more time in practice; this calls for attention to whether mechanisms with high time overhead in the algorithm can be replaced with simpler, more efficient strategies. Overall, in future research, we will focus mainly on improving ISO and applying it to more complex problems in a wider range of fields.
Use of AI tools declaration
The author declares he has not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This paper is supported in part by the Natural Science Foundation of Jiangxi Province of China (No. 20242BAB26024) and Program of Academic Degree and Graduate Education and Teaching Reform in Jiangxi Province of China (No. JXYJG-2022-172).
Conflict of interest
The authors declare there is no conflict of interest.