Research article

A selective evolutionary heterogeneous ensemble algorithm for classifying imbalanced data

    Learning from imbalanced data is challenging because most conventional supervised learning algorithms tend to favor the majority class, which has significantly more instances than the other classes. Ensemble learning is a robust approach to the imbalanced classification problem. To construct a successful ensemble classifier, the diversity of the base classifiers deserves specific attention. In this paper, we present a novel ensemble learning algorithm called Selective Evolutionary Heterogeneous Ensemble (SEHE), which produces diversity in two ways: 1) adopting multiple different sampling strategies to generate diverse training subsets and 2) training multiple heterogeneous base classifiers to construct an ensemble. In addition, because some low-quality base classifiers may degrade the performance of an ensemble and it is difficult to estimate the potential of each base classifier directly, we draw on the idea of a selective ensemble to adaptively select the base classifiers used in the ensemble. In particular, an evolutionary algorithm carries out this adaptive selection in SEHE. Experimental results on 42 imbalanced data sets show that SEHE is significantly superior to several state-of-the-art ensemble learning algorithms that are specifically designed for the class imbalance problem.

    Citation: Xiaomeng An, Sen Xu. A selective evolutionary heterogeneous ensemble algorithm for classifying imbalanced data[J]. Electronic Research Archive, 2023, 31(5): 2733-2757. doi: 10.3934/era.2023138

    Related Papers:

    [1] Yiyuan Qian, Haiming Song, Xiaoshen Wang, Kai Zhang . Primal-dual active-set method for solving the unilateral pricing problem of American better-of options on two assets. Electronic Research Archive, 2022, 30(1): 90-115. doi: 10.3934/era.2022005
    [2] Raúl M. Falcón, Víctor Álvarez, José Andrés Armario, María Dolores Frau, Félix Gudiel, María Belén Güemes . A computational approach to analyze the Hadamard quasigroup product. Electronic Research Archive, 2023, 31(6): 3245-3263. doi: 10.3934/era.2023164
    [3] Xianfei Hui, Baiqing Sun, Indranil SenGupta, Yan Zhou, Hui Jiang . Stochastic volatility modeling of high-frequency CSI 300 index and dynamic jump prediction driven by machine learning. Electronic Research Archive, 2023, 31(3): 1365-1386. doi: 10.3934/era.2023070
    [4] Zhaoyong Huang . On the C-flatness and injectivity of character modules. Electronic Research Archive, 2022, 30(8): 2899-2910. doi: 10.3934/era.2022147
    [5] Abdelkader Lamamri, Mohammed Hachama . Approximate solution of the shortest path problem with resource constraints and applications to vehicle routing problems. Electronic Research Archive, 2023, 31(2): 615-632. doi: 10.3934/era.2023030
    [6] Kaiyu Zhang . Sobolev estimates and inverse Hölder estimates on a class of non-divergence variation-inequality problem arising in American option pricing. Electronic Research Archive, 2024, 32(11): 5975-5987. doi: 10.3934/era.2024277
    [7] Xingyan Fei, Yanchuang Hou, Yuting Ding . Modeling and analysis of carbon emission-absorption model associated with urbanization process of China. Electronic Research Archive, 2023, 31(2): 985-1003. doi: 10.3934/era.2023049
    [8] Jicheng Li, Beibei Liu, Hao-Tian Wu, Yongjian Hu, Chang-Tsun Li . Jointly learning and training: using style diversification to improve domain generalization for deepfake detection. Electronic Research Archive, 2024, 32(3): 1973-1997. doi: 10.3934/era.2024090
    [9] Mingtao Cui, Min Pan, Jie Wang, Pengjie Li . A parameterized level set method for structural topology optimization based on reaction diffusion equation and fuzzy PID control algorithm. Electronic Research Archive, 2022, 30(7): 2568-2599. doi: 10.3934/era.2022132
    [10] Wangwei Zhang, Hao Sun, Bin Zhou . TBRAFusion: Infrared and visible image fusion based on two-branch residual attention Transformer. Electronic Research Archive, 2025, 33(1): 158-180. doi: 10.3934/era.2025009
  • Learning from imbalanced data is a challenging task, as with this type of data, most conventional supervised learning algorithms tend to favor the majority class, which has significantly more instances than the other classes. Ensemble learning is a robust solution for addressing the imbalanced classification problem. To construct a successful ensemble classifier, the diversity of base classifiers should receive specific attention. In this paper, we present a novel ensemble learning algorithm called Selective Evolutionary Heterogeneous Ensemble (SEHE), which produces diversity by two ways, as follows: 1) adopting multiple different sampling strategies to generate diverse training subsets and 2) training multiple heterogeneous base classifiers to construct an ensemble. In addition, considering that some low-quality base classifiers may pull down the performance of an ensemble and that it is difficult to estimate the potential of each base classifier directly, we profit from the idea of a selective ensemble to adaptively select base classifiers for constructing an ensemble. In particular, an evolutionary algorithm is adopted to conduct the procedure of adaptive selection in SEHE. The experimental results on 42 imbalanced data sets show that the SEHE is significantly superior to some state-of-the-art ensemble learning algorithms which are specifically designed for addressing the class imbalance problem, indicating its effectiveness and superiority.



    In recent years, moving boundary problems for rarefied gas dynamics have been extensively investigated in connection with Micro-Electro-Mechanical Systems (MEMS), see [4,10,11,16,22,24,25,29,30,31]. In micro-scale geometries the mean free path is often of the order of, or larger than, the characteristic length of the geometry, which requires the solution of kinetic equations. These flows usually have low Mach numbers, so stochastic methods like DSMC are not the optimal choice, since statistical noise dominates the flow quantities. Moreover, when a rigid body moves, the gas domain changes in time and the flow is unsteady, so averages over long runs cannot be taken. Instead, one has to perform many independent runs in order to obtain smooth solutions. Although some attempts have been made to reduce the statistical noise, see, for example, [12], many works instead employ deterministic approaches for simplified models of the Boltzmann equation, like the Bhatnagar-Gross-Krook (BGK) model, see [10,22,30].

    In this paper we follow a deterministic approach to solve the BGK model and extend the semi-Lagrangian method suggested in [22] to two dimensions in physical space and three dimensions in velocity space. Since the rigid body moves in time, classical interpolation procedures near the rigid body become complicated and possibly inaccurate, because the rigid body intersects the cells in an arbitrary way. We note that a Cartesian cut cell method has been introduced in [11] to handle a moving object in a rarefied gas. A different technique has been used in [8], where the authors use ghost point methods in a finite difference framework to treat moving boundaries. We also refer to the treatment of interfaces, for example, for multiphase flow problems in the framework of Lattice-Boltzmann schemes, see [18,23] for a review and further references.

    We use an immersed boundary type approach [21] to simulate the fluid-rigid body interactions. However, in contrast to the original immersed boundary method, which dealt with an incompressible fluid, here we treat the interaction of a rarefied gas with a rigid body, see [2,10] for immersed boundary approaches applied to kinetic equations. This means that we use a kinetic description of the gas, which is defined by a distribution function and therefore has many more degrees of freedom than a (compressible or incompressible) fluid. The interaction with the boundary is based on mass conservation and the exchange of momentum and energy. Regarding the energy exchange, we assume that the heat capacity of the solid is much larger than that of the gas, so that the temperature of the solid object is taken to be constant in time. The approach combines grid-based and mesh-free methods: the information about the distribution function is stored on an arbitrary fixed grid on a given domain. For this, the computational domain is discretized by a set of fixed grid points, which do not need to be regularly distributed. Moreover, the boundaries are also approximated by a discrete set of boundary points, on which we apply the boundary conditions. If the boundaries move, the boundary points move with them.

    In the present approach the rigid body overlaps the gas grid points. The gas grid points that are overlapped by the rigid body are excluded from the computation and are called non-active points. The non-overlapped points are called active points. All boundary points are active points (refer to Figure 1). Therefore, the distribution of the active grid points is not uniform in the vicinity of a rigid boundary, even if we use a regular lattice for the gas grid points. Moreover, it varies over time as the rigid body moves. A moving least squares approach (referred to below as MLS) is a particularly suitable interpolation procedure in such a situation, since it does not require any special treatment. One only has to determine the overlapped and non-overlapped points and to update the kinetic distribution function from the active points with the help of the MLS interpolation procedure. This process continues until the end of the simulation.

    Figure 1.  Rigid body $S$ with boundary $\Gamma_S$ immersed in the gas. Black and gray circles are active and non-active interior grid points, respectively, and red circles are boundary grid points, which are always active.

    We finally note that in the present paper we restrict ourselves to a first-order algorithm. Higher-order methods are under development.

    The paper is organised as follows. In Section 2 we present the BGK model for the Boltzmann equation and the Newton-Euler equations for rigid body motion. In Section 3 we present the semi-Lagrangian scheme for the BGK model. In the same section we also present the moving least squares approximation, the activation and deactivation of grid points, the boundary conditions and the coupling algorithm for the rigid body motion and the rarefied gas. In Section 4 we present numerical results in one and two space dimensions. Finally, in Section 5 some conclusions and an outlook are presented.

    We consider the BGK-model for rarefied gas dynamics and the Newton-Euler equations for the motion of the rigid body inside the gas.

    Consider first the BGK equation for the distribution function of the gas molecules, denoted by $f=f(t,x,v)$, with $t\ge 0$, $x\in\Omega\subset\mathbb{R}^d$ ($d=1,2$) and $v=(v_x,v_y,v_z)\in\mathbb{R}^3$. It is given by

    $$ \partial_t f + v\cdot\nabla_x f = \frac{1}{\tau}\left(M[f]-f\right) \tag{2.1} $$

    with initial value $f(0,x,v)=f_0(x,v)$ and boundary conditions discussed later. For the numerical examples we consider one- and two-dimensional spatial geometries and use suitable reduction procedures for the BGK equation, see [9,14].

    Here τ is the relaxation time and M[f] is the local Maxwellian given by

    $$ M[f]=\frac{\rho}{(2\pi RT)^{3/2}}\exp\left(-\frac{|v-U|^2}{2RT}\right), \tag{2.2} $$

    The quantities $\rho$, $U$ and $T$ are the macroscopic density, mean velocity and temperature, and $R$ is the specific gas constant.

    These macroscopic quantities are computed from $f(t,x,v)$ in the following way. Let $\phi(v)=\left(1,v,\frac{|v|^2}{2}\right)$ be the collision invariants. The moments are defined by

    $$ (\rho,\rho U,E)=\int_{\mathbb{R}^3}\phi(v)\,f(t,x,v)\,dv. \tag{2.3} $$

    E is the total energy density and it is related to the temperature through the internal energy

    $$ e(t,x)=\frac{3}{2}RT,\qquad \rho e=E-\frac{1}{2}\rho|U|^2. $$
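    The relations (2.2) and (2.3) can be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates a Maxwellian on a uniform velocity grid and recovers density, mean velocity and temperature by simple summation. The function names `maxwellian` and `moments` are hypothetical, and the grid parameters ($v_{\max}=1200$, $N_v=30$, $R=208$) are borrowed from Example 1 of the numerical section purely for illustration.

```python
import numpy as np

def maxwellian(rho, U, T, R, vx, vy, vz):
    """Local Maxwellian (2.2) evaluated on a velocity grid."""
    c2 = (vx - U[0])**2 + (vy - U[1])**2 + (vz - U[2])**2
    return rho / (2.0 * np.pi * R * T)**1.5 * np.exp(-c2 / (2.0 * R * T))

def moments(f, vx, vy, vz, dv, R):
    """Discrete version of (2.3): returns density, mean velocity, temperature."""
    w = dv**3                                   # volume of one velocity cell
    rho = f.sum() * w
    U = np.array([(vx * f).sum(), (vy * f).sum(), (vz * f).sum()]) * w / rho
    E = 0.5 * ((vx**2 + vy**2 + vz**2) * f).sum() * w
    e = (E - 0.5 * rho * U @ U) / rho           # internal energy, rho*e = E - rho|U|^2/2
    T = 2.0 * e / (3.0 * R)                     # from e = 3/2 R T
    return rho, U, T

# Illustrative parameters (assumed, taken from Example 1)
R, v_max, n_v = 208.0, 1200.0, 30
v = np.linspace(-v_max, v_max, n_v + 1)
dv = v[1] - v[0]
vx, vy, vz = np.meshgrid(v, v, v, indexing="ij")

M = maxwellian(1.8e-4, np.array([10.0, 0.0, 0.0]), 270.0, R, vx, vy, vz)
rho, U, T = moments(M, vx, vy, vz, dv, R)
print(rho, U, T)
```

    On this grid the recovered moments agree with the prescribed ones up to the small truncation error caused by cutting the velocity domain at $\pm v_{\max}$.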

    The mechanical properties of the rigid body are uniquely defined by its mass and its moment of inertia, and its displacement is determined by the position of the center of mass and its orientation. The dynamics of the rigid body is determined by the Newton-Euler equations

    $$ M\frac{dV}{dt}=F,\qquad [I]\frac{d\omega}{dt}+\omega\times([I]\omega)=T, \tag{2.4} $$

    where $M$ is the total mass of the body $S$ with center of mass $X_c$, $V$ is the translational velocity of the center of mass, and $\omega$ denotes the angular velocity vector describing the rotation around an axis passing through $X_c$. $F$ is the translational force, $T$ is the torque and $[I]$ is the barycentric moment of inertia.

    The position of the center of mass of the rigid body is obtained from

    $$ \frac{dX_c}{dt}=V. \tag{2.5} $$

    Finally, the velocity of a point $x$ of the rigid body is given by $U_w=V+\omega\times(x-X_c)$, $x\in S$.

    The force $F$ and torque $T$ are computed according to

    $$ F=\int_{\Gamma_S}(-\varphi n)\,dA,\qquad T=\int_{\Gamma_S}(x-X_c)\times(-\varphi n)\,dA, \tag{2.6} $$

    where $n$ is the unit boundary normal vector of the rigid body pointing towards the gas domain and $\varphi$ is the pressure tensor given by

    $$ \varphi=\int_{\mathbb{R}^3}(v-U_w)\otimes(v-U_w)\,f(t,x,v)\,dv. \tag{2.7} $$

    $\Gamma_S$ denotes the boundary of $S$, see Figure 1.

    In this paper we restrict ourselves to one- and two-dimensional problems. In 1D the rigid body does not rotate, so the moment of inertia plays no role. In 2D the center of mass is determined by two coordinates, $X_c=(X_c,Y_c)$, the only nonzero component of the angular velocity vector is the out-of-plane $z$-component $\omega$, and the barycentric moment of inertia is a scalar which can be computed from the mass distribution of the object:

    $$ I=\int_S |x-X_c|^2\,\rho(x)\,dx. $$

    In all our tests we assume the mass is uniformly distributed in the body of the object (which is a surface in 2D), therefore the moment of inertia depends only on the mass and the geometry of the object.

    We describe a semi-Lagrangian scheme with least-squares interpolation for the BGK equation with a three-dimensional velocity space and a two-dimensional physical space.

    We consider a constant time step $\Delta t$, a uniform mesh in velocity space with mesh size $\Delta v$ and an, in general, non-uniform mesh with average spacing $\Delta x$ in physical space. The time discretization is denoted by $t^n=n\Delta t$, $n=0,1,\ldots$. The space discretization is obtained by filling the domain with (regular or irregular) grid points $x_i=(x_i,y_i)\in\Omega\subset\mathbb{R}^2$, $i=1,\ldots,N_x$, where $N_x$ is the total number of grid points in physical space. The $N_x$ grid points include interior as well as boundary points. The interior grid points are fixed and cover the whole computational domain, including the region occupied by the moving object. In contrast, the boundary points are attached to the boundaries: they move with the boundary of the moving object and stay fixed on the fixed boundaries. The interior grid points are distinguished according to whether they are overlapped by the moving body or not. In the first case they are called non-active points, otherwise active points. See Figure 1 for an illustration. Moreover, we consider an even number $N_v$ of velocity grid points in each direction and a uniform velocity grid size $\Delta v$ in all directions. We assume that the distribution function is negligible for $|v_{x,y,z}|>v_{\max}=N_v\Delta v/2$. The uniform velocity grids are denoted by $v_j$, $v_k$ and $v_l$ in the $x$, $y$ and $z$ directions, respectively, where $v_j=-v_{\max}+(j-1)\Delta v$, $j=1,\ldots,N_v+1$. Similarly, we define $v_k$ and $v_l$ for $k,l=1,\ldots,N_v+1$.

    Let $f_{jkl}=f_{jkl}(t,x)=f(t,x,v_j,v_k,v_l)$ and $f_{ijkl}=f_{ijkl}(t)=f(t,x_i,v_j,v_k,v_l)$. The evolution of $f_{jkl}(t,x)$ along the characteristics between time steps $n$ and $n+1$, i.e., for $t\in[t^n,t^{n+1}]$, is computed from the Lagrangian form of the discrete-velocity BGK model

    $$ \frac{df_{jkl}}{dt}=\frac{1}{\tau}\left(M_{jkl}[f]-f_{jkl}\right) \tag{3.1} $$
    $$ \frac{dx}{dt}=v_j, \tag{3.2} $$
    $$ \frac{dy}{dt}=v_k, \tag{3.3} $$

    with final conditions

    $$ (x,y)(t^n)=(\tilde x,\tilde y),\qquad f_{jkl}(t^n)=f^n_{jkl}(\tilde x,\tilde y)=\tilde f^n_{jkl} \tag{3.4} $$

    together with appropriate boundary conditions for $f_{jkl}$ at boundary points.

    Here $M_{jkl}[f]$ is still the local Maxwellian having the moments of $f_{jkl}$.

    We consider the implicit Euler scheme for the above equations, which reads

    $$ f^{n+1}_{ijkl}=\tilde f^{n}_{ijkl}+\frac{\Delta t}{\tau}\left(M^{n+1}_{ijkl}[f]-f^{n+1}_{ijkl}\right), \tag{3.5} $$

    and

    $$ x^{n+1}_i=\tilde x+v_j\Delta t,\qquad y^{n+1}_i=\tilde y+v_k\Delta t \tag{3.6} $$

    for $j,k,l=1,\ldots,N_v+1$ and all active interior points.

    The semi-Lagrangian method now consists of three steps:

    (ⅰ) First, we determine $\tilde x$ and $\tilde y$ from the backward characteristics $\tilde x=x^{n+1}_i-v_j\Delta t$, $\tilde y=y^{n+1}_i-v_k\Delta t$. Then we reconstruct the function $\tilde f^n_{jkl}$ at $(\tilde x,\tilde y)$. At $t^n$ all values $f^n_{ijkl}$ are known at all active points and boundary points, and the value $\tilde f^n_{ijkl}$ at $(\tilde x,\tilde y)$ has to be interpolated from them. Any interpolation formula can be used. In this paper we use a least-squares approximation for the reconstruction, which is presented in the next section.

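    (ⅱ) In the second step we obtain $M^{n+1}_{ijkl}$. Since $M^{n+1}_{i}$ and $f^{n+1}_{i}$ have the same conservative moments, the relaxation term drops out when we multiply the discrete equation (3.5) by the collision invariants $\phi(v)$ and sum over all velocities. We get

    $$ \rho^{n+1}_i=\sum_{j,k,l=1}^{N_v+1}\tilde f^{n}_{ijkl}\,\Delta v^3,\qquad (\rho_iU_i)^{n+1}=\sum_{j,k,l=1}^{N_v+1}(v_j,v_k,v_l)\,\tilde f^{n}_{ijkl}\,\Delta v^3, \tag{3.7} $$
    $$ E^{n+1}_i=\frac{1}{2}\sum_{j,k,l=1}^{N_v+1}\left(v_j^2+v_k^2+v_l^2\right)\tilde f^{n}_{ijkl}\,\Delta v^3. \tag{3.8} $$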

    Once the moments are known, we can compute the Maxwellian at the new time.

    (ⅲ) Finally, we update the distribution function by

    $$ f^{n+1}_{ijkl}=\frac{\tau\,\tilde f^{n}_{ijkl}+\Delta t\,M^{n+1}_{ijkl}}{\tau+\Delta t}. \tag{3.9} $$
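    Steps (3.7) to (3.9) act on one spatial grid point at a time once the interpolated values are available. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `bgk_relaxation_step` is hypothetical, the interpolated values $\tilde f^n_{ijkl}$ at one grid point are passed in as a three-dimensional array `f_tilde`, and the test data (a symmetric sum of two shifted Gaussians) is made up.

```python
import numpy as np

def bgk_relaxation_step(f_tilde, v, R, tau, dt):
    """Moments (3.7)-(3.8), new Maxwellian, and implicit update (3.9) at one grid point."""
    dv = v[1] - v[0]
    vx, vy, vz = np.meshgrid(v, v, v, indexing="ij")
    w = dv**3
    rho = f_tilde.sum() * w
    U = np.array([(c * f_tilde).sum() for c in (vx, vy, vz)]) * w / rho
    E = 0.5 * ((vx**2 + vy**2 + vz**2) * f_tilde).sum() * w
    T = 2.0 * (E / rho - 0.5 * U @ U) / (3.0 * R)
    c2 = (vx - U[0])**2 + (vy - U[1])**2 + (vz - U[2])**2
    M_new = rho / (2.0 * np.pi * R * T)**1.5 * np.exp(-c2 / (2.0 * R * T))
    f_new = (tau * f_tilde + dt * M_new) / (tau + dt)   # Eq. (3.9)
    return f_new, M_new

# Made-up non-equilibrium data: two Gaussians shifted by -100 and +100 in v_x
R, tau, dt = 208.0, 2.0e-6, 1.0e-6
v = np.linspace(-1200.0, 1200.0, 31)
vx, vy, vz = np.meshgrid(v, v, v, indexing="ij")
norm = 0.5e-4 / (2.0 * np.pi * R * 270.0)**1.5
f_tilde = sum(norm * np.exp(-((vx - u)**2 + vy**2 + vz**2) / (2.0 * R * 270.0))
              for u in (-100.0, 100.0))

f_new, M_new = bgk_relaxation_step(f_tilde, v, R, tau, dt)
dv3 = (v[1] - v[0])**3
print(f_tilde.sum() * dv3, f_new.sum() * dv3)   # mass before and after the update
```

    Because the Maxwellian is evaluated from the continuous formula, the discrete mass is preserved only up to the truncation error of the velocity grid, which is small for the parameters used here.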

    We solve the Newton-Euler equations by the explicit Euler method in time. The time step is the same as the time step of the BGK model. This means in particular, that the time step in the BGK model is chosen according to the stability requirements for the explicit Euler scheme for the Newton-Euler equations.
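    For a planar body the gyroscopic term $\omega\times([I]\omega)$ in (2.4) vanishes, so one explicit Euler step reduces to three scalar or vector updates. The following is a minimal sketch, assuming a 2D body with scalar moment of inertia; the function names are hypothetical, and `boundary_velocity` uses the standard rigid-body kinematic relation $U_w=V+\omega\times(x-X_c)$ written out for a rotation about the $z$-axis.

```python
import numpy as np

def rigid_body_euler_step(Xc, V, omega, F, torque, mass, inertia, dt):
    """One explicit Euler step of (2.4)-(2.5) for a planar rigid body."""
    Xc_new = Xc + dt * V                       # dXc/dt = V, using the old velocity
    V_new = V + dt * F / mass                  # M dV/dt = F
    omega_new = omega + dt * torque / inertia  # I d(omega)/dt = T in 2D
    return Xc_new, V_new, omega_new

def boundary_velocity(x, Xc, V, omega):
    """Velocity of a body point x: V + omega e_z x (x - Xc)."""
    r = x - Xc
    return V + omega * np.array([-r[1], r[0]])

# Made-up state and loads, for illustration only
Xc, V, omega = np.array([0.0, 0.0]), np.array([1.0, 0.0]), 0.5
Xc1, V1, omega1 = rigid_body_euler_step(
    Xc, V, omega, F=np.array([2.0, 0.0]), torque=1.0, mass=4.0, inertia=2.0, dt=0.1
)
u_point = boundary_velocity(np.array([1.0, 0.0]), np.zeros(2), np.zeros(2), 1.0)
print(Xc1, V1, omega1, u_point)
```

    The boundary points would then be displaced with `boundary_velocity` over the same time step, which is how the wall velocity enters the boundary conditions of the gas solver.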

    In this subsection we describe the least-squares approximation of a function in a two-dimensional computational domain $\Omega\cup\Gamma\subset\mathbb{R}^2$, where $\Gamma$ is the boundary. As described above, we distinguish between the grid points on the boundary $\Gamma$ and the interior grid points in $\Omega$. The interior and boundary grid points are distinguished by assigning different flags, see Figure 1. Consider first the interior grid points $(x_i,y_i)$ in $\Omega$ with average spacing $\Delta x$. They are chosen at the beginning of the calculation and are not moved. Those grid points overlapped by the moving body are non-active, the others are active.

    Let $f(x,y)$ be a scalar function and $f_i$ its values at $(x_i,y_i)$. We consider the problem of approximating the value $\tilde f=f(\tilde x,\tilde y)$ at $(\tilde x,\tilde y)$ from the values at the neighboring points. We introduce a weight function such that nearby points have more influence and distant points less; any function of the distance that decays as the distance goes to infinity can serve as a weight function. In this paper we use a Gaussian function, but other choices are possible (see for example [26,33] for other classes of weight functions). In order to limit the number of neighboring points we consider only the neighbors inside a circle of radius $h$ with center $(\tilde x,\tilde y)$. We choose the radius $h$ as a multiple of the average spacing $\Delta x$, such that we have at least a minimum number of neighbors for the least-squares approximation, even next to the boundary. For a regular grid and far from the boundary, one might consider using smaller values of $h$. Such an adaptive choice of $h$ has been considered, for example, in [17]. For the sake of simplicity, we have chosen a constant $h=3.1\Delta x$ in this paper, which gives a sufficiently large number of neighbors even near concave boundaries (as is the case in Example 6 in the last section). The use of adaptive values of $h$ is left to future investigation. We sort the neighboring points from $1$ to $m$ with respect to distance, such that the neighbor with index $1$ is the nearest neighbor of $(\tilde x,\tilde y)$. With a slight abuse of notation, let $P(\tilde x,\tilde y;h)=\{(x_j,y_j),\ j=1,\ldots,m(h)\}$ denote the set of neighbor points of $(\tilde x,\tilde y)$ inside the disc of radius $h$. We note that the number $m$ of neighbors depends on $(\tilde x,\tilde y)$ and $h$. In all calculations we have considered the following truncated Gaussian weight function

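    $$ w(x_i-\tilde x,y_i-\tilde y;h)=\begin{cases}\exp\left(-\alpha\,\dfrac{(x_i-\tilde x)^2+(y_i-\tilde y)^2}{h^2}\right), & \text{if } \dfrac{\sqrt{(x_i-\tilde x)^2+(y_i-\tilde y)^2}}{h}\le 1,\\[2mm] 0, & \text{else,}\end{cases} $$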

    with $\alpha$ a user-defined positive constant, chosen here as $\alpha=6$, so that the influence of far neighbor grid points is negligible. This choice has been suggested by previous experience [17,28]. It would be interesting to investigate the optimal choice of the parameters, or even to adopt a different class of weight functions. This is left to future investigation.

    In order to approximate the function we consider the $m$ Taylor expansions of $f(x_j,y_j)$ around $(\tilde x,\tilde y)$

    $$ f(x_j,y_j)=f(\tilde x,\tilde y)+\partial_x f(\tilde x,\tilde y)\,(x_j-\tilde x)+\partial_y f(\tilde x,\tilde y)\,(y_j-\tilde y)+e_j, \tag{3.10} $$

    for $j=1,\ldots,m$, where $e_j$ is the error in the Taylor expansion. We first require that the expansion reproduces the value $f_1$ at the nearest point exactly, in other words $e_1=0$. The unknowns $\tilde f,\tilde f_x,\tilde f_y$ are computed by minimizing the errors $e_j$ for $j=2,\ldots,m$ subject to the constraint $e_1=0$. To solve this constrained least-squares problem, we use the constraint to rewrite the equations in the form

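    $$ \begin{aligned} f_2-f_1&=\tilde f_x(x_2-x_1)+\tilde f_y(y_2-y_1)+e_2\\ &\ \ \vdots\\ f_m-f_1&=\tilde f_x(x_m-x_1)+\tilde f_y(y_m-y_1)+e_m \end{aligned} \tag{3.11} $$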

    The system of equations can be written in matrix form as

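    $$ e=b-Ma, \tag{3.12} $$

    where $e=[e_2,\ldots,e_m]^T$, $a=[\tilde f_x,\tilde f_y]^T$, $b=[f_2-f_1,\ldots,f_m-f_1]^T$ and

    $$ M=\begin{pmatrix} x_2-x_1 & y_2-y_1\\ \vdots & \vdots\\ x_m-x_1 & y_m-y_1 \end{pmatrix}. $$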

    For $m>3$, this system of equations is over-determined for the two unknowns $[\tilde f_x,\tilde f_y]^T$. The unknowns $a$ are obtained from the weighted least squares method by minimizing the quadratic form

    $$ J=\sum_{j=2}^{m}w_je_j^2=(Ma-b)^TW(Ma-b), \tag{3.13} $$

    where $W=(w_j\delta_{jk})$, $j,k=2,\ldots,m$, is the diagonal matrix of weights. The minimization of $J$ yields

    $$ a=(M^TWM)^{-1}(M^TW)\,b. \tag{3.14} $$

    Now from Eq. (3.10) with $e_1=0$ for the closest point $(x_1,y_1)$ we can compute the value of $f$ at $(\tilde x,\tilde y)$ as

    $$ f(\tilde x,\tilde y)=f(x_1,y_1)-\tilde f_x(x_1-\tilde x)-\tilde f_y(y_1-\tilde y) \tag{3.15} $$

    since $\tilde f_x$ and $\tilde f_y$ are now known. Higher order approximations are obtained by using higher order Taylor expansions in (3.10). We refer to [28] for details.
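    The constrained weighted least-squares reconstruction (3.10) to (3.15) fits in a few lines. The sketch below is one possible reading of the procedure, not the authors' code: the function name `mls_interpolate` and its argument layout are hypothetical, and the neighbor search is a brute-force scan over all supplied points, which a production code would replace by a neighbor list restricted to active points.

```python
import numpy as np

def mls_interpolate(x_t, y_t, xs, ys, fs, h, alpha=6.0):
    """First-order constrained MLS value at (x_t, y_t) from scattered data."""
    dx, dy = xs - x_t, ys - y_t
    r2 = dx**2 + dy**2
    inside = r2 <= h**2                      # neighbors inside the disc of radius h
    order = np.argsort(r2[inside])           # index 0 becomes the nearest neighbor
    px, py, pf, pr2 = (a[inside][order] for a in (xs, ys, fs, r2))
    if px.size < 3:
        raise ValueError("need at least three neighbors inside radius h")
    w = np.exp(-alpha * pr2[1:] / h**2)      # truncated Gaussian weights
    M = np.column_stack((px[1:] - px[0], py[1:] - py[0]))
    b = pf[1:] - pf[0]
    MtW = M.T * w                            # M^T W with diagonal W
    fx, fy = np.linalg.solve(MtW @ M, MtW @ b)            # Eq. (3.14)
    return pf[0] - fx * (px[0] - x_t) - fy * (py[0] - y_t)  # Eq. (3.15)

# Regular grid with spacing 0.1 and a linear test function (illustrative data)
g = np.linspace(0.0, 1.0, 11)
X, Y = np.meshgrid(g, g, indexing="ij")
xs, ys = X.ravel(), Y.ravel()
fs = 2.0 + 3.0 * xs - ys
value = mls_interpolate(0.437, 0.512, xs, ys, fs, h=3.1 * 0.1)
print(value)
```

    For a linear function all residuals $e_j$ vanish, so the reconstruction reproduces the function up to rounding, which gives a simple sanity check for an implementation.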

    In the above least-squares approximation a function is approximated at an arbitrary point from its neighboring points and the distribution of these points can be arbitrary. Such a straightforward least-squares approximation leads to a central difference scheme. In case of discontinuities in the solution, this will lead to numerical oscillations and one has to introduce additional numerical viscosity. This can be done in the least squares framework by adopting a suitably modified version of that approach using an upwind reconstruction.

    Moreover, we note that for the stabilization of higher order approximations a WENO-type reconstruction can be used, see e.g., [1,32,33], where WENO approximations with least squares approaches have been developed for regular and irregular grids.

    For the simulation of the interaction of the rigid body motion with the gas, we overlap the region defined by the rigid body and the region where the BGK model is computed. Those grid points in the gas phase which are overlapped by the rigid body during the motion are flagged as non-active and the others as active. The non-active grid points are taken out of the numerical process and removed from the neighbor lists in the least-squares approximations. After the rigid body has moved, some of the active grid points are overlapped by the rigid body and are redefined as non-active. In turn, some of the non-active grid points leave the overlapping zone of the rigid body and are reactivated for the numerical process. During this process we need to update the distribution function $f(t,x,v)$ on the newly activated grid points. This can be obtained from the neighboring active grid points using the least squares method described above.
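    The bookkeeping of active and non-active points amounts to comparing two boolean masks. The sketch below assumes, purely for illustration, a circular rigid body so that the overlap test is a distance check; the helper names `active_mask` and `newly_activated` are hypothetical, and a general body shape would need its own inside/outside test.

```python
import numpy as np

def active_mask(xs, ys, center, radius):
    """True where a grid point lies outside the (assumed circular) body."""
    return (xs - center[0])**2 + (ys - center[1])**2 > radius**2

def newly_activated(old_active, new_active):
    """Points that were overlapped before the move and are uncovered after it."""
    return new_active & ~old_active

# Regular grid and a body that moves to the right by one grid spacing
g = np.linspace(0.0, 1.0, 21)
X, Y = np.meshgrid(g, g, indexing="ij")
old = active_mask(X, Y, center=(0.50, 0.5), radius=0.21)
new = active_mask(X, Y, center=(0.55, 0.5), radius=0.21)
newly = newly_activated(old, new)
print(newly.sum(), "points need an interpolated distribution function")
```

    The points flagged by `newly_activated` are the ones whose distribution function has to be filled in from active neighbors before the next transport step.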

    On the solid boundary as well as on the boundaries of the moving rigid object we apply diffuse reflection boundary conditions. The boundary points sit on the boundaries, and all boundary points in contact with the gas phase are defined as active points. The boundary points move with the boundary velocities. The boundary conditions are applied on the boundaries of the computational domain as well as on the surface of the rigid body. Let $\rho_w$, $T_w$, $U_w$ and $n$ be the density, temperature, velocity and unit normal of the wall or of the surface of the rigid body. The wall normal vector $n$ points towards the gas domain.

    For $(v-U_w)\cdot n<0$ we obtain the distribution function on the wall $f^{n+1}_w$ from the evolution equation. For $(v-U_w)\cdot n>0$ the distribution function is the Maxwellian with parameters $\rho_w$, $T_w$ and $U_w$, which is given by

    $$ M^{n+1}_w=\frac{\rho_w}{(2\pi RT_w)^{3/2}}\exp\left(-\frac{|v-U_w|^2}{2RT_w}\right). \tag{3.16} $$

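    We note that the density $\rho_w$ is not known and is determined by assuming that the net flux across the wall or surface is zero. This means that we have

    $$ \int_{\mathbb{R}^3,\,(v-U_w)\cdot n>0}(v-U_w)\cdot n\,M^{n+1}_w\,dv+\int_{\mathbb{R}^3,\,(v-U_w)\cdot n<0}(v-U_w)\cdot n\,f^{n+1}_w\,dv=0. \tag{3.17} $$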

    Hence, from (3.16) and (3.17) we obtain

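    $$ \rho_w=-\frac{\displaystyle\int_{\mathbb{R}^3,\,(v-U_w)\cdot n<0}(v-U_w)\cdot n\,f^{n+1}_w\,dv}{\displaystyle\int_{\mathbb{R}^3,\,(v-U_w)\cdot n>0}(v-U_w)\cdot n\,\frac{1}{(2\pi RT_w)^{3/2}}\exp\left(-\frac{|v-U_w|^2}{2RT_w}\right)dv}. \tag{3.18} $$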
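    On a discrete velocity grid the two half-space integrals in (3.18) become masked sums. The following sketch shows one way to evaluate the wall density; the function name `wall_density` is hypothetical and plain summation is assumed as the quadrature rule, which is not necessarily what the authors use.

```python
import numpy as np

def wall_density(f_wall, v, n, U_w, T_w, R):
    """Wall density from the zero net flux condition (3.17)-(3.18)."""
    dv = v[1] - v[0]
    vx, vy, vz = np.meshgrid(v, v, v, indexing="ij")
    cx, cy, cz = vx - U_w[0], vy - U_w[1], vz - U_w[2]
    cn = cx * n[0] + cy * n[1] + cz * n[2]            # (v - U_w) . n
    M_unit = np.exp(-(cx**2 + cy**2 + cz**2) / (2.0 * R * T_w)) / (2.0 * np.pi * R * T_w)**1.5
    influx = (cn * f_wall)[cn < 0].sum() * dv**3      # gas hitting the wall (negative)
    outflux = (cn * M_unit)[cn > 0].sum() * dv**3     # unit-density re-emission
    return -influx / outflux

# Consistency check with made-up data: a gas already in equilibrium with a wall at rest
R, T_w, rho_gas = 208.0, 270.0, 1.8e-4
v = np.linspace(-1200.0, 1200.0, 31)
vx, vy, vz = np.meshgrid(v, v, v, indexing="ij")
f_wall = rho_gas / (2.0 * np.pi * R * T_w)**1.5 * np.exp(-(vx**2 + vy**2 + vz**2) / (2.0 * R * T_w))
rho_w = wall_density(f_wall, v, n=(1.0, 0.0, 0.0), U_w=(0.0, 0.0, 0.0), T_w=T_w, R=R)
print(rho_w)
```

    When the incoming gas is itself a Maxwellian with the wall temperature and wall velocity, the formula returns the gas density, as expected for a wall in equilibrium with the gas.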

    After computing the new distribution function $f^{n+1}_{ijkl}$ we first compute the pressure tensor (2.7) at all boundary points of the rigid body. Then we approximate the force and torque on the rigid body according to (2.6). We obtain the translational and rotational velocities and then move the boundary points and the center of mass accordingly. Finally, we update the normal vector $n$. The new velocity $U_w$ is used to apply the boundary conditions when solving the BGK model. In summary, we use the following coupling algorithm:

    (ⅰ) Generate initial grid points with flags as interior and boundary grids and prescribe the initial conditions in the gas as well as in the solid phases.

    (ⅱ) Determine the active and non-active grids in the gas phase.

    (ⅲ) Update newly activated grid points in the gas phase with the help of interpolation from their active neighbors.

    (ⅳ) Solve the BGK model equation in the active grid points and apply boundary conditions on all boundary points.

    (ⅴ) Compute the force and torque on the boundary points of the rigid body.

    (ⅵ) Solve the Newton-Euler equations and then obtain the new positions, velocities and unit normals of the boundary points of the rigid body.

    (ⅶ) Go to (ⅱ) and repeat until the final time is reached.

    In the following we consider numerical examples in one and two space dimensions and three velocity dimensions. The test cases are given in dimensionless form but can be interpreted in SI-units.

    This problem has been considered in [10,22] in a larger domain. We consider the one-dimensional spatial domain $\Omega=[0,3\times10^{-3}]$. Initially the piston is positioned at $x=1.5\times10^{-4}$. We use $N_x=300$ grid points in physical space and $N_v=30$ grid points in every direction of velocity space. The left boundary moves with velocity

    $$ u_p=10\sin\left(10^{6}\,t\right). $$

    This is a one-way coupling, since the motion of the piston is prescribed. Initially, the grid points with $x<1.5\times10^{-4}$ are overlapped by the piston and are non-active, while the piston position and the right boundary point are active grid points, see Figure 2 for the physical setup of the problem.

    Figure 2.  Geometrical setup for the moving piston problem. The black circles are active grid points, grey circles are non-active grid points and red circles are boundary points.

    When the piston starts to move, the process of activating and deactivating grid points continues throughout the simulation. The final time is $t_{final}=4\times10^{-6}$ and the time step is $\Delta t=10^{-9}$. The minimum and maximum velocities are $v_{\min}=-1200$ and $v_{\max}=1200$. We consider argon gas with molecular diameter $d=0.368\times10^{-9}$, Boltzmann constant $k_b=1.3806\times10^{-23}$ and specific gas constant $R=208$. The initial temperature is $T_0=270$, the initial density is $\rho_0=0.00018$ and the initial mean velocity is $U_0=0$. The corresponding Knudsen number is $Kn=\lambda/L=0.215$, based on the characteristic length $L=3\times10^{-3}-1.5\times10^{-4}$, where $\lambda$ is the mean free path defined by

    $$ \lambda=\frac{k_b}{\sqrt{2}\,\pi\rho_0Rd^2}. \tag{4.1} $$

    To validate the numerical results of the semi-Lagrangian scheme for the BGK model, we compare it with the results of a numerical solution of the full Boltzmann equation via the DSMC method [3,5,19]. For a proper comparison of the BGK model and the DSMC code for the Boltzmann equation we have to relate the relaxation time τ and the mean free path, see [7], as

    $$ \tau=\frac{4\lambda}{\pi\bar C}, \tag{4.2} $$

    where $\bar C=\sqrt{\dfrac{8RT_0}{\pi}}$. The corresponding relaxation time is $\tau=2.0634\times10^{-6}$. Note, however, that DSMC solves the full Boltzmann equation, so differences in the results may be due partly to the different models and partly to the different numerical techniques adopted in the two cases.
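    The quoted Knudsen number and relaxation time follow directly from (4.1) and (4.2). The short script below redoes this arithmetic with the parameter values of this example; the variable names are chosen here for illustration, and the values are interpreted in SI units as suggested above.

```python
import math

k_b = 1.3806e-23         # Boltzmann constant
d = 0.368e-9             # molecular diameter of argon
R = 208.0                # gas constant used in the paper
T0 = 270.0               # initial temperature
rho0 = 1.8e-4            # initial density
L = 3e-3 - 1.5e-4        # characteristic length: domain length minus piston position

lam = k_b / (math.sqrt(2.0) * math.pi * rho0 * R * d**2)   # mean free path, Eq. (4.1)
Kn = lam / L                                               # Knudsen number
C_bar = math.sqrt(8.0 * R * T0 / math.pi)                  # mean thermal speed
tau = 4.0 * lam / (math.pi * C_bar)                        # relaxation time, Eq. (4.2)

print(f"lambda = {lam:.4e}, Kn = {Kn:.3f}, tau = {tau:.4e}")
```

    The computed values reproduce $Kn\approx0.215$ and $\tau\approx2.06\times10^{-6}$ to the precision quoted in the text.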

    Initially, the piston and the gas are at rest. The piston starts oscillating in time and disturbs the gas phase. A wave is formed which eventually creates a shock. The flow is a low Mach number flow and the DSMC results show strong fluctuations. Since the flow is unsteady, one cannot take time averages of the DSMC simulations. Therefore, one has to perform several independent runs. In the DSMC simulations we have used the same number of cells as in the BGK model. To reduce the statistical noise, we have used 10,000 gas molecules per cell initially. Moreover, we have performed 500 independent runs. In Figures 3 to 6 we have plotted the density, temperature and velocity of the gas determined from both numerical methods. The BGK and DSMC solutions are in very good agreement at all times. We note that statistical noise is still visible in the DSMC results even after 500 independent runs.

    Figure 3.  Comparison of BGK and DSMC at time $t=1\times10^{-6}$.
    Figure 4.  Comparison of BGK and DSMC at time $t=2\times10^{-6}$.
    Figure 5.  Comparison of BGK and DSMC at time $t=3\times10^{-6}$.
    Figure 6.  Comparison of BGK and DSMC at time $t=4\times10^{-6}$.

    In both methods we apply diffuse reflection boundary conditions on the piston in the moving frame of reference with $U_w=(u_p,0,0)$, where $u_p$ is the velocity of the piston, and wall temperature $T_w=T_0$. Similarly, we apply diffuse reflection boundary conditions with zero wall velocity and a wall temperature equal to the initial temperature on the right boundary.

    In Example 1, the gas flow was influenced by the motion of the piston, but the gas flow had no influence on the motion of the piston. In this example, we consider a two-way coupling of both phases. The force exerted on the rigid body by the surrounding gas influences the motion of the rigid body and vice versa. We again consider a one-dimensional physical space and a three-dimensional velocity space. We consider the physical domain $[-(L+l),(L+l)]$ as described in Figure 7 with $L=1$ and $l=0.1$, where $2l$ is the thickness of a plate which is driven by the pressure difference at its edges.

    Figure 7.  Schematic view of a plate separating two subdomains with different temperatures. As in the piston problem, the black circles are active grid points, the grey circles are non-active grid points and the red circles are boundary points.

    Initially the plate is located at (0.1,0.1) with center of mass Xc=0, where the gas and the plate are at rest. This problem has been studied in [10]. We reconsider it as a benchmark problem since an analytical solution is available for the equilibrium state. We again consider a monatomic gas with parameters given in Example 1. The initial temperature is T0=270 and the initial pressures P0 are the same on both sides of the plate and are equal to 0.0386. The initial density ρ0 is obtained from the equation of state. The initial Knudsen number is 0.08 based on the characteristic length 2L and the relaxation time τ=5.398×104. Moreover, we have considered different density ratios of the gas and the plate. The other parameters are the same as in the Example 1. We prescribe a higher temperature Tw=330 on the right side of plate and on the right boundary of the computational domain. On the left boundary of the plate and on the left boundary of the computational domain the temperature is kept to T0. Due to the high temperature on the right walls, the pressure on the right hand side starts to increase and the plate starts to move to the left hand side. The motion of the plate is computed from the Euler-Newton equations, where only a translational force is computed for the one dimensional case. Since the plate has two opposite normals ±1, from Eq. (2.6) the total force is given as the difference of pressure

    $$F=(\varphi_{\text{left}}-\varphi_{\text{right}})\,A, \tag{4.3}$$

    where $A$ is the area of the plate. The plate starts oscillating and finally reaches the equilibrium position [10]

    $$x_{\text{equi}}=L\,\frac{T_0-T_w}{T_0+T_w}=-0.1. \tag{4.4}$$
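As a plausibility check of (4.4), and not as the derivation given in [10], one can use a simple quasi-static argument. The assumptions here are ours: the gas is ideal, the gas mass on each side of the plate is conserved, and at equilibrium the left and right gas have relaxed to the wall temperatures $T_0$ and $T_w$. With the plate center at $x$, the two gas regions have lengths $L+x$ and $L-x$, so that

```latex
\begin{align*}
p_{\text{left}}  &= \rho_0 R T_0 \,\frac{L}{L+x}, &
p_{\text{right}} &= \rho_0 R T_w \,\frac{L}{L-x}, \\
p_{\text{left}} = p_{\text{right}}
  &\;\Longrightarrow\; T_0\,(L-x) = T_w\,(L+x)
  &&\;\Longrightarrow\; x = L\,\frac{T_0-T_w}{T_0+T_w}, \\
x &= 1\cdot\frac{270-330}{270+330} = -0.1 .
\end{align*}
```

The negative sign is consistent with the plate moving to the left, away from the heated side.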

    The domain is discretized with $N_x=300$ cells. The velocity grid is given by $N_v=20$ cells for the BGK equations. The final time is $0.5$. The time step is $\Delta t=4\times10^{-6}$. The other parameters are the same as in Example 1. The explicit Euler method, with the same time step as for the BGK model, is used for the time integration of the Newton-Euler equations.
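The explicit Euler update for the plate can be sketched as follows. This is a minimal illustration, not the authors' code: `euler_step` is a hypothetical helper, and `toy_force` replaces the kinetic pressure of Eq. (2.6) with a quasi-static ideal-gas estimate (our assumption), which is only meant to show the sign of the force and where it vanishes.

```python
def toy_force(x, L=1.0, P0=0.0386, T0=270.0, Tw=330.0, A=1.0):
    """Hypothetical quasi-static force F = (p_left - p_right) * A.

    Assumes an ideal gas with conserved mass on each side, the left gas
    at T0 and the right gas at Tw. This is NOT the BGK pressure.
    """
    p_left = P0 * L / (L + x)               # gas region of length L + x
    p_right = P0 * (Tw / T0) * L / (L - x)  # gas region of length L - x
    return (p_left - p_right) * A


def euler_step(x, v, force, mass, dt):
    """One explicit (forward) Euler step of m dv/dt = F, dx/dt = v."""
    x_new = x + dt * v
    v_new = v + dt * force / mass
    return x_new, v_new


# Plate initially centered at x = 0 and at rest; dt as in the text.
x, v = euler_step(0.0, 0.0, toy_force(0.0), mass=1.0, dt=4e-6)
```

In this toy model the force at $x=0$ is negative, so the plate starts moving to the left, and the force vanishes at $x=-0.1$. The mass value `1.0` is a placeholder, not a parameter from the paper.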

    In this test case we have simulated a wide range of density ratios of gas and plate, from 1 to 10 up to 1 to 1000. In Figure 8 we have plotted the velocity of the plate with respect to time together with the exact equilibrium solution. One can observe, as expected, that a lower-density plate reaches the equilibrium position earlier than the heavier plates. We remark, however, that the change of density ratio has been used just to change the mass of the plate. Indeed, the dynamics depend only on the mass of the plate, not on its density. We choose to change the density just to allow a finite size of the plate (which is left unchanged in our simulations).

    Figure 8.  Comparison of velocity vs time of the plate for density ratios of gas and plate 1 to 10 (top left), 1 to 100 (top right) and 1 to 1000 (bottom) for $N_x=500$ and $N_v=20$.

    Additionally, we have performed a convergence study for the case of a density ratio of 1 to 50. The results for the plate position and velocity are reported in Figure 9. We note that for $N_x=300$ we obtain an accurate approximation of the equilibrium value for the velocity, whereas the equilibrium position still deviates from the analytical value. This is due to the first-order error of the numerical scheme and an accumulation of very small numerical errors in the velocity during the integration process. Increasing the number of grid points $N_x$ in physical space, we obtain convergence towards the analytical equilibrium position. We remark that a velocity grid with $N_v=30$ gives almost the same results, see [27] for a numerical comparison.

    Figure 9.  Comparison of position and velocity vs time of the plate with the density ratio 1 to 50 for $N_x=300$, $400$ and $500$ cells and $N_v=20$.

    We have further compared the solutions of the BGK model with the DSMC simulations for $N_x=500$. In the DSMC simulations we have again considered 400 gas molecules per cell initially. The boundary conditions and other parameters are the same in both methods. We have performed 50 independent runs. In Figure 10 we have plotted the position of the center of mass and the velocity of piston vs time. We note that the time evolution of the DSMC solutions and of the solutions obtained from the BGK semi-Lagrangian method are very close to each other.

    Figure 10.  The solution for a Knudsen number of 0.08. Left: early-stage position of the center of mass of piston vs time. Right: velocity of the plate vs time with the density ratio 1 to 50. The red solid line represents the exact equilibrium position and the blue line represents the numerical values.

    Furthermore, in Figure 11 we have plotted the temperature obtained from both methods at times $t=0.1$, $0.2$ and $0.4$. We see that at time $t=0.1$ the temperature has not yet reached the equilibrium state, but after $t=0.2$ the temperature of the gas on the left reaches the left wall temperature and the temperature of the gas on the right reaches the right wall temperature.

    Figure 11.  Temperature plots at times 0.1,0.2 and 0.4 for the density ratio 1 to 50.

    Similarly, we have plotted the velocity field of the gas on both sides of the plate in Figure 12 at different times for both methods. Here we also observe that the DSMC solutions fluctuate around the BGK solutions.

    Figure 12.  Velocity plots at times 0.1,0.2 and 0.4 for the density ratio 1 to 50.

    Finally, in Figure 13, we have plotted the pressure obtained from the BGK model and the DSMC simulations. In the beginning, the pressure on the right increases due to the increase of the temperature. At time $t=0.1$ the pressure on the left is slightly larger than on the right side. At time $t=0.2$ the pressure on the left is still larger, which is clearly visible in the figure. The pressure fluctuates and finally reaches the equilibrium state, where it is equal on both sides. The relative error of the computation with respect to the analytical solution [10] is approximately 0.7% for the pressure and 0.6% for the density in the stationary state for the finest discretization $N_x=500$.

    Figure 13.  Pressure plots at times 0.1,0.2 and 0.4 for the density ratio 1 to 50.

    Here, we consider a spatially 2-dimensional problem with three-dimensional velocity space. The flow in a cavity driven by the velocity on the top is a widely used benchmark problem for testing and comparing numerical methods. We consider a micron size square cavity. The top wall has velocity

    $$U_x=u_w,\quad U_y=0, \tag{4.5}$$

    and on the other three walls we have $U_x=U_y=0$. The temperature is kept constant at $T_0=270$ on all walls, the initial density is $\rho_0=1$, the wall velocity is $u_w=1$ and the gas constant is $R=208$. The gas is again monatomic with parameters as given in Example 1. The Knudsen number $Kn=0.1$ is based on the characteristic length given by the size of the wall. Diffuse reflection boundary conditions are applied on all walls. In Figure 14 we have plotted the regular and irregular grid points used for the simulation. Figure 15 shows the velocity fields and the vorticity obtained from the BGK equation for regular as well as irregular grid points. We use $50\times50$ regular grid points, and approximately the same number of irregular grid points is generated. The time step is chosen as $5\times10^{-11}$. The simulation was stopped after time $t=4\times10^{-7}$. Moreover, Figure 16 compares the $x$- and $y$-velocity components along the center lines in $y$- and $x$-direction, respectively. We observe that the solutions obtained from the regular and irregular grids are almost the same.

    Figure 14.  Regular grids (left) and irregular grids (right) for solving the BGK model.
    Figure 15.  Velocity field and out-of-plane vorticity obtained from regular (left) and irregular (right) grids from the BGK model.
    Figure 16.  Comparison of the velocity components obtained on regular as well as irregular grids in the center of the cavity. Left: $x$-component of velocity along the center line in $y$-direction. Right: $y$-component of velocity along the center line in $x$-direction for $u_w=1$.

    Moreover, we compare the semi-Lagrangian scheme for the BGK model with DSMC simulations for the Boltzmann equation for this example. The mean free path and the relaxation time are chosen according to Eq. (4.2). First, we choose the velocity of the upper wall as $u_w=1$ in positive $x$-direction. In the DSMC simulations we have taken the same time step, the same number of cells and the same gas parameters as in the BGK model, and we have also applied diffuse reflection boundary conditions on all walls. In this case we look at the steady state solution. Therefore, unlike in the earlier two examples, we do not average over independent runs, but over time.

    In Figure 17 on the left we have run the DSMC simulation up to $10^5$ time steps, where the last $9\times10^4$ time steps are used for the averaging over the samples. In this case the fluctuations dominate the flow field. In Figure 17 on the right $3\times10^6$ time steps are used for the averaging over the samples.
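The effect of the sample count can be estimated with the usual Monte Carlo scaling, in which the statistical error of an average decays like $1/\sqrt{N}$. This is a rough sketch under the assumption of statistically independent samples, which successive DSMC time steps satisfy only approximately; the function name is ours.

```python
import math


def noise_reduction_factor(n_small, n_large):
    """Expected ratio of statistical errors when the number of averaged
    samples grows from n_small to n_large (assumes independent samples)."""
    return math.sqrt(n_large / n_small)


# Sample counts quoted in the text: 9e4 versus 3e6 time steps.
factor = noise_reduction_factor(9e4, 3e6)
print(f"expected noise reduction: about {factor:.2f}x")
```

Under this assumption, roughly 33 times more samples reduce the noise by a factor of less than 6, which is in line with the slow improvement of the DSMC results for this low-speed flow.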

    Figure 17.  DSMC simulations with $9\times10^4$ samples (left) and with $3\times10^6$ samples (right) for $u_w=1$.

    In Figure 18 we have plotted the $x$-velocity component $U_x$ along the central vertical line as well as the $y$-velocity component $U_y$ along the central horizontal line for the case with $9\times10^4$ samples. We observe again that the DSMC results oscillate strongly compared to the BGK solutions.

    Figure 18.  Left: $x$-component of velocity along the center line in $y$-direction. Right: $y$-component of velocity along the center line in $x$-direction. Both with $9\times10^4$ samples for $u_w=1$.

    Similarly, in Figure 19 we have plotted the $x$-velocity component $U_x$ along the central vertical line as well as the $y$-velocity component $U_y$ along the central horizontal line for $3\times10^6$ samples. The DSMC results improve, but still fluctuate around the BGK solutions.

    Figure 19.  Left: $x$-component of velocity along the center line in $y$-direction. Right: $y$-component of velocity along the center line in $x$-direction. Both with $3\times10^6$ samples for $u_w=1$.

    This example is the direct extension of Example 1 to two space dimensions. Here we have used a Chu reduction [9,14] to reduce the dimension of the velocity space from three to two. We have taken this problem from the paper by Frangi et al. [13], where the authors have studied the biaxial accelerometer produced by STMicroelectronics with a surface micro-machining process. The authors have analysed the problem by considering a two-dimensional simplification. In Figure 20 we have sketched the computational domain in detail. The shuttle lies initially in the middle of the domain. In the rest of the domain a gas flow is taking place. The shuttle oscillates with the velocity $\cos(2\pi f_0 t)$, where $f_0$ is the frequency. The parameters shown in Figure 20 are $L_1=19.2\times10^{-6}$, $d_1=4.2\times10^{-6}$, $d_2=2.6\times10^{-6}$, $d_3=5\times10^{-6}$, $d_4=3.9\times10^{-6}$ and $d_5=18.8\times10^{-6}$. In [13] the frequency has been taken as $f_0=4400$ Hz, but with this frequency the shuttle crosses the upper and lower boundaries. Therefore, we have chosen $f_0=40\times4400$ Hz such that the maximum amplitude of the shuttle is half of the distance $d_2$. The initial pressure of the gas is equal to 0.1 bar, which corresponds to the initial density $\rho_0=0.1641$. The initial distribution $f_0$ of the gas is the Maxwellian with zero mean velocity, initial temperature $T_0=293$ and initial density $\rho_0$. The diffuse reflection boundary condition with wall temperature $T_0$ is applied on the solid lines and a far field boundary condition $f_0$ is applied on the dotted lines. We note that here we solve the real motion of the shuttle, while in [13] the authors solve the stationary equations with an assigned non-zero velocity on the boundary.
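The quoted initial density can be reproduced from the ideal-gas equation of state $\rho = p/(RT)$. The short check below assumes SI units and that the gas constant $R=208$ stated for the driven cavity example also applies here; the function name is ours.

```python
def ideal_gas_density(pressure, gas_constant, temperature):
    """Density from the ideal-gas equation of state, rho = p / (R * T)."""
    return pressure / (gas_constant * temperature)


# 0.1 bar = 1.0e4 Pa; R = 208 is an assumption carried over from the
# cavity example; T0 = 293 as stated for this example.
rho0 = ideal_gas_density(0.1 * 1.0e5, 208.0, 293.0)
print(f"rho0 = {rho0:.4f}")
```

Rounded to four decimals this agrees with the value $\rho_0=0.1641$ quoted in the text.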

    Figure 20.  Geometry setup for the moving 2D shuttle.

    In Figure 21 we have plotted the velocity vector fields as well as the $x$- and $y$-components of the velocity at times $t=1.2\times10^{-6}$ and $t=3.6\times10^{-6}$.

    Figure 21.  First row: velocity fields at time $t=1.2\times10^{-6}$ and $t=3\times10^{-6}$. Second row: $x$- and $y$-velocity components at time $t=1.2\times10^{-6}$. Third row: $x$- and $y$-velocity components at time $t=3.6\times10^{-6}$.

    In Figure 22 we have plotted the normal stress tensor on the top wall of the shuttle at time $t=1.2\times10^{-6}$. As a reference solution we consider the one obtained at the finest resolution with cell size $4.84\times10^{-8}$, which corresponds to 111709 grid points including boundary points. The time step is $\Delta t=3.20\times10^{-11}$, which corresponds to a CFL number of 0.92. For the convergence study we have considered the coarser grids with sizes $7.5625\times10^{-8}$, $1.5125\times10^{-7}$, $3.025\times10^{-7}$ and $6.050\times10^{-7}$ and changed the time steps such that the CFL number stays equal to 0.92.

    Figure 22.  The normal stress tensor on the top wall of the shuttle at $t=1.2\times10^{-6}$ for different cell sizes.

    To estimate the error of the solutions at the different resolutions shown in Figure 22, we have generated a fixed number $N=100$ of equidistant points. On these points we have interpolated the stress tensors from the different resolutions, including the reference solution, and then defined the relative error as

    $$L_{\text{relerror}}=\frac{\sum_{i=1}^{N}\left|\varphi^{\text{ref}}_{yy,i}-\varphi^{\Delta x}_{yy,i}\right|}{\sum_{i=1}^{N}\varphi^{\text{ref}}_{yy,i}}. \tag{4.6}$$

    We note that $\varphi^{\text{ref}}_{yy,i}$ is the interpolated reference solution and $\varphi^{\Delta x}_{yy,i}$ is the interpolated solution for grid size $\Delta x$. In Table 1 we have presented the relative error of the normal stress tensor $\varphi_{yy}$ at the same time. The errors in the table show the first-order convergence of the scheme.
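The error measure of Eq. (4.6) can be sketched in a few lines. This is an illustration under our own assumptions: the paper does not state which interpolation is used to transfer the solutions to the $N=100$ equidistant points, so piecewise-linear interpolation is used here, and all function and argument names are hypothetical.

```python
from bisect import bisect_right


def interp_linear(xq, xs, ys):
    """Piecewise-linear interpolation of the samples (xs, ys) at xq.
    xs must be strictly increasing; values are clamped at the ends."""
    if xq <= xs[0]:
        return ys[0]
    if xq >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, xq) - 1
    w = (xq - xs[j]) / (xs[j + 1] - xs[j])
    return (1.0 - w) * ys[j] + w * ys[j + 1]


def relative_error(x_ref, phi_ref, x_dx, phi_dx, a, b, n=100):
    """Relative error in the form of Eq. (4.6), evaluated on n
    equidistant points of the interval [a, b]."""
    pts = [a + (b - a) * i / (n - 1) for i in range(n)]
    ref = [interp_linear(p, x_ref, phi_ref) for p in pts]
    sol = [interp_linear(p, x_dx, phi_dx) for p in pts]
    return sum(abs(r - s) for r, s in zip(ref, sol)) / sum(ref)


# Toy data: constant reference 2.0 against a coarse solution of 2.1.
err = relative_error([0.0, 0.5, 1.0], [2.0, 2.0, 2.0],
                     [0.0, 1.0], [2.1, 2.1], a=0.0, b=1.0)
```

Note that, as in Eq. (4.6), the denominator is the plain sum of the reference values, so the measure is only meaningful when that sum is non-zero.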

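    Table 1.  Convergence study of the two-dimensional moving shuttle at time $t=1.2\times10^{-6}$.
    $\Delta x$                  Relative error
    $6.025\times10^{-7}$        $9.3806\times10^{-3}$
    $3.025\times10^{-7}$        $3.3410\times10^{-3}$
    $1.5125\times10^{-7}$       $1.7921\times10^{-3}$
    $7.5625\times10^{-8}$       $6.5300\times10^{-4}$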
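The convergence order implied by Table 1 can be estimated from consecutive rows via $p=\log(e_1/e_2)/\log(\Delta x_1/\Delta x_2)$. The following sketch uses the table values exactly as printed; the function name is ours.

```python
import math

# Grid sizes and relative errors as listed in Table 1.
dx = [6.025e-7, 3.025e-7, 1.5125e-7, 7.5625e-8]
err = [9.3806e-3, 3.3410e-3, 1.7921e-3, 6.5300e-4]


def observed_orders(h, e):
    """Observed order of convergence between consecutive refinements."""
    return [math.log(e[i] / e[i + 1]) / math.log(h[i] / h[i + 1])
            for i in range(len(h) - 1)]


orders = observed_orders(dx, err)
for p in orders:
    print(f"observed order: {p:.2f}")
```

The three estimates are about 1.5, 0.9 and 1.5, scattered around one, which is compatible with the first-order behaviour stated in the text.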


    Consider a circular rigid body immersed in a monatomic gas in a micro-scale square cavity. We consider a 2D spatial and 3D velocity domain. As in Example 3, the top wall has constant velocity in the positive $x$-direction. The initial and boundary conditions are the same as in Example 3. Initially, gas and rigid body are at rest. The rigid body is located at the center of the cavity.

    We proceed as in Example 2. The force and the torque are computed according to Eq. (2.6). The density of the rigid body is 10 times larger than the density of the gas. We have performed the simulations for different density ratios. We observed that if the density of the rigid body is at least ten times smaller than the density of the gas, instabilities occur in the present setup. A more quantitative comparison of the trajectories for objects with different densities requires a more accurate scheme and will be performed in a future paper. Again, we use the explicit Euler scheme for the time discretization of the Newton-Euler equations and the same time step as for the BGK model. The upper wall moves with velocity $u_w=30$. We simulate up to the final time $t_{\text{final}}=4.4\times10^{-7}$. In Figure 23 we have plotted the path of the center of mass of the body and its positions at different times.
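In two space dimensions the explicit Euler update acts on the translational and the rotational state of the body. The sketch below is a generic illustration of such a step and not the authors' implementation; the class, the function and the field names are hypothetical, and the force and torque are taken as given inputs instead of being evaluated from Eq. (2.6).

```python
from dataclasses import dataclass


@dataclass
class RigidBody2D:
    x: float        # center of mass, x-coordinate
    y: float        # center of mass, y-coordinate
    vx: float       # translational velocity, x-component
    vy: float       # translational velocity, y-component
    theta: float    # orientation angle
    omega: float    # angular velocity
    mass: float
    inertia: float  # moment of inertia about the center of mass


def newton_euler_step(body, fx, fy, torque, dt):
    """One explicit Euler step of the planar Newton-Euler equations
    m dv/dt = F, I domega/dt = torque, dx/dt = v, dtheta/dt = omega."""
    return RigidBody2D(
        x=body.x + dt * body.vx,
        y=body.y + dt * body.vy,
        vx=body.vx + dt * fx / body.mass,
        vy=body.vy + dt * fy / body.mass,
        theta=body.theta + dt * body.omega,
        omega=body.omega + dt * torque / body.inertia,
        mass=body.mass,
        inertia=body.inertia,
    )


# Body at rest, pushed by a placeholder force and torque for one step.
body = RigidBody2D(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, mass=1.0, inertia=0.5)
body = newton_euler_step(body, fx=2.0, fy=0.0, torque=1.0, dt=0.1)
```

For a homogeneous disc of mass $m$ and radius $r$ the moment of inertia is $mr^2/2$. All numerical values in the usage lines are placeholders.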

    Figure 23.  Positions of the circular particle at times $t=0$, $5\times10^{-8}$, $1\times10^{-7}$, $1.5\times10^{-7}$, $2\times10^{-7}$ and $2.5\times10^{-7}$ (clockwise direction) together with the trajectory of the center of mass.

    Since there are no analytical or experimental results to validate the numerical solutions, we validate our solutions with DSMC simulations for the Boltzmann equation. We use the same initial and boundary conditions and the same parameters for both schemes. We consider $u_w=10$, $20$ and $30$. As we have seen in earlier examples, the DSMC results are dominated by statistical noise for flows with smaller Mach numbers. In Figure 24 we have compared the trajectories of the center of mass obtained from the BGK model and from the DSMC simulations, for which 10 independent runs are carried out. We observe that increasing the wall velocity $u_w$ gives a better agreement between the numerical solution of the BGK model and the DSMC solution.

    Figure 24.  Comparison of the trajectories of the center of mass obtained from the BGK model and DSMC simulations for $u_w=10$ (left), $20$ (middle) and $30$ (right).

    To show that the scheme is able to simulate the interaction of the gas with an arbitrarily shaped rigid body, we have considered a 2D spatial and 2D velocity domain and three different types of bodies: triangular, L-shaped and chiral particles. For these three shapes, rotational effects are clearly observed. The initial and boundary conditions are as in Example 5. The density of the rigid bodies is again 10 times larger than the density of the gas. The upper wall has velocity $u_w=30$ and the simulations are stopped after time $4.4\times10^{-7}$ in all cases. All rigid bodies are initialised at the center of the cavity. In Figures 25–27 we have plotted the positions at different times together with the trajectories of the triangular, L-shaped and chiral particles, respectively. In all cases we see that the rigid bodies follow the flow path. We mention here that in [24] a general method for the simulation of arbitrarily shaped objects in a rarefied gas has been presented. In that paper the gas satisfies the Boltzmann transport equation, which is solved by DSMC.

    Figure 25.  Positions of the triangular particle at times $t=0$, $5\times10^{-8}$, $1\times10^{-7}$, $1.5\times10^{-7}$, $2\times10^{-7}$, $2\times10^{-7}$ and $3\times10^{-7}$ (clockwise direction) together with the trajectory of the center of mass.
    Figure 26.  Positions of the L-shaped particle at times $t=0$, $5\times10^{-8}$, $1\times10^{-7}$ and $2.5\times10^{-7}$ (clockwise direction) together with the trajectory of the center of mass.
    Figure 27.  Positions of the chiral particle at times $t=0$, $5\times10^{-8}$, $2\times10^{-7}$ and $3.5\times10^{-7}$ together with the trajectory of the center of mass.

    In this paper, we have presented a mesh-free method for the simulation of moving rigid bodies immersed in a rarefied gas flow. The motion of the rigid body is obtained by solving the Newton-Euler equations, where the force and the torque are computed from the surrounding gas. The Newton-Euler equations are solved by an explicit Euler method. The rarefied gas is simulated by solving the BGK model of the Boltzmann equation. A semi-Lagrangian method is used to solve the BGK model, where a first-order least squares approximation is used for the interpolation scheme. Several numerical tests are performed in order to validate the method, both in one and in two space dimensions. In particular, in 1D we consider the case of a moving plate immersed in a rarefied gas. In a first test we assume that the motion of the plate is prescribed (one-way coupling), while in a second test the motion of the plate is computed from Newton's equations (two-way coupling). In both cases we compared the results with those obtained by a DSMC solution of the Boltzmann equation. Note that the DSMC results required averaging over many runs in order to decrease the statistical fluctuations. In two space dimensions we considered several test problems: some in which the motion of the object is prescribed, such as the classical driven cavity (where the results are compared with DSMC) and the motion of the shuttle in a 2D model of a Micro Electro Mechanical System (where the results are compared with others available in the literature [13]). Finally, some tests are performed with a rigid body of arbitrary shape immersed in a gas and driven by the flow (two-way coupling). In some cases the results are compared with those obtained by DSMC. In the regimes we investigated there is a good qualitative agreement between the solutions obtained by the BGK model and by the full Boltzmann equation simulated by DSMC. Of course, accurate DSMC solutions require a computational time which is several orders of magnitude higher than the one needed by the numerical solution of the BGK model.

    In this paper we consider a one-way heat exchange: the temperature of the rigid body is assumed to be constant in space and time, which is equivalent to supposing that the heat capacity of the rigid body is much larger than that of the gas. In future work we shall remove this approximation and consider a finite heat capacity of the rigid body. As a first step in this direction we shall assume a rigid body with infinite conductivity, which makes the temperature of the body constant in space. Later on we shall model heat diffusion in the solid as well.

    Moreover, the scheme will be extended to the case of gas mixtures [15] and to three space dimensions. An interesting application of the method will include the treatment of several bodies immersed in a rarefied gas. In this way it will be possible to model a collection of mesoscopic particles dispersed in a rarefied gas, thus providing a quantitative tool that can be used to validate homogenised macroscopic models of suspensions.

    From the methodological point of view, further research directions will include the use of non-oscillatory higher-order methods in space and time based on least squares approaches, see [1] for a combination of WENO and least squares approaches for fluid dynamic equations and high-order approximation of boundary conditions.

    This work is supported by the DFG (German research foundation) under Grant No. KL 1105/30-1 and by the ITN-ETN Marie-Curie Horizon 2020 program ModCompShock, Modeling and computation of shocks and interfaces, Project ID: 642768.

    The authors declare no conflict of interest.



    [1] P. Branco, L. Torgo, R. P. Ribeiro, A survey of predictive modeling on imbalanced domains, ACM Comput. Surv., 49 (2016), 1–50. https://doi.org/10.1145/2907070 doi: 10.1145/2907070
    [2] H. Guo, Y. Li, J. Shang, M. Gu, Y. Huang, B. Gong, Learning from class-imbalance data: Review of methods and applications, Expert Syst. Appl., 73 (2017), 220–239. https://doi.org/10.1016/j.eswa.2016.12.035 doi: 10.1016/j.eswa.2016.12.035
    [3] Y. Qian, S. Ye, Y. Zhang, J. Zhang, SUMO-Forest: A Cascade Forest based method for the prediction of SUMOylation sites on imbalanced data, Gene, 741 (2020), 144536. https://doi.org/10.1016/j.gene.2020.144536 doi: 10.1016/j.gene.2020.144536
    [4] P. D. Mahajan, A. Maurya, A. Megahed, A. Elwany, R. Strong, J. Blomberg, Optimizing predictive precision in imbalanced datasets for actionable revenue change prediction, Eur. J. Oper. Res., 285 (2020), 1095–1113. https://doi.org/10.1016/j.ejor.2020.02.036 doi: 10.1016/j.ejor.2020.02.036
    [5] G. Chen, Z. Ge, SVM-tree and SVM-forest algorithms for imbalanced fault classification in industrial processes, IFAC J. Syst. Control, 8 (2019), 100052. https://doi.org/10.1016/j.ifacsc.2019.100052 doi: 10.1016/j.ifacsc.2019.100052
    [6] P. Wang, F. Su, Z. Zhao, Y. Guo, Y. Zhao, B. Zhuang, Deep class-skewed learning for face recognition, Neurocomputing, 363 (2019), 35–45. https://doi.org/10.1016/j.neucom.2019.04.085 doi: 10.1016/j.neucom.2019.04.085
    [7] Y. S. Li, H. Chi, X. Y. Shao, M. L. Qi, B. G. Xu, A novel random forest approach for imbalance problem in crime linkage, Knowledge-Based Syst., 195 (2020), 105738. https://doi.org/10.1016/j.knosys.2020.105738 doi: 10.1016/j.knosys.2020.105738
    [8] S. Barua, M. M. Islam, X. Yao, K. Murase, MWMOTE-majority weighted minority oversampling technique for imbalanced data set learning, IEEE Trans. Knowl. Data Eng., 26 (2012), 405–425. https://doi.org/10.1109/TKDE.2012.232 doi: 10.1109/TKDE.2012.232
    [9] G. E. A. P. A. Batista, R. C. Prati, M. C. Monard, A study of the behavior of several methods for balancing machine learning training data, ACM SIGKDD Explorations Newsl., 6 (2004), 20–29. https://doi.org/10.1145/1007730.1007735 doi: 10.1145/1007730.1007735
    [10] K. E. Bennin, J. Keung, P. Phannachitta, A. Monden, S. Mensah, MAHAKIL: diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction, IEEE Trans. Software Eng., 44 (2017), 534–550. https://doi.org/10.1109/TSE.2017.2731766 doi: 10.1109/TSE.2017.2731766
    [11] N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, SMOTE: Synthetic minority over-sampling technique, J. Artif. Intell. Res., 16 (2002), 321–357. https://doi.org/10.1613/jair.953 doi: 10.1613/jair.953
    [12] M. Zheng, T. Li, X. Zheng, Q. Yu, C. Chen, D. Zhou, et al., UFFDFR: Undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection for imbalanced data classification, Inf. Sci., 576 (2021), 658–680. https://doi.org/10.1016/j.ins.2021.07.053 doi: 10.1016/j.ins.2021.07.053
    [13] G. Ahn, Y. J. Park, S. Hur, A membership probability-based undersampling algorithm for imbalanced data, J. Classif., 38 (2021), 2–15. https://doi.org/10.1007/s00357-019-09359-9 doi: 10.1007/s00357-019-09359-9
    [14] M. Li, A. Xiong, L. Wang, S. Deng, J. Ye, ACO Resampling: Enhancing the performance of oversampling methods for class imbalance classification, Knowledge-Based Syst., 196 (2020), 105818. https://doi.org/10.1016/j.knosys.2020.105818 doi: 10.1016/j.knosys.2020.105818
    [15] T. Pan, J. Zhao, W. Wu, J. Yang, Learning imbalanced datasets based on SMOTE and Gaussian distribution, Inf. Sci., 512 (2020), 1214–1233. https://doi.org/10.1016/j.ins.2019.10.048 doi: 10.1016/j.ins.2019.10.048
    [16] T. Zhang, Y. Li, X. Wang, Gaussian prior based adaptive synthetic sampling with non-linear sample space for imbalanced learning, Knowledge-Based Syst., 191 (2020), 105231. https://doi.org/10.1016/j.knosys.2019.105231 doi: 10.1016/j.knosys.2019.105231
    [17] R. Batuwita, V. Palade, FSVM-CIL: Fuzzy support vector machines for class imbalance learning, IEEE Trans. Fuzzy Syst., 18 (2010), 558–571. https://doi.org/10.1109/TFUZZ.2010.2042721 doi: 10.1109/TFUZZ.2010.2042721
    [18] C. L. Castro, A. P. Braga, Novel cost-sensitive approach to improve the multilayer perceptron performance on imbalanced data, IEEE Trans. Neural Networks Learn. Syst., 24 (2013), 888–899. https://doi.org/10.1109/TNNLS.2013.2246188 doi: 10.1109/TNNLS.2013.2246188
    [19] S. Datta, S. Das, Near-Bayesian Support Vector Machines for imbalanced data classification with equal or unequal misclassification costs, Neural Networks, 70 (2015), 39–52. https://doi.org/10.1016/j.neunet.2015.06.005 doi: 10.1016/j.neunet.2015.06.005
    [20] H. Yu, C. Mu, C. Sun, W. Yang, X. Yang, X. Zuo, Support vector machine-based optimized decision threshold adjustment strategy for classifying imbalanced data, Knowledge-Based Syst., 76 (2015), 67–78. https://doi.org/10.1016/j.knosys.2014.12.007 doi: 10.1016/j.knosys.2014.12.007
    [21] H. Yu, C. Sun, X. Yang, W. Yang, J. Shen, Y. Qi, ODOC-ELM: Optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data, Knowledge-Based Syst., 92 (2016), 55–70. https://doi.org/10.1016/j.knosys.2015.10.012 doi: 10.1016/j.knosys.2015.10.012
    [22] Z. H. Zhou, X. Y. Liu, Training cost-sensitive neural networks with methods addressing the class imbalance problem, IEEE Trans. Knowl. Data Eng., 18 (2006), 63–77. https://doi.org/10.1109/TKDE.2006.17 doi: 10.1109/TKDE.2006.17
    [23] D. Devi, S. K. Biswas, B. Purkayastha, Learning in presence of class imbalance and class overlapping by using one-class SVM and undersampling technique, Connect. Sci., 31 (2019), 105–142. https://doi.org/10.1080/09540091.2018.1560394 doi: 10.1080/09540091.2018.1560394
    [24] R. Barandela, R. M. Valdovinos, J. S. Sanches, New applications of ensemble of classifiers, Pattern Anal. Appl., 6 (2003), 245–256. https://doi.org/10.1007/s10044-003-0192-z doi: 10.1007/s10044-003-0192-z
    [25] N. V. Chawla, A. Lazarevic, L. O. Hall, K. W. Bowyer, SMOTEBoost: Improving prediction of the minority class in Boosting, in Knowledge Discovery in Databases: PKDD 2003, (2003), 107–119. https://doi.org/10.1007/978-3-540-39804-2_12
    [26] G. Collell, D. Prelec, K. R. Patil, A simple plug-in bagging ensemble based on threshold-moving for classifying binary and multiclass imbalanced data, Neurocomputing, 275 (2018), 330–340. https://doi.org/10.1016/j.neucom.2017.08.035 doi: 10.1016/j.neucom.2017.08.035
    [27] W. Fan, S. J. Stolfo, J. Zhang, P. K. Chan, AdaCost: Misclassification cost-sensitive boosting, in International Conference of Machine Learning, (1999), 97–105. Available from: http://ids.cs.columbia.edu/sites/default/files/Adacost_Imbalanced_classes.pdf.
    [28] M. Galar, A. Fernandez, E. Barrenechea, F. Herrera, EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling, Pattern Recognit., 46 (2013), 3460–3471. https://doi.org/10.1016/j.patcog.2013.05.006 doi: 10.1016/j.patcog.2013.05.006
    [29] P. Lim, C. K. Goh, K. C. Tan, Evolutionary Cluster-Based Synthetic Oversampling Ensemble (ECO-Ensemble) for imbalance learning, IEEE Trans. Cybern., 47 (2016), 2850–2861. https://doi.org/10.1109/TCYB.2016.2579658 doi: 10.1109/TCYB.2016.2579658
    [30] X. Y. Liu, J. Wu, Z. H. Zhou, Exploratory undersampling for class-imbalance learning, IEEE Trans. Syst. Man Cybern. Part B Cybern., 39 (2008), 539–550. https://doi.org/10.1109/TSMCB.2008.2007853 doi: 10.1109/TSMCB.2008.2007853
    [31] S. E. Roshan, S. Asadi, Improvement of Bagging performance for classification of imbalanced datasets using evolutionary multi-objective optimization, Eng. Appl. Artif. Intell., 87 (2020), 103319. https://doi.org/10.1016/j.engappai.2019.103319 doi: 10.1016/j.engappai.2019.103319
    [32] A. Roy, R. M. O. Cruz, R. Sabourin, G. D. C. Cavalcanti, A study on combining dynamic selection and data preprocessing for imbalance learning, Neurocomputing, 286 (2018), 179–192. https://doi.org/10.1016/j.neucom.2018.01.060 doi: 10.1016/j.neucom.2018.01.060
    [33] C. Seiffert, T. M. Khoshgoftaar, J. V. Hulse, A. Napolitano, RUSBoost: A hybrid approach to alleviating class imbalance, IEEE Trans. Syst. Man Cybern. Part A Syst. Humans, 40 (2009), 185–197. https://doi.org/10.1109/TSMCA.2009.2029559 doi: 10.1109/TSMCA.2009.2029559
    [34] Y. Sun, M. S. Kamel, A. K. C. Wong, Y. Wang, Cost-sensitive boosting for classification of imbalanced data, Pattern Recognit., 40 (2007), 3358–3378. https://doi.org/10.1016/j.patcog.2007.04.009 doi: 10.1016/j.patcog.2007.04.009
    [35] B. Tang, H. He, GIR-based ensemble sampling approaches for imbalanced learning, Pattern Recognit., 71 (2017), 306–319. https://doi.org/10.1016/j.patcog.2017.06.019 doi: 10.1016/j.patcog.2017.06.019
    [36] D. Tao, X. Tang, X. Li, X. Wu, Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval, IEEE Trans. Pattern Anal. Mach. Intell., 28 (2006), 1088–1099. https://doi.org/10.1109/TPAMI.2006.134 doi: 10.1109/TPAMI.2006.134
    [37] S. Wang, X. Yao, Diversity analysis on imbalanced data sets by using ensemble models, in 2009 IEEE Symposium on Computational Intelligence and Data Mining, (2009), 324–331. https://doi.org/10.1109/CIDM.2009.4938667
    [38] H. Yu, J. Ni, An improved ensemble learning method for classifying high-dimensional and imbalanced biomedicine data, IEEE/ACM Trans. Comput. Biol. Bioinf., 11 (2014), 657–666. https://doi.org/10.1109/TCBB.2014.2306838 doi: 10.1109/TCBB.2014.2306838
    [39] H. G. Zefrehi, H. Altincay, Imbalance learning using heterogeneous ensembles, Expert Syst. Appl., 142 (2020), 113005. https://doi.org/10.1016/j.eswa.2019.113005 doi: 10.1016/j.eswa.2019.113005
    [40] J. F. Díez-Pastor, J. J. Rodríguez, C. I. García-Osorio, L. I. Kuncheva, Diversity techniques improve the performance of the best imbalance learning ensembles, Inf. Sci., 325 (2015), 98–117. https://doi.org/10.1016/j.ins.2015.07.025 doi: 10.1016/j.ins.2015.07.025
    [41] Z. H. Zhou, J. Wu, W. Tang, Ensembling neural networks: many could be better than all, Artif. Intell., 137 (2002), 239–263. https://doi.org/10.1016/S0004-3702(02)00190-X doi: 10.1016/S0004-3702(02)00190-X
    [42] I. Triguero, S. González, J. M. Moyano, S. García, J. Alcalá-Fdez, J. Luengo, et al., KEEL 3.0: An open source software for multi-stage analysis in data mining, Int. J. Comput. Intell. Syst., 10 (2017), 1238–1249. https://doi.org/10.2991/ijcis.10.1.82 doi: 10.2991/ijcis.10.1.82
    [43] C. Blake, E. Keogh, C. J. Merz, UCI repository of machine learning databases, 1998. Available from: https://cir.nii.ac.jp/crid/1572543025422228096#citations_container.
    [44] L. Breiman, Bagging predictors, Mach. Learn., 24 (1996), 123–140. https://doi.org/10.1007/BF00058655 doi: 10.1007/BF00058655
    [45] R. E. Schapire, A brief introduction to boosting, in Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, (1999), 1401–1406. Available from: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=fa329f834e834108ccdc536db85ce368fee227ce.
    [46] L. Breiman, Random forests, Mach. Learn., 45 (2001), 5–32. https://doi.org/10.1023/A:1010933404324 doi: 10.1023/A:1010933404324
    [47] T. K. Ho, The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell., 20 (1998), 832–844. https://doi.org/10.1109/34.709601 doi: 10.1109/34.709601
    [48] S. A. Gilpin, D. M. Dunlavy, Relationships between accuracy and diversity in heterogeneous ensemble classifiers, 2009.
    [49] K. W. Hsu, J. Srivastava, Diversity in combinations of heterogeneous classifiers, in PAKDD 2009: Advances in Knowledge Discovery and Data Mining, (2009), 923–932. https://doi.org/10.1007/978-3-642-01307-2_97
    [50] R. M. O. Cruz, R. Sabourin, G. D. C. Cavalcanti, Dynamic classifier selection: Recent advances and perspectives, Inf. Fusion, 41 (2018), 195–216. https://doi.org/10.1016/j.inffus.2017.09.010 doi: 10.1016/j.inffus.2017.09.010
    [51] É. N. de Souza, S. Matwin, Extending adaboost to iteratively vary its base classifiers, in Canadian AI 2011: Advances in Artificial Intelligence, (2011), 384–389. https://doi.org/10.1007/978-3-642-21043-3_46
    [52] D. Whitley, A genetic algorithm tutorial, Stat. Comput., 4 (1994), 65–85. https://doi.org/10.1007/BF00175354
    [53] J. Demsar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res., 7 (2006), 1–30. Available from: https://www.jmlr.org/papers/volume7/demsar06a/demsar06a.pdf.
    [54] S. García, A. Fernández, J. Luengo, F. Herrera, Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power, Inf. Sci., 180 (2010), 2044–2064. https://doi.org/10.1016/j.ins.2009.12.010
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)