
This work describes the implementation of a data-driven approach for the reduction of the complexity of parametrical partial differential equations (PDEs) employing Proper Orthogonal Decomposition (POD) and Gaussian Process Regression (GPR). This approach is applied initially to a literature case, the simulation of the Stokes problem, and in the following to a real-world industrial problem, within a shape optimization pipeline for a naval engineering problem.
Citation: Giulio Ortali, Nicola Demo, Gianluigi Rozza. A Gaussian Process Regression approach within a data-driven POD framework for engineering problems in fluid dynamics[J]. Mathematics in Engineering, 2022, 4(3): 1-16. doi: 10.3934/mine.2022021
[1] | Stefania Fresca, Federico Fatone, Andrea Manzoni . Long-time prediction of nonlinear parametrized dynamical systems by deep learning-based reduced order models. Mathematics in Engineering, 2023, 5(6): 1-36. doi: 10.3934/mine.2023096 |
[2] | Francesca Tedeschi, Giulio G. Giusteri, Leonid Yelash, Mária Lukáčová-Medvid'ová . A multi-scale method for complex flows of non-Newtonian fluids. Mathematics in Engineering, 2022, 4(6): 1-22. doi: 10.3934/mine.2022050 |
[3] | Wenjing Liao, Mauro Maggioni, Stefano Vigogna . Multiscale regression on unknown manifolds. Mathematics in Engineering, 2022, 4(4): 1-25. doi: 10.3934/mine.2022028 |
[4] | Luigi C. Berselli, Traian Iliescu, Birgul Koc, Roger Lewandowski . Long-time Reynolds averaging of reduced order models for fluid flows: Preliminary results. Mathematics in Engineering, 2020, 2(1): 1-25. doi: 10.3934/mine.2020001 |
[5] | Siddhartha Mishra . A machine learning framework for data driven acceleration of computations of differential equations. Mathematics in Engineering, 2019, 1(1): 118-146. doi: 10.3934/Mine.2018.1.118 |
[6] | Zeljko Kereta, Valeriya Naumova . On an unsupervised method for parameter selection for the elastic net. Mathematics in Engineering, 2022, 4(6): 1-36. doi: 10.3934/mine.2022053 |
[7] | Andrea Lo Giudice, Roberto Nuca, Luigi Preziosi, Nicolas Coste . Wind-blown particulate transport: A review of computational fluid dynamics models. Mathematics in Engineering, 2019, 1(3): 508-547. doi: 10.3934/mine.2019.3.508 |
[8] | Benoît Perthame, Edouard Ribes, Delphine Salort . Career plans and wage structures: a mean field game approach. Mathematics in Engineering, 2019, 1(1): 38-54. doi: 10.3934/Mine.2018.1.38 |
[9] | Liliane Maia, Gabrielle Nornberg . Radial solutions for Hénon type fully nonlinear equations in annuli and exterior domains. Mathematics in Engineering, 2022, 4(6): 1-18. doi: 10.3934/mine.2022055 |
[10] | Ludovica Cicci, Stefania Fresca, Stefano Pagani, Andrea Manzoni, Alfio Quarteroni . Projection-based reduced order models for parameterized nonlinear time-dependent problems arising in cardiac mechanics. Mathematics in Engineering, 2023, 5(2): 1-38. doi: 10.3934/mine.2023026 |
This work describes the implementation of a data-driven approach for the reduction of the complexity of parametrical partial differential equations (PDEs) employing Proper Orthogonal Decomposition (POD) and Gaussian Process Regression (GPR). This approach is applied initially to a literature case, the simulation of the Stokes problem, and in the following to a real-world industrial problem, within a shape optimization pipeline for a naval engineering problem.
Reduced Order Modeling (ROM) offers a simplification for very complex computational models [25], keeping by high accuracy for the solution we want to compute but, at the same time, reducing the computational cost to reach it. In a parametric context, this means that we are able to approximate the solution for any new unforseen parameter in a very efficient and rapid way. Especially for complex phenomena of which numerical solutions are expensive to compute — e.g., solve a Partial Differential Equations (PDEs) —, ROM makes a huge difference in terms of the global computational load, allowing for a quick (or even real-time) response and for a multiple queries approach, e.g., an optimization scenario, otherwise not viable.
In the Reduced Basis (RB) [13] method, a set of truth solutions for some selected parameters are computed using a numerical solver over the full-order model. Once these snapshots are collected, we combine them to extract the basis which spans the low-dimensional space, thus we can exploit it by using the basis as global test functions in a Galerkin method. Thanks to the low dimensionality of the reduced order model, the related system results smaller and faster to be solved. Despite this advantage, RB meets some difficulties for applications in industrial field: the necessity to extract the discrete operators precludes the use of commercial software, as well as the effort required to derive the equations onto the reduced space discourages its exploitation.
Data-driven reduced order model are then gaining popularity to tackle industrial problems, mainly for its easiness of application and for the increasing quantity of available scientific data. Within the data-driven approach, it is possible avoiding to project equations and operators onto the reduced space — that can be, depending on the equations, very demanding and complex — and approximating the solution maninfold in the reduced space with an interpolation/regression. In this contribution, we coupled the Proper Orthogonal Decomposition (POD), a well-known technique for computation of reduced space basis [2], with the Gaussian Process Regression (GPR) for a probabilistic prediction of the new solutions. While the POD technique is pretty standard for projection based snapshots method in several fields of application — e.g., structural engineering [22], fluid dynamics [15,29,30], — the GPR has been adopted only in [12] for a structural problem. The POD is involved to reduce the large dimension of the input snapshots and the GPR is applied as a probabilistic response surface for the approximation of the parametric solution manifold. The novelty of this work is in fact the creation of a computational framework by combining model order reduction and probabilistic prediction, in order to increase the accuracy of the state-of-the-art (data-based) metodologies for a very limited amount of snapshots. This framework results then particularly suited for industrial setting, where simulations can last days or even weeks, making unfeasible the computation of large database of solutions. Similar methods have be proposed in the last year to data-based modeling of parametric problems, using interpolation to approximate the solution manifold. Among all contributions in literature, we cite [7,27,32] for a good overview. An alternative to interpolation or regression for the online prediction is constituted by the Artificial Neural Network (ANN), as described in [14]. Such approach seems very accurate, but previous works [21,36] shown that typically a large dataset is required to properly reconstruct the solution manifold. Since our effort is dedicated in this work to reduce as possible the number of high-fidelity snapshots, we do not investigate the latter methodology, postponing to future works the possibility to better explore it. Approaches that exploit the Active Subspaces (AS) property for the reduction of the number of required input snapshots are presented in [9,33].
The paper is structured as follows: the POD and GPR methods are described more in details in Section 2 and Section 3, respectively, while in Section 4 we describe their integration into a single framework. The numerical results are presented in Section 5 and they are subdivided as following: first we applied the POD-GPR framework to a simply toy problem, a Stokes flow passing around a cylinder, then we move to an industrial problem where turbulent biphase Navier Stokes flow is simulated around a parametric cruise ship geometry. Finally, we summarize some conclusion and further future perspectives in Section 6.
POD is a technique used in the context of PDEs aiming at reducing the computational cost required to obtain their solution numerically. In this section we introduce the reader to the general idea behind POD and give some inshights on the mathematical theory supporting it.
We now begin with an introduction to the class of Reduced Basis (RB), presenting POD in the following as a special case. Finally, we discuss a modification of the standard POD formulation, which is equation free and it will be the one used in this work.
Reduced Basis. We begin with a formal definition of a system of discretized parametric PDEs in the form:
finduN(μ)∈XNs.t.:a(uN(μ),w;μ)=F(w;μ)∀w∈XN, | (2.1) |
where μ∈P⊂Rp is the parameter, uN is the truth solution to our problem, a(⋅,⋅;μ) is the parametric bilinear form and XN is a finite dimensional space of dimension N.
We introduce the notion of the solution manifold, that is the set of all possible solutions of our parametric problem under the variation of the parameter:
MN={uN(μ)μ∈P}. | (2.2) |
The final goal of RB is to approximate any element of this manifold using a low number of basis functions, or modes, {χi(x)}Ni=1 by setting what we call the reduced basis. These N functions are globally defined over the computational domain and are obtained using some pre-computed truth solutions for particular parameter values.
The reduced solution uNN≈uN is hence composed by a suitable linear combination of these modes:
uNN(μ)=N∑i=1ξi(μ)χi(x), | (2.3) |
and the reduced formulation of the problem (2.1) becomes:
finduNN(μ)∈XNNs.t.:a(uNN(μ),w;μ)=F(w;μ)∀w∈XNN, | (2.4) |
where XNN=span({χi(x)}Ni=1).
Hence, while the degrees of freedom associated with the truth solution N are typically high, the degrees of freedom associated with the RB approximation are only N, where N≪N.
Proper Orthogonal Decomposition. POD refers to a particular RB which considers a hierarchical orthonormal basis generated using energetic considerations, exploting them as global basis function for a Galerkin framework [2]. The mathematical tool behind POD is Singular Value Decomposition (SVD), as we will briefly discuss now.
Let us begin with some notation. We will call a possible outcome of our system y a snapshot and denote with N its dimensionality (y∈RN). We will then denote with n the number of snapshots collected, and gather them in the snapshots' matrix Y∈RN×n:
Y=[∣∣∣y1y2…yn∣∣∣]. | (2.5) |
We then call N the number of POD modes, i.e., the dimension of the reduced basis, and assume N≤n<N.
A formal definition of the POD basis can be then given:
Theorem 1. (POD basis) Given Y∈RN×n, {χi}Ni=1, for N∈{1,..,n} is the POD basis of Y if and only if it is a solution of:
max~ψ1,~ψ2,..~ψNN∑i=1n∑j=1|<yj,~ψi>RN|2s.t.<~ψi,~ψi>RN=δi,j,for1≤i,j≤N. | (2.6) |
This can be read as: the POD basis is the one that maximizes the similarity (as measured by the square of the scalar product) between the snapshots matrix and its elements, under the constraint of orthonormality. In this sense, when we obtain the N-rank POD basis, we have the set of dimension N capable of optimally express the variance in the snapshots.
The link between POD and SVD is stated in following theorem, that can be proven using Lagrangian penalization techniques (see for example [35]):
Theorem 2. Given Y∈RN×n of rank d=n<N, its N-rank POD basis is given by the set of the first N left singular vectors {ψi}Ni=1.
As a last remark on POD, we highlight the fact that, as already shown, to compute the reduced basis we need a set of what can be called, using machine learning nomenclature, training data, that is a set of truth solutions for particular values of the parameters computed using the high fidelity solver. This is usually the most expensive phase in the ROM pipeline, and the applicability of ROMs techniques is often related to the size of the sampling needed to obtain certain performances.
Connected to this last aspect, we mention here another important feature of ROMs, that is the offline-online decomposition. By this, we mean that the use of RB techniques can be split, as we have already seen, in two phases:
● an offline-phase, in which a set of high fidelity solutions are collected and the reduced basis is computed by combining them;
● an online-phase, in which we project the problem onto the reduced space and the solutions for new parameters are computed in an efficient manner.
The first phase requires typically more computational time, having to rely on the high fidelity solver, and is usually executed in high performance computing clusters using parallel programming techniques. The second phase, however, is quite inexpensive from the computational point of view and could be done in low-end terminal or even tablet/smartphones, widening the field of applicability of this method.
Data-driven Proper Orthogonal Decomposition. Data-driven POD is an alternative to standard POD that is more used in the industrial setting because of its ease of use and its flexibility. We define here the data-driven POD as the class of methods that exploit POD for dimensional reduction and any non-intrusive approximation for the reconstruction of the solution method. It includes, for example, the POD with interpolation (PODI) procedure [7], as well as the other possible techniques already discussed in Section 1. Within this method, the first phase, the generation of the reduced basis, is performed in the same way as the standard approach. A significant difference, however, is that while in intrusive POD-Galerkin we need to rely upon an open-source software to compute the truth solutions (since in the online phase we will need to have access to the source code in order to project the equations). In the data-driven approach, this is not necessary, and we can use commercial software or even experimental data to train our model. The reason why in data-driven POD methods we have this freedom in the offline phase is that the online phase is completely different from the standard POD.
We recall here that the assumption of RB-ROMs is that the truth solution of our problem uN can be approximated by the reduced solution uNN composed by linear combination of spatial modes χi(x) multiplied by coefficients ξi(μ), that is:
uN(μ)≈uNN(μ)=N∑i=1ξi(μ)χi(x). | (2.7) |
In the data-driven POD, the online phase consists of building an interpolation, or a regression, using high-fidelity data coming from offline phase, approximating the function that associates the value of the parameter μ to the modal coefficients of the related solution {ξi(μ)}Ni=1. We can use the interpolator to infer the value of the coefficients associated with new parameters. The values of the coefficients are then used to reconstruct the approximated truth solution using (2.7).
This approach is entirely data-driven and is independent both on the equations and on the physics of the problem. This has its own advantages and disadvantages. The ease of use and the complete freedom in the generation of the snapshots, that are crucial in an industrial setting in which commercial software are widely spread, correspond to a lower accuracy associated with the reduced model.
We introduce in this section the Gaussian Process Regression (GPR), the supervised learning technique we adopt to approximate the modal coefficients. We provide in the following lines a summary of the method, in order to let the reader understand the entire pipeline, but avoiding to touch more details; for a complete description of such framework, we suggest [24].
A Gaussian Process (GP) is a stochastic process whose finite-dimensional distributions are Gaussian. In supervised learning, this process is usually adopted for regression manner, by constructing a stochastic model from a set of input data capable to predict quantities of interest. We define the set* D={(¯xi,¯yi)} for i=1,2,…,M as the set containing all the input-output pairs, where ¯xi∈P⊂Rm is the input parameter and ¯yi∈R is the corresponding output. We assume the output follows an unknown regression function f:P→R we want to approximate (we avoid in this simple overview to deal with noise), such that ¯yi=f(¯xi) for i=1,2,…,M. Thus, we define a GP such that:
* we use ¯xi and ¯yi here for input and output data, in order to distinguish, respectively, from spatial coordinates and snapshots.
f(x)∼GP(μ(x),K(x)), | (3.1) |
where μ(x) refers to the mean of the distribution and K(x) to its covariance, defined as Kij=K(¯xi,¯xj). The covariance function, also called kernel, K:P×P→R should be properly selected depending on the problem, for this work we use the squared esponential:
K(¯xi,¯xj)=σ2exp(−‖¯xi−¯xj‖22l). | (3.2) |
The prior joint Gaussian distribution for the outputs y results then†
† for notational simplicity, we exclusively use zero-mean priors.
y|X∼N(0,KXX), |
where y∈RM is the output vector and X∈Rm×M is the matrix whose columns are the input parameters. We specify that, for sake of compactness, we adopt a compact notation for the covariance matrices, such that KXX=(K(¯xi,¯xj)) with i,j=1,…,M, where ¯xi and ¯xj are the i-th and j-th columns of X.
We are now interested in predicting the output for new test input ˉX by exploiting the joint distribution based on the above prior, that is:
[f(X)f(ˉX)]∼N([00],[KXXKXˉXKˉXXKˉXˉX]). | (3.3) |
Now, the ouputs ˉy=f(ˉX) can be expressed in probabilistic terms by simply using the conditional Gaussian distribution:
ˉy|ˉX,X,y∼N(m,C),m=KˉXXK−1XXy,C=KˉXˉX−KˉXXK−1XXKXˉX. | (3.4) |
It is important to remark that, for a good result regarding the prediction, we need to optimize the hyperparameters within the covariance function. In this work, since a squared exponential kernel (Eq 3.2) has been adopted, we optimize the hyperparameters σ and l, respectively the variance and the lenghtscale, by maximizing the likelihood through a standard multi-start grandient-based optimization algorithm.
In this work, concerning the GPR, we use the Python open source package called GPy [11].
In this section we integrate the POD and GPR methods for a data-driven probabilistic reduced order modeling framework. The general idea is to compute the low-dimensional representation of the input snapshots through the POD, then infer their distribution in the parameters space in order to approximate the solution manifold.
The only required ingredients for the method are then the snapshots matrix Y=[y1…yn]∈RN×n and the corresponding parameters matrix X=[μ1…μn]∈Rm×n, where μi refers to the parametric configuration of the system for the snapshots yi for i=1,…,n. The parameter points are collected by randomly sampling the parameter space. As introduced in Section 2, we use the POD algorithm in order to obtain the modes and {reduce} the snapshots dimension such that Ξ=[ξ1…ξn]=UTY∈RN×n, where U=[χ1…χN] is the matrix whose columns are the POD modes. Simplifying, the columns of Ξ are the N-dimensional representation of the corresponding column of the snapshots matrix.
We can now exploit all this information by treating all the different components of the modal coefficients as indipendent scalar outputs. Formally, we define the i-th components of the j-th modal coefficient ξij, such that ξj=[ξ1j…ξNj]T with i=1,…,N and j=1,…,n. Analyzing the components independentely by considering each components as a scalar output we are able to construct a probabilistic response surface for each component, then exploiting it to approximate the modal coefficients for unseen parameters. We define N train ouputs Ξk=[ξk1…ξkn] for k=1,…,N, whereas the train and test parameters are defined X and ˉX, respectively. Using Eq 3.4 we can approximate the outputs — the modal coefficient — for any new parameters:
ξ∗=[ξ1∗…ξN∗]T=[ˉΞ1|ˉX,X,Ξ1…ˉΞN|ˉX,X,ΞN]T∈RN, | (4.1) |
where the ξ∗ indicates the test modal coefficient — with the j component that indicates the coefficient with respect to the j POD mode — and ˉΞk for k=1,…,N refers then to the predicted k coefficient. We remark here that any component of the modal coefficients has been treated in an independent way. We remark that in this work a response surface is constructed for each of the output components, treating these latter in an independent manner with N different GPs. The POD modes are finally exploited in order to return to the high-dimensional space and obtain the final approximation. In matrix form:
y∗=Uξ∗, | (4.2) |
where y∗∈XN is the solution (for the test parameter ˉX).
We present in this section the numerical results obtained by applying the described method to two different computational fluid dynamics simulations. To emphasize the versatility of the proposed framework with respect to the high-fidelity solver, we firstly apply it to a finite element (FE) problem then to a finite volume (FV) one. The coupling between the original model and the reduced one is in fact based only on data, resulting in a complete modular pipeline. In the next sections, we briefly introduce both the full-order models before providing a discussion about the reduction accuracy, even if in this case the original model has not a strong impact in the generation of the reduced order space, in order to give to the reader the possibility to reproduce the results.
The first numerical experiment where we use the POD-GPR framework is the simulation of a parametric Stokes flow passing around a circular cylinder. We define u(μ):Ω→R2 and p(μ):Ω→R the velocity and the pressure parametric fields, respectively, such that:
μ(μ0)Δu(μ)+∇p(μ)=0inΩ,divu(μ)=0inΩ,u(μ)=μ1inΓIN,u(μ)=0inΓS∪ΓN∪ΓCYL,p(μ)=0inΓOUT. | (5.1) |
As we can see from Eq 5.1, we consider two physical parameters μ=(μ0,μ1)∈P⊂R2 controlling the viscosity μ of the fluid and the velocity of the fluid at the inlet boundary ΓIN. The 2-dimensional domain Ω (sketched in Figure 1) is a rectangle with a circular hole, on which we impose on the physical walls (ΓS∪ΓN∪ΓCYL) no-slip boundary condition, on the inlet the parametric flow velocity μ1 and on the outlet a null pressure.
Starting from Eq 5.1, we can obtain the weak formulation of the Stokes equations by multiplying these equations to the test functions v∈[H10(Ω)]2=V and q∈L2(Ω)=Q:
a(u(μ),v;μ)+b(v,p(μ);μ)=0∀v∈V,b(u(μ),q;μ)=0∀q∈Q. | (5.2) |
where a(⋅,⋅;μ) and b(⋅,⋅;μ) are the parametric bilinear forms.
To solve the Stokes problem described above, we adopt a finite element (FE) approach. We define the finite dimensional spaces Vh=(Pk(T))2∩V and Qh=Pl(T)∩Q, where Pi(T)={v∈C(Ω):v|τ∈Pi,∀τ∈T} for k>1 indicates the piecewise polynomial space defined on the triangulation T of the domain Ω. For this work, we have chosen a second order polynomial for the velocity space and a first order one for the pressure space, i.e., (P2,P1). The use of the so-called Hood-Taylor scheme [31] is sufficient for ensuring the inf-sup condition; for the mathematical proof of the stability of such scheme, we refer to [3]. Moreover, the Reynolds number is very low (Re<1) for all the combinations of parameters, making possible to avoid additional stabilization to numerically solve the problem.
The discretized parametric Stokes problem reads as follow:
a(uh(μ),vh;μ)+b(vh,ph(μ);μ)=0∀vh∈Vh,b(uh(μ),qh;μ)=0∀qh∈Qh, | (5.3) |
where uh(μ)∈Vh and ph(μ)∈Qh are the finite dimensional unknowns we want to compute. The discretized problem has been solved using the FEniCS framework [1].
We initially create the high-fidelity database by computing the pressure and velocity fields for 50 samples, randomly generated in the parameter space P=[.1,10]×[1e−4,1e−6]. We then apply the described method to these snapshots, using initially the POD algorithm to reduce the dimensionality of the solutions, aka extract the modal coefficients.
In this case, we use 12 modes for the construction of the reduced space and apply a GP — we repeat that the used kernel is the squared exponential one — to the distribution of the modal coefficients. To test the accuracy, we create a test dataset containing 20 random samples selected in the parameter space, that of course are different to the input database. For all these test parameters, we approximate the reduced solutions and compute the realtive error between the reduced test solutions and the truth ones. Figure 2 reports the average relative error for all the analized fields in relation to the number of input snapshots, starting from only 2 snapshots. We specify that, using less than 12 snapshots, the number of POD modes is limited to the number of snapshots. As we can see, the error results very small with even few snapshots, especially for both the velocity components that are beyond the 5% with only 8 snapshots. The trend for the pressure error is less steep (we need 16 snapshots to reach an error of 5%), but it is able to reach at the end a precision of about 2.5%. A graphical visualization of the test solutions obtained using only 3 input samples is provided in Figure 3. We underline the error becomes pretty much constant using many snapshots, due to the Bayesian online phase, making the usage of GP during the online stage profitable especially in the cases where the offline stage has a limited computational budget.
The second numerical experiment we set up to test the POD-GPR framework is the simulation of the flow around the hull of a cruise ship. The reduction is then employed inside a shape optimization pipeline, enabling the use of a global optimization procedure on the design space.
We now proceed to discuss the main ingredients of the optimization pipeline in order to understand the role of the POD-GPR reduction. We avoid going into many technical details and refer the reader to [6,20] for a more extensive presentation of all the optimization pipeline.
The first step of the optimization pipeline consists of the generation of the parametric design space, where each possible shape is associated to a particular value of a pre-defined parameter. This is done by considering an initial undeformed configuration and applying to it a technique called Free Form Deformation (FFD) (for more details see [17,26,28]). FFD is a deformation technique in which the object to be deformed is initially put inside a lattice of points; then some of these points are moved, in a parametric way, and the space enclosed is deformed smoothly according to these motions.
In our case, we define the lattice shown in Figure 4. This lattice is enclosing the bottom-frontal part of the boat, being the part we want to deform, and is composed of a total of around 500 points, of which only a small part will be displaced.
In particular we define a set of movements connected to 6 independent parameters. The way in which these parameters are defined and the ranges they take values from are related to the kind of shapes we want to consider and the various constraints those shapes are subjected to. For example, the most simple and straightforward constraint we imposed is that of volume, i.e., we impose that the hulls must not have a total volume that reaches too small values. Another constraint we impose is that the deformations applied on the hull needs to connect continuously and smoothly to the undeformed parts of the boat. This latter constraint can be eimposed by keeping fixed some points in the lattice.
The FFD step is performed using PyGEM, an open-source Python library (see [23]).
The next step consists of the evaluation of the performance of the newly generated hull. This is done via numerical simulation of the flow of water and air around it. In particular, we consider a 3D biphase (water and air) incompressible flow. The physical model is based on the well known Incompressible Navier-Stokes equations, with modellation of turbulence based on a RANS approach (for more details see [5]), Finite Volume discretization (see [18]) and using the Volume of Fluid (VOF) methodology (see [16]) to take into account the biphase nature of the fluid.
More specifically, the equations considered are the following:
{∂(ρu)∂t+∇⋅(ρu⊗u)+∇p−ρg−∇⋅ν∇u−∇⋅R=0,∇⋅u=0,∂α∂t+∇⋅(uα)=0, | (5.4) |
where u, p denote the fluid velocity and pressure fields, ρ, ν and g denote respectively density, dynamic viscosity and gravity acceleration, R is the Reynolds stress tensor and α is the fraction of water.
The first equation in Eq (5.4) expresses the momentum balance in the infinitesimal volume and is essentially a reformulation of Newton'st 2nd law of dynamics expressing the Eulerian fluid's acceleration (∂u∂t) as a result of convection ((u⋅∇)u), a gradient of pressure (1ρ∇p) and diffusion (∇⋅ν∇u). The second equation instead expresses the mass conservation, emerging as solenoidality of the velocity field because of the incompressibility hypothesis.
We note that both the velocity and pressure fields in this equations represent only the mean part, decoupled from the fluctuating part using a RANS approach. All the effects of the fluctuating part are contained in the Reynolds stress tensor R, which will be in then modeled as a function of the mean velocity field u in order to close the equations.
The third equation is coming from the VOF modelling of the biphase flow.
Volume of fluid is a free-surface modeling technique that allows to describe a multiphase fluid composed of two incompressible, isothermal immiscible fluids (water and air, in our case). For a more in-depth discussion on VOF, we refer to [16].
This method is based on a phase-fraction technique: a scalar variable, denoted by α, is added to N-S equations, representing the fraction of water contained in the infinitesimal volume (or in the finite volume, when we will discretize the equations). This variable belongs to the interval [0,1], where α=1 represents a point in which water is present, α=0 a point in which air is present and α∈(0,1) represents interface points. By its nature, α is a discontinuous variable, and so the discretization method must ensure that the interface is captured a small number of cells.
In addition to the momentum and mass balance equations, we have two equations where the density ρ and the kinematic viscosity ν are defined using an algebraic formula expressing them as a convex combination of the corresponding water and air properties:
ρ=αρW+(1−α)ρA, | (5.5) |
ν=ανW+(1−α)νA. | (5.6) |
Finally, the third equation in the system (5.4) is the equation required to close the system after the addition of the variable α. In fact, this equation is simply a transport equation for the fraction of fluid in which only the convective term is considered.
As for the discretization and solution of the above equations, we used the implementation contained in the open-source CFD software OpenFOAM (see [19]).
The POD-GPR framework described above is then employed in order to reduce the computational time required for the resolution of the numerical problem associated to a new shape. Going into the details of the reduction, we consider as a snapshot, and hence as a solution, not the whole volumetric flow field but the the field of total resistance (viscous+pressure) over the hull.
The first step consists of the collection of the snapshots matrix, i.e., a set of solutions of the Full Order Model for selected parameter values. In particular we sampled the parameters space described above considering the vertices (26=64 points) and other 36 random points inside the domain, for a total of 100 snapsohts. Other 20 random snapshots have been generated for purpose of testing.
The extraction of the modes and approximation step have then been conducted using EZyRB [8], an open-source Python library for data-driven model order reduction. We now show some graphs reporting the relative L2 error between FOM and ROM:
(∫A(uN−uNN)2dA∫AuN2NdA)12. | (5.7) |
In particular, in Figure 5 (left), the error is expressed as a function of the number of modes used and as a function of the number of snapshots used. In both the figures we report in different colors the different techniques used in the approximation of the modal coefficients. We consider the GPR approach, the classical n-dimensional linear interpolation and two interpolation techniques based on radial basis functions (RBF), implemented using the Python package Scipy (see [34]). We remark we use the classical RBF interpolation method [4]. The former two methods differ one from the other for the value of one particular parameter that defines the method, called "smoothness", which quantifies how close the approximation technique is to interpolation, where a value of 0 correspond to a plain interpolation while higher values correspond to more smooth approximations. In particular, we considered values of the smoothness equal to 0 and 0.01.
We can see from Figure 5 that, using the GPR for the regression of the modal coefficients, we are able to use a very low number of snapshots obtaining an error which is comparable to the performances of different regression techniques for higher number of snapshots. This is really interesting since, as argued in the previous section, the offline phase, i.e. the generation of the snapshots, is indeed the most expensive from the computational point of view. However, when the number of snapshots increases, the performances of GPR and the other methods tend to converge to a same value.
The final step of the shape optimization pipeline consists on the exploration of the design space in order to find the optimum. To perform this, we used a evolutionary optimization algorithm, implemented inside DEAP, an open-source Python library for evolutionary optimization (see [10]).
From the point of view of the speed-up in the computational time obtained with the use of the POD-GPR framework, we mention that while in order to run the above mentioned optimization procedure using the FOM we would have needed around 140 weeks of CPU time, the use of the ROM approach restricted this time to around 5 weeks. Of these 5 weeks, the biggest partition is associated to the offline phase, i.e., the generation of the 100 training simulations, being the time associated to the evaluation of the online phase almost negligible. It is then clear that, in light of the considerations made regarding the error as a function of the number of snapshots for the POD-GPR framework, we could have augmented this speed-up of at least a factor 10 by generating fewer snapshots.
Finally we report, in Figure 6 the optimal shape found, which correspond to a reduction of drag of 3% from the initial hull. The figure represents sections of the hull on the longitudinal and transversal plane, in blue for the undeformed hull and red for the optimal deformed hull.
In this contribution, we applied the POD-GPR framework to two completely different parametric problems. To emphasize the equation-free nature of the algorithm, we demonstrate the applicability to a Stokes problem discretized using finite element method, then to a turbulent multiphase industrial problem discretized using finite volume method. In both cases, it emerges that the GPR method for the online prediction of the modal coefficients (for untested parameters) is able to make a prediction with a relative small error by using very few input snapshots. The advantage of such method is then maximazed when the offline phase results quite expensive due to the high-fidelity model.
The adoption of this method open several possibilities for future developments, by using the GPR not only for the prediction of new modal coefficients, but also for the quantification of the error, or for a greedy approach for new snapshots selection by exploiting the variance of the Gaussian distribution. Additionaly, the use of different kernels can be investigated.
We sincerely thank Prof. Claudio Canuto (Politecnico di Torino), Ing. Gianpiero Lavini and Ing. Gianluca Gustin (Fincantieri SpA) for their support.
This work was partially funded by the MIUR through the FARE-X-AROMA-CFD project, by the INdAM-GNCS 2019 project "Advanced intrusive and non- intrusive model order reduction techniques and applications" and by European Union Funding for Research and Innovation — Horizon 2020 Program — in the framework of European Research Council Executive Agency: H2020 ERC CoG 2015 AROMA-CFD project 681447 "Advanced Reduced Order Methods with Applications in Computational Fluid Dynamics" P.I. Gianluigi Rozza.
The authors declare no conflict of interest.
[1] | M. S. Alnæs, J. Blechta, J. Hake, A. Johansson, B. Kehlet, A. Logg, et al., The FEniCS project version 1.5, Archive of Numerical Software, 3 (2015), 9-23. |
[2] |
F. Ballarin, A. Manzoni, A. Quarteroni, G. Rozza, Supremizer stabilization of POD-Galerkin approximation of parametrized steady incompressible Navier-Stokes equations, Int. J. Numer. Meth. Eng., 102 (2015), 1136-1161. doi: 10.1002/nme.4772
![]() |
[3] | D. Boffi, F. Brezzi, M. Fortin, Finite elements for the stokes problem, In: Mixed finite elements, compatibility conditions, and applications, Springer, 2008, 45-100. |
[4] | M. D. Buhmann, Radial basis functions: theory and implementations, Cambridge University Press, 2003. |
[5] | P. Davidson, Turbulence: an introduction for scientists and engineers, Oxford University Press, 2015. |
[6] | N. Demo, G. Ortali, G. Gustin, G. Rozza, G. Lavini, An efficient computational framework for naval shape design and optimization problems by means of data-driven reduced order modeling techniques. Boll. Unione Mat. Ital., 14 (2021), 211-230. |
[7] | N. Demo, M. Tezzele, G. Gustin, G. Lavini, G. Rozza, Shape optimization by means of proper orthogonal decomposition and dynamic mode decomposition, In: Technology and science for the ships of the future: proceedings of NAV 2018: 19th international conference on ship & maritime research, IOS Press, 2018,212-219. |
[8] |
N. Demo, M. Tezzele, G. Rozza, EZyRB: Easy reduced basis method, JOSS, 3 (2018), 661. doi: 10.21105/joss.00661
![]() |
[9] |
N. Demo, M. Tezzele, G. Rozza, A non-intrusive approach for the reconstruction of POD modal coefficients through active subspaces, CR. Mécanique, 347 (2019), 873-881. doi: 10.1016/j.crme.2019.11.012
![]() |
[10] | F. A. Fortin, F. M. De Rainville, M. A. Gardner, M. Parizeau, C. Gagné, DEAP: Evolutionary algorithms made easy, J. Mach. Learn. Res., 13 (2012), 2171-2175. |
[11] | GPy, GPy: A Gaussian process framework in Python, 2012. Available from: http://github.com/SheffieldML/GPy. |
[12] |
M. Guo, J. S. Hesthaven, Reduced order modeling for nonlinear structural analysis using Gaussian process regression, Comput. Method. Appl. M., 341 (2018), 807-826. doi: 10.1016/j.cma.2018.07.017
![]() |
[13] | J. S. Hesthaven, G. Rozza, B. Stamm, Certified reduced basis methods for parametrized partial differential equations, 1 Eds., Switzerland: Springer, 2016. |
[14] |
J. S. Hesthaven, S. Ubbiali, Non-intrusive reduced order modeling of nonlinear problems using neural networks, J. Comput. Phys., 363 (2018), 55-78. doi: 10.1016/j.jcp.2018.02.037
![]() |
[15] |
S. Hijazi, G. Stabile, A. Mola, G. Rozza, Data-driven pod-galerkin reduced order model for turbulent flows, J. Comput. Phys., 416 (2020), 109513. doi: 10.1016/j.jcp.2020.109513
![]() |
[16] |
C. Hirt, B. Nichols, Volume of fluid (VOF) method for the dynamics of free boundaries, J. Comput. Phys., 39 (1981), 201-225. doi: 10.1016/0021-9991(81)90145-5
![]() |
[17] | A. Koshakji, A. Quarteroni, G. Rozza, Free form deformation techniques applied to 3D shape optimization problems, CAIM, 4 (2013), 1-26. |
[18] | F. Moukalled, L. Mangani, M. Darwish, The finite volume method in computational fluid dynamics: an advanced introduction with OpenFOAM and Matlab, Cham: Springer, 2015. |
[19] | OpenCFD, OpenFOAM - The Open Source CFD Toolbox - User's Guide, 2018. Available from: https://www.openfoam.com/documentation/user-guide. |
[20] | G. Ortali, A data-driven reduced order optimization approach for Cruise ship design, Master's thesis, Politecnico di Torino, 2019. |
[21] | F. Pichi, F. Ballarin, G. Rozza, J. S. Hesthaven, Artificial neural network for bifurcating phenomena modelled by nonlinear parametrized PDEs, PAMM, 20 (2021), e202000350. |
[22] |
F. Pichi, G. Rozza, Reduced basis approaches for parametrized bifurcation problems held by non-linear von kármán equations, J. Sci. Comput., 81 (2019), 112-135. doi: 10.1007/s10915-019-01003-3
![]() |
[23] | PyGeM, PyGeM: Python geometrical morphing. Available from: https://github.com/mathLab/PyGeM. |
[24] | J. Quiñonero-Candela, C. E. Rasmussen, A unifying view of sparse approximate gaussian process regression, J. Mach. Learn. Res., 6 (2005), 1939-1959. |
[25] | G. Rozza, M. H. Malik, N. Demo, M. Tezzele, M. Girfoglio, G. Stabile, et al., Advances in reduced order methods for parametric industrial problems in computational fluid dynamics, In: Proceedings of the ECCOMAS Congress 2018, 2018. |
[26] |
F. Salmoiraghi, F. Ballarin, L. Heltai, G. Rozza, Isogeometric analysis-based reduced order modelling for incompressible linear viscous flows in parametrized shapes, Adv. Model. and Simul. in Eng. Sci., 3 (2016), 21. doi: 10.1186/s40323-016-0076-6
![]() |
[27] |
F. Salmoiraghi, A. Scardigli, H. Telib, G. Rozza, Free-form deformation, mesh morphing and reduced-order methods: enablers for efficient aerodynamic shape optimisation, Int. J. Comput. Fluid Dyn., 32 (2018), 233-247. doi: 10.1080/10618562.2018.1514115
![]() |
[28] | T. W. Sederberg, S. R. Parry, Free-form deformation of solid geometric models, In: ACM SIGGRAPH Computer Graphics, 1986,151-160. |
[29] |
G. Stabile, G. Rozza, Finite volume POD-Galerkin stabilised reduced order methods for the
parametrised incompressible Navier-Stokes equations, Comput. Fluids, 173 (2018), 273-284. doi: 10.1016/j.compfluid.2018.01.035
![]() |
[30] | G. Stabile, M. Zancanaro, G. Rozza, Efficient Geometrical parametrization for finite-volume based reduced order methods. Int. J. Numer. Meth. Eng., 121 (2020), 2655-2682. |
[31] |
C. Taylor, P. Hood, A numerical solution of the navier-stokes equations using the finite element technique, Comput. Fluids, 1 (1973), 73-100. doi: 10.1016/0045-7930(73)90027-3
![]() |
[32] | M. Tezzele, N. Demo, G. Rozza, Shape optimization through proper orthogonal decomposition with interpolation and dynamic mode decomposition enhanced by active subspaces, In: Proceedings of MARINE 2019: VIII International Conference on Computational Methods in Marine Engineering, 2019,122-133. |
[33] |
M. Tezzele, N. Demo, G. Stabile, A. Mola, G. Rozza, Enhancing cfd predictions in shape design problems by model and parameter space reduction, Adv. Model. and Simul. in Eng. Sci., 7 (2020), 40. doi: 10.1186/s40323-020-00177-y
![]() |
[34] |
P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, et al., SciPy 1.0: Fundamental algorithms for scientific computing in Python, Nat. Methods, 17 (2020), 261-272. doi: 10.1038/s41592-019-0686-2
![]() |
[35] | S. Volkwein, Proper orthogonal decomposition: theory and reduced-order modelling, 2012. |
[36] |
Q. Wang, J. S. Hesthaven, D. Ray, Non-intrusive reduced order modeling of unsteady flows using artificial neural networks with application to a combustion problem, J. Comput. Phys., 384 (2019), 289-307. doi: 10.1016/j.jcp.2019.01.031
![]() |
1. | J.G. Hoffer, B.C. Geiger, R. Kern, Solving multi-objective inverse problems of chained manufacturing processes, 2023, 40, 17555817, 213, 10.1016/j.cirpj.2022.11.007 | |
2. | Yiqian Mao, Shan Zhong, Hujun Yin, Active flow control using deep reinforcement learning with time delays in Markov decision process and autoregressive policy, 2022, 34, 1070-6631, 053602, 10.1063/5.0086871 | |
3. | Johannes G. Hoffer, Bernhard C. Geiger, Roman Kern, Gaussian Process Surrogates for Modeling Uncertainties in a Use Case of Forging Superalloys, 2022, 12, 2076-3417, 1089, 10.3390/app12031089 | |
4. | Marco Tezzele, Lorenzo Fabris, Matteo Sidari, Mauro Sicchiero, Gianluigi Rozza, A multifidelity approach coupling parameter space reduction and nonintrusive POD with application to structural optimization of passenger ship hulls, 2023, 124, 0029-5981, 1193, 10.1002/nme.7159 | |
5. | Abdolvahhab Rostamijavanani, Shanwu Li, Yongchao Yang, Physics-constrained deep learning of nonlinear normal modes of spatiotemporal fluid flow dynamics, 2022, 34, 1070-6631, 127121, 10.1063/5.0124455 | |
6. | Helin Gong, Tao Zhu, Zhang Chen, Yaping Wan, Qing Li, Parameter identification and state estimation for nuclear reactor operation digital twin, 2023, 180, 03064549, 109497, 10.1016/j.anucene.2022.109497 | |
7. | B. X. Nony, M. C. Rochoux, T. Jaravel, D. Lucor, Reduced-order modeling for parameterized large-eddy simulations of atmospheric pollutant dispersion, 2023, 1436-3240, 10.1007/s00477-023-02383-7 | |
8. | Nicola Demo, Giulio Ortali, Gianluca Gustin, Gianluigi Rozza, Gianpiero Lavini, An efficient computational framework for naval shape design and optimization problems by means of data-driven reduced order modeling techniques, 2021, 14, 1972-6724, 211, 10.1007/s40574-020-00263-4 | |
9. | Teeratorn Kadeethum, Francesco Ballarin, Nikolaos Bouklas, Data-driven reduced order modeling of poroelasticity of heterogeneous media based on a discontinuous Galerkin approximation, 2021, 12, 1869-2672, 10.1007/s13137-021-00180-4 | |
10. | Michaela Reck, Marc Hilbert, René Hilhorst, Thomas Indinger, 2023, 1, 0148-7191, 10.4271/2023-01-0862 | |
11. | Ngoc Cuong Nguyen, 2024, 58, 9780443294488, 149, 10.1016/bs.aams.2024.03.005 | |
12. | Kazuo Yonekura, Hitoshi Hattori, Shohei Shikada, Kohei Maruyama, Turbine blade optimization considering smoothness of the Mach number using deep reinforcement learning, 2023, 642, 00200255, 119066, 10.1016/j.ins.2023.119066 | |
13. | R. Amaduzzi, A. Procacci, A. Piscopo, R. Malpica Galassi, A. Parente, Effect of parametric uncertainty in numerical simulations of a hydrogen-fueled flameless combustion furnace using dimensionality reduction and non-linear regression, 2024, 40, 15407489, 105551, 10.1016/j.proci.2024.105551 | |
14. | G.I. Drakoulas, T.V. Gortsas, G.C. Bourantas, V.N. Burganos, D. Polyzos, FastSVD-ML–ROM: A reduced-order modeling framework based on machine learning for real-time applications, 2023, 414, 00457825, 116155, 10.1016/j.cma.2023.116155 | |
15. | Moaad Khamlich, Giovanni Stabile, Gianluigi Rozza, László Környei, Zoltán Horváth, A physics-based reduced order model for urban air pollution prediction, 2023, 417, 00457825, 116416, 10.1016/j.cma.2023.116416 | |
16. | Hayriye Pehlivan Solak, Jeroen Wackers, Riccardo Pellegrini, Andrea Serani, Matteo Diez, Paolo Perali, Matthieu Sacher, Jean-Baptiste Leroux, Benoît Augier, Frédéric Hauville, Patrick Bot, Optimising hydrofoils using automated multi-fidelity surrogate models, 2024, 1744-5302, 1, 10.1080/17445302.2024.2422518 | |
17. | Nicola Demo, Marco Tezzele, Andrea Mola, Gianluigi Rozza, Hull Shape Design Optimization with Parameter Space and Model Reductions, and Self-Learning Mesh Morphing, 2021, 9, 2077-1312, 185, 10.3390/jmse9020185 | |
18. | Stefano Riva, Carolina Introini, Antonio Cammi, Multi-physics model bias correction with data-driven reduced order techniques: Application to nuclear case studies, 2024, 135, 0307904X, 243, 10.1016/j.apm.2024.06.040 | |
19. | M. Lo Verso, S. Riva, C. Introini, E. Cervi, F. Giacobbo, L. Savoldi, M. Di Prinzio, M. Caramello, L. Barucca, A. Cammi, Application of a non-intrusive reduced order modeling approach to magnetohydrodynamics, 2024, 36, 1070-6631, 10.1063/5.0230708 | |
20. | Nicola Cavallini, Riccardo Ferretti, Gunnar Bostrom, Stephen Croft, Aurora Fassi, Giovanni Mercurio, Stefan Nonneman, Andrea Favalli, Vanquishing the computational cost of passive gamma emission tomography simulations leveraging physics-aware reduced order modeling, 2023, 13, 2045-2322, 10.1038/s41598-023-41220-3 | |
21. | Ahmad Lawal, Yingjie Yang, Nathanael L. Baisa, Hongmei He, 2024, A Novel Framework for Reservoir Permeability Prediction Using GPR with Grey Relational Grades and Uncertainty Quantification, 979-8-3503-5089-0, 404, 10.1109/PRAI62207.2024.10827278 | |
22. | Avishek Mukherjee, Surjya Kanta Pal, Debashish Chakravarty, A high-speed numerical simulation method for diverse boundary conditions for real time applications unleashing MeshGraphNet, 2025, 175, 09557997, 106204, 10.1016/j.enganabound.2025.106204 | |
23. | Moaad Khamlich, Federico Pichi, Michele Girfoglio, Annalisa Quaini, Gianluigi Rozza, Optimal transport-based displacement interpolation with data augmentation for reduced order modeling of nonlinear dynamical systems, 2025, 531, 00219991, 113938, 10.1016/j.jcp.2025.113938 | |
24. | Matteo Zorzetto, Riccardo Torchio, Francesco Lucchini, Michele Forzan, Fabrizio Dughiero, Proper Orthogonal Decomposition for Parameterized Macromodeling of a Longitudinal Electromagnetic Levitator, 2025, 61, 0018-9464, 1, 10.1109/TMAG.2025.3542131 |