1.
Introduction
Cancer is an unexpected behavior of normal cells due to some biological reasons changing the fundamental and structural progress of normal cells. Abnormality in these cells form a lump known as tumor, and invasion causes degradation in extracellular matrix. Avascular, vascular and metastatic stages of cancer development are known. Avascular stage is the most interested stage in numerical approaches due to the simplicity of the model, easiness in validation with experimental results, and the first milestone to understand the next stages. Mathematical models for cancer invasion are developed to describe the evolution of tumor in a less time with less cost comparing to laboratory setups.
In literature, there are plenty of studies in the last decades on mathematical models and numerical observations on cancer cell (CC) invasion of tissue. Jackson et al. examined the bio-distribution, pharmacokinetics, and localization properties of monoclonal antibody (mAb)-enzyme conjugates in cancer tissue simulating a mathematical model which is numerically solved by the finite difference method (FDM) [1]. Anderson and Chaplain [2] studied on continuous and discrete mathematical models considering solid tumors and macromolecule fibronectin. Anderson et al. [3] performed the method of lines and FDM to investigate behavior of tumor cells, host tissue (extracellular matrix (ECM)), and tumor cell-associated matrix-degrading enzymes. Sherratt and Chaplain [4] examined avascular tumor growth utilizing wave front solutions in a novel model incorporating a general nutrient factor together with continuum densities of proliferative, dormant, and necrotic cells. Matzavinos and Chaplain [5] analyzed reaction-diffusion-chemotaxis equations of the development of a solid tumor in the context of an immune response using the traveling-wave method. In that study, a bifurcation analysis of ordinary differential equations (ODE) kinetics of the considered system is also made. A mathematical model that emphasizes the function of the plasminogen activation system in the invasion of CCs into tissue (ECM) is presented in [6]. The governing partial differential equations (PDEs) are numerically solved using method of lines and Gear's method from numerical algorithm group's (NAG) numerical library. Chaplain and Lolas [7] solved a mathematical model of CCs described by reaction-diffusion-taxis PDEs, which are the combination of the interactions between CCs, ECM, urokinase plasminogen activator (uPA), and using backward difference and FDM. The fundamental biological processes of cell-to-cell and cell-to-matrix adhesion are explicitly included in the unique continuous model by Gerish and Chaplain in [8]. They used the second order central FDM for the second derivative in the case of constant diffusion coefficients. In [9], mathematical models and techniques for analyzing and simulating the dynamics of tumors in opposition to the immune system is reviewed. Optimization of therapeutic actions is also included in this work. The typical scales of the phenomena are determined, and each scale's models and associated issues are studied and critically analyzed in the mathematical literature. Enderling et al. [10] presented a mathematical model for the cancer invasion of a solid tumor into breast tissue. An explicit FDM is used for numerical computation of the governing dimensionless equations. They also propose a model for surgical removal and radiation therapy. Andasari et al. [11] analyzed a mathematical model of the uPA system, emphasizing the function of CCs in spreading into tissue or the ECM. Dehghan and Mohammadi [12] used two numerical meshless approaches, multiquadric radial basis functions (RBFs) and generalized moving least squares (GMLS), to solve the four-species tumor-growth model in two and three dimensions. They concerned a constant mobility tumor-growth model and a model of tumor growth with varying mobility. A meshless approach is also utilized by Dehghan and Narimani in [13] to simulate time-dependent reaction-diffusion-taxis PDEs describing the interactions between CCs, ECM, and matrix degradation enzymes. Meral et al. [14] applied a hybrid numerical approach based on the finite difference scheme to observe the CC invasion. In that study, an iterative process is also used to demonstrate the local existence and uniqueness of the CC invasion concept theoretically. Chemotaxis-haptotaxis model is also simulated applying the dual reciprocity boundary element method (DRBEM) and FDM jointly in [15]. Nyarko et al. [16] takes into account a lag time between the microscopic and macroscopic level in their mathematical model. Numerical outcomes are obtained by implicit-explicit FDM. Franssen [17] investigated the mathematical model for metastatic cancer invasion utilizing a hybrid numerical approach in which a five point finite difference scheme is used. Hatami et al. [18] developed a solution for the chemotaxis-haptotaxis model of cancer invasion by using the new homotopy perturbation method. The obtained solution is compared with laboratory's results and a good agreement is observed. Tao and Cui [19] made a theoretical study proving the global existence of a unique solution of the chemotaxis-haptotaxis model of cancer tissue invasion by considering a priori estimate technique and logistic damping. Tao and Winkler [20] also studied the proof of the global solution of the same model. Amoddeo [21] solved six coupled PDEs of CC invasion into tissue considering an oxygen source term in governing equations. Numerical simulations are done by moving mesh PDE using the finite element method (FEM). It is reported that the presence of oxygen in tissue accelerates the CC proliferation. In [22], the same method is again used to simulate the nonlinear PDE system based on the uPA system. The same PDE system of equations are again concerned in [23] by FEM. Amoddeo also added an electric field term to the CC density equation in [24]. Ganesan et al. [25] numerically studied the cancer invasion model. They used FEM with the Crank-Nicolson scheme in time treating nonlinear terms semi-implicitly. Three dimensional FEM simulation is also carried out in [26]. Meral and Surulescu [27] made a study in both numerical and theoretical aspects proposing a model in the presence of heat shock protein effect. Numerical results are found by FDM while a local weak solution with the help of a fixed point idea is proved. In [28], a hyperbolic reaction-diffusion model for chemotaxis in accordance with the key concepts of extended thermodynamics are suggested, and linear stability analysis, Turing bifurcation, and traveling wave solutions are also considered.
A mathematical model that explains the switch from the mesenchymal-like cells' (MCs) individual invasion strategy to the epithelial like cells' (ECs) invasion strategy is developed by Sfakianakis et al. [29]. In that study, the governing equations of the considered multi-scale and hybrid model are made up of PDEs and stochastic differential equations that characterize the evolution of the ECs and the MCs considering transitions. They reported that according to traveling wave analysis, the dynamics of cell invasion are primarily influenced by velocity and growth rate, and the tumor enlarges to a dormant level. Urdal et al. [30] observed the interaction of tumor cells and fibroblasts in the presence of fluid flow and concluded the enhancement in the invasion of tumor cells. In Los et al. [31], three dimensional tumor growth simulation is made by FEM based on isogeometric L2 projection and implemented in a parallel solver. Benito et al. [32] concerned chemotaxis-haptotaxis system in view of local stability of the constant equilibrium solution. They utilized generalized FDM and proved the convergence of the discrete solution to the analytical solution. Generalized FDM is also performed for solving the cancer invasion model involving nutrient density, and a convergence analysis is also made in [33]. In [34], a theoretical study is presented to show that the haptotaxis effect can be neglected in the chemotaxis-haptotaxis model. One more theoretical analysis is encountered in Shen et al. [35].
Some biological studies may also be mentioned. He et al.[36] and Melzer et al.[37] discussed the role of stromal fibroblasts in activating the uPA-plasminogen-matrix metalloproteinase-2 (MMP-2) cascade and regulating the invasive behaviors of pancreatic and breast CCs, respectively. They found that a direct interaction between cancer cells and neighboring cells in the microenvironment is required for activating the cascade. Huang et al. [38] and Henke et al. [39] highlighted the importance of ECM stiffness and its disregulation in cancer progression. They suggest that targeting ECM components could be a promising therapeutic approach to manage cancer. Shimpi et al. [40] discussed how compositional and physical changes of the ECM contribute to tumor heterogeneity, and engineered model systems that can recapitulate both cellular and ECM heterogeneity are critical to elucidate the mechanisms through which ECM characteristics and different cellular states are linked. Dass et al. [41] and Pakneshan et al. [42] focused on the uPA system and its involvement in tumor cell invasion and metastasis. Pickup et al. [43] highlighted the essential need for matrix stiffness to drive many tumor-promoting effects of the ECM, and suggests that it is essential to determine whether this ECM property is a correlative phenotype to tumor progression or a causative factor driving tumor initiation. Holle et al. [44] discussed the poor characterization of the adhesion-mediated signaling processes between malignant cells and the ECM and suggests that simple, low-cost, label-free, image-analysis-based characterization of adhesion signatures may play a role in clinical diagnostics.
In the aforementioned studies, method of lines, FDM, DRBEM, FEM, and the meshless method are encountered. In the current study, the RBF time-space method is first taken into account in these equations. To begin, the time-space global RBF method is applied on dimensionless governing equations of CC invasion of tissue. Then, mean CC, mean ECM and mean uPA are modeled by machine learning approaches based on multilayer neural network and multivariate adaptive regression splines. The train and test datas for modeling are built from numerical results. To the best of authors' knowledge, the machine learning approach is the first embedded into this type of problem.
2.
Model equations
Interaction between CC density, normal cell density (ECM), and the concentration of the matrix degrading enzyme (uPA) is described by the following non-dimensional equations as [7]
where c is the CC density, v is the ECM density, and u is the uPA density. Basically, ECM is a structure in which cells move, grow, communicate with each other, and function. uPA is a plasminogen activator called urokinase. uPA converts the plasminogen into its active form plasmin (an essential enzyme in blood). uPA is a serine protease, which is an enzyme breaking peptid bonds of proteins. Chemotaxis is the movement of an organism reacting to a chemical stimulation. Haptotaxis is the movement of CCs from a low concentration region to a high concentration region as a result of their adhesive behavior in ECM. Proliferation refers to the the rapid rise in cell division and reproduction. Proteolysis is a process in which proteins break into small pieces.
Further, some constants in Eq (2.1) are given in Table 1.
These parameter ranges and estimations are considered as stated in [7].
Initial conditions are generated as [7]
where ϵ is taken as 0.01.
Normal gradient of all unknowns c,u are zero on the boundary.
3.
Numerical procedure
In the global RBF method, the solution is obtained in a dense system matrix. It is based on the usage of radial basis functions depending on radial distance between nodes. The differentiation matrices are obtained by the coordinate matrix formed by the chosen RBF and the matrices formed by the derivative of the chosen RBF [45,46,47]. In the current study, RBF is chosen as polyharmonic cubic spline RBF, f=r7, because this RBF does not depend on a shape parameter.
The time-space usage of RBF works block-wise in time as drawn in Figure 1.
In this configuration, x is kept in interval [0,1], and tmax is a fixed time value. bl denotes a time block and numbl is the number of blocks.
Let N be the number of nodes on the x-axis and let L be the number of time values in a block. tmax is divided into equally spaced time blocks, and Δbl is the time block increment calculated by
considering a time increment Δt in a block.
Inside a time block, the L number of Gauss-Chebyshev-Lobatto (GCL) time nodes are adopted. Note that Δt is only used for getting the uniform time block increment. Δt is not used in a time block. Instead, we have GCL time values as a time block. Each block behaves as a two dimensional geometry.
As is shown in the first time block tbl1, bi,bl, and br are boundary conditions in any block. bl and br correspond to the zero gradient boundaries in related equations. bi is the initial time at tbl1 and it is already given in the first block, but in other blocks, bi is settled as the found values on bt.
In each block, triplet Eq (2.1) is solved iteratively until a termination criterion
is satisfied.
Global RBF approximates an unknown φ as
where ˉαj's are initially unknown coefficients, f's are approximating functions formed by RBFs depending on radial distance
in which x is the field point, xj is the collocation point, and N is the number of nodes on the x-axis.
In matrix-vector form, Eq (3.2) may also be written as
where the matrix F is formed by fj's column-wise, and ˉα is the initially unknown vector.
The first and second order space derivatives of φ are derived by using F and Eq (3.3) as
So, letting
and
and using backward differentiation formula of order 2 (BDF2) in time, the iterative solution proceeds in a time block.
Further, the differentiation matrix in time is also used. The coordinate matrix Ft is found first by using time values in the current time block. Then, D′t will be
The system as Ax=b is constructed by Kronecker products [48] of these differentiation matrices. That is, the differentiation matrices for the x- and t- derivatives, respectively, used in a time block are
where kron is the Kronecker tensor product and eye(∗) denotes the identity matrix of size ∗.
After getting the differentiation matrices, the dimensionless nonlinear governing equations are iteratively solved in a time block as
where subindex d denotes the diagonal which is necessary for defined products.
4.
Machine learning approaches
In this part, a short description of the machine learning techniques, multilayer neural network (NN), and multivariate adaptive regression splines, is given.
4.1. Multilayer NN
In multilayer NN, the layers start with an input layer and end with an output layer. The layers between the input and the output one are called as hidden layers. The feedforward NN works in one direction from the input layer to the output layer. That is, flow of information goes from the previous layer to the next one. Each input data in this flow is multiplied by a weight, and a bias is added. Then, these are summed, and affected by an activation function to get an output. Weights are randomly attained, to start. An update on weights is performed based on the minimization of a cost function (or loss function) by the gradient descent method.
Mathematically, an equation between layers may be written as [49]
where ˉn is the layer number, ℓ refers to the data in layer ˉn, W is the weight matrix, b is the bias vector involving intercept terms, and f is the activation function chosen as
in the current study. Note that ˉn=0 and ˉn=4 correspond to the input and the output layers.
It is worth it to mention here that this activation function is the same basis function defined in multivariate adaptive regression splines (Mars) modeling [50]. In the current study, the trilayer neural network (TNN) based on the feedforward NN in Matlab is utilized.
4.2. Mars
In Mars, a function f is approximated by [50]
where mb is the number of basis functions, ci's are unknown coefficients, and bi's are basis functions defined by
where zi's are called knots coming from the dataset.
This model function is set up as [50,51]
where nc is the number of coefficients, I=c0 is the intercept with D=0 and K=0, nin is the number of inputs, D is the directions ±1, and K is the cuts (knots). The functions h and sign are defined as
Two stages work in Mars implementation as forward and backward stages. An algorithm based on fast search finds basis functions added to the model in forward step. This results in overfitted dataset. An overfit model is pruned in the backward step.
Mars is implemented in R-project importing "earth" library. Inside earth, some parameter options are chosen. The option "nk" is the maximum number of terms generated by the forward stage, "nprune" is the maximum number of terms generated by the backward stage, "fast.k" is the maximum number of parent terms considered at each step of the forward pass, "thresh" is the stopping tolerance for forward step, and "degree" is the maximum number of interactions.
5.
Discussions on numerical and statistical results
5.1. Numerical observations
In this part, some numerical results in some parameter variations are presented. In visualization results, since the dimensional time scale is defined as 10−4 [7], time values in reality of figures are calculated by ((t/10−4)/3600)/24. For example, t=9 means that it is approximately 1 day.
5.1.1. The choice of N and Δt
In Table 2, absolute errors between mean CC density (mean CC) and mean uPA (mean u) values at t=90 at consecutive N and Δt values are presented when the parameters are fixed at Dc=Du=0.001,χc=0.03,ξc=0.05,μ1=0.05,μ2=0.15,α=0.05,β=0.15,δ=10. According to these results in this table, Δt=0.0125 and N=101 are used at all calculations.
5.2. A validation and some numerical results
First, a comparison with [7] in different time values is illustrated in Figure 2. As can be noted in Figure 2, present results are in good agreement with reference's results.
In Table 3, some numerically observed cases are listed. Figures 3–5 also present x versus CC, ECM, and uPA density plots in these cases in some chosen time values. In these figures, black plots denote CC, red plots show ECM, and blue plots display uPA densities.
In Figure 3, in Case 1, peak plots of CC occur in the case of larger uPA values in each figure. That is, the more uPA occurs, the more CC invades. In Case 2, the diffusion coefficient of uPA density is increased. Comparing mean CC values with Case 1, the rise in mean CC values after t=18 is obviously seen as expected. That is, if uPA diffuses faster, CCs occupy fast inside ECM. At t=180(≈20 days), ECM density almost vanishes. In Case 3, the diffusion coefficient of CC density is decreased compared to Case 2. By the decline in CC diffusion, mean CC values are declined in Case 3. Also, more flattened behavior of CC plots occurs at t=90 and t=180 comparing to Case 2.
In Case 4, an increased value of χc comparing to Case 1 is concerned. As is noted, CCs move slowly to the right side at t=9(≈1 day) comparing to the behavior in Case 1, and also mean CC values are smaller. Although a similar attitude is seen at t=18, CCs remain under the ECM in case of a larger chemotaxis coefficient. After 10.5(t=90) or 20(t=180) days, mean CC is reduced comparing to mean CC values obtained in Case 1.
A decreased value of χc constant in Case 5 in Figure 4, at t=90 and t=180, the system appears to be in a pause state, with little movement observed. This may be due to the absence of ECM at t=90 and t=180, which is vital for tumor cell migration. In other words, small chemotaxis inside tissue causes CC invasion very quick after t=18(≈2 days), and mean CC values reach to a saturated value without changing anymore after 20 days as is noted in Case 5. In Case 6, haptotaxis coefficient ξc is reduced to 0.002. Comparing to Case 1, the right side peak at CC in Case 1 is not observed in Case 6 at t=9(≈ 1 day). ECM and uPA also becomes almost zero at t=180. Case 7 demonstrates the influence of a larger value of μ1 comparing to Case 1. Proliferation rate of CCs rises with the rise in μ1. As is noted from mean CC values, too, mean CC density boosts as μ1 is increased comparing to Case 1. Furthermore, wavy peaks of CC plots occur at the same wavy peaks of uPA plots. That is, at t=90 and t=180, as the leading group of CCs invades further into the domain, a new group of cells is formed just behind them due to the increased CC proliferation caused by uPA-mediated signaling pathways. Case 8 depicts the impact of a larger μ2 value comparing to Case 1. The larger μ2 is, the larger rate of proliferation in ECM occurs. At t=90 and t=180, mean CC values are smaller than the values obtained in Case 1. This points to the existence of the larger proliferated ECM in the cell.
In Figure 5, Case 9 portrays the influence of a larger growth rate of uPA comparing to Case 1. The growth rate of CC density inside tissue is slower than Case 1 as is compared with mean CC values. A little bit sharper uPA peaks are also noticed at t=90 and t=180. Case 10 reveals the influence of the larger decay rate of ECM comparing to Case 1. Since the decay rate is larger, uPA becomes zero in Case 10 faster than Case 1. ECM dominates over CCs at t=9 and t=18. So, mean CC values are smaller than Case 1. However, since the ECM becomes almost zero, CC dominates over ECM at t=90 and a peaky CC at t=180 emerges noting that the larger mean CC values at t=90 and t=180 than Case 1. Case 11 reports the impact of a smaller value of degradation rate for ECM comparing to Case 1. That is, ECM degrades slower than δ=10 now. Therefore, domination of ECM over CC continues up to t=90 while ECM degrades faster in case of δ=10 in Case 1. Mean CC values are smaller until t=90 than Case 1, but then at t=180, it boosts comparing to Case 1.
In addition to the mean CC values given in each figure, other mean values in each case at selected time values are also presented in Table 4. Some comments on this table may be done as follows:
• The rise in diffusion coefficient of uPA (Case 2) is reflected by the larger values of mean uPA at t=90 and t=180 comparing to Case 1.
• The reduction in diffusion coefficient of CC density (Case 3) and mean ECM is a little bit larger as expected, and mean uPA is smaller in Case 3 comparing to Case 2.
• Checking Cases 4 and 5 together and comparing to Case 1, the less chemotaxis occurs (Case 5), the less mean ECM at t=9 and t=18, and the more mean uPA at t=90 and t=180 are found.
• In the case of the decrease in haptotaxis coefficient (Case 6), mean CC was decreasing as time passes in figures in Case 6 comparing to Case 1. This is confirmed by the larger values of mean ECM in Case 6 in this table.
• In Case 7, the larger the proliferation rate of CC is, the larger the mean uPA than Case 1 is found.
• In Case 8, the larger the proliferation rate of ECM is, the larger the mean ECM is achieved.
• In Case 9, if the production rate of uPA is larger than Case 1, mean ECM is found smaller than in Case 1.
• In Case 10, mean uPA values support the larger degradation rate in uPA comparing to Case 1.
• The less the degradation rate in ECM is, the more mean ECM is found in Case 11 comparing to Case 1. Further, in Case 11, ECM degrades and becomes closer to zero slower than any other case, too.
5.3. Machine learning modeling
First, 5320 number of parameter combinations are produced. These cases are executed in Matlab by parallel computing. In each execution, the desired data is saved in the case of convergent results. In all of these computations,
are fixed.
The convergent results are saved as a matrix. In the saved data, the first column involves time values. The time values are starting from t=9, and are incremented 9 again until a maximum time level,
The other columns are for Dc,Dm,χc,ξc,μ1,μ2,α,β,δ values. The last three columns correspond to the mean CC density, mean ECM and mean uPA density. In this way, an input-output data is saved from numerical results. The obtained size of the data is 53100, which means that the number of convergent cases is 2655 inside 5320 combinations.
The data is divided into train (80%) and test data (20%) randomly. This division or separation is done in Matlab by using "dividerand" syntax. The divided train and test data are saved. Since the numerical results and, therefore, the data are saved in Matlab, TNN is implemented in Matlab by using syntaxes which are called by means of statistics and the machine learning toolbox. Mars implementation is employed in R-Project because of the variety in options existing in earth module used for Mars modeling. The same saved train and test data in Matlab are used in the R-project.
5.3.1. TNN modeling
In Matlab, using "fitrnet" with 100 layer sizes in each three layers and with 5-fold cross validation, models are created. The quality of the model prediction is checked in Table 5 with error metrics mean squared error (MSE) and R-squared (R-Sq) error calculated on test data. These results approve the good fit.
In Figure 6, the top three subplots show the actual (true) test data versus predicted test data results. The black line shows the perfect prediction (true test data versus true test data). The bottom three subplots display the residuals. The obtained results are satisfactory in view of goodness of fit.
5.4. Mars modeling
Mars model is also created in R-project. In Mars modeling, some of the parameters inside "earth" are set as nk = 1000, degree = 10, thresh = 1e-11, nprune = 1000, penalty = -1, trace = 2, fast.k = 1000. MSE and R-Sq metrics on test data are presented in Table 6. Cross validation is not used in this modeling. TNN model results are more powerful than Mars model results.
Harmony between true test data (horizontal axes) and predicted test data is also illustrated in Figure 7.
6.
Conclusions
In this paper, it is shown that the machine learning process on mean CC, mean ECM, and mean uPA may be achieved by using some of the numerical results of the considered mathematical model of CC invasion of tissue. In our case, one-dimensional chemotaxis-haptotaxis model of CC invasion inside tissue is taken into account, and the associated dimensionless governing equations are numerically solved by the global radial basis function method both in space and in time. The advance both in space and in time makes the process two-dimensional. In some parameter variations in dimensionless governing equations, the observed behavior of CC, ECM and uPA densities are plotted.
Moreover, in a set of problem parameter combination, numerical method is executed by parallel computing. An input-output data from convergent results is saved in which the outputs are mean CC density, mean ECM density, and mean uPA density. The data is separated into train (80%) and test data (20%) randomly. The trained data is used for modeling in the TNN and Mars. The models are tested on test data. In terms of MSE and R-Sq metrics on test data, the TNN is a powerful tool for modeling comparing to Mars. Although the TNN uses the same activation function with Mars, Mars could be advanced.
As a future study, this idea may be integrated for a real life data instead of data obtained by numerical results. The determination of mean CC density, mean ECM, and mean uPA may be an indication for the stage of CC invasion inside tissue.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This study is carried out under the TED University's Institutional Research Fund granted between March 2023 to January 2024.
Conflict of interest
All authors declare no conflicts of interest in this paper.