The water quality index (WQI) is an aggregated indicator that represents the overall quality of water for any intended use. It is typically calculated from several biological, chemical, and physical parameters. Assessing the factors that affect the WQI is therefore essential. Climate change is expected to affect a wide range of water quality issues; hence, climate variables are likely to be significant predictors of the WQI. We propose three statistical models, multiple linear regression (MLR), artificial neural network (ANN), and Gaussian process regression (GPR), to assess the WQI using climate variables. The data are WQI measurements for the Ping River, which flows through provinces in northern Thailand. The climate variables are temperature, humidity, total rainfall, and evaporation. The models are compared by their prediction accuracy scores. The results show that total rainfall is the most significant variable for predicting the WQI of the Ping River. Although all three methods predict the WQI relatively well, the GPR model performs better overall than the MLR and the ANN. In addition, the GPR is more flexible because it can relax some restrictions and assumptions. Therefore, the GPR is appropriate for assessing the WQI from climate variables for the Ping River.
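The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (two made-up predictors, not the Ping River measurements), the GPR uses a squared-exponential kernel with fixed, hand-picked hyperparameters rather than estimated ones, the ANN is omitted for brevity, and all function names are hypothetical. It only shows how MLR and GPR predictions could be scored against held-out values with the root mean squared error (RMSE).

```python
import math
import random

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two input vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def mlr_fit_predict(X_train, y_train, X_test):
    """Multiple linear regression via the normal equations (with intercept)."""
    Z = [[1.0] + list(x) for x in X_train]
    p = len(Z[0])
    XtX = [[sum(r[i] * r[j] for r in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * y for r, y in zip(Z, y_train)) for i in range(p)]
    beta = chol_solve(cholesky(XtX), Xty)
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in X_test]

def gpr_fit_predict(X_train, y_train, X_test, length_scale=1.0, noise_var=0.05):
    """GPR posterior mean with a constant prior mean and fixed hyperparameters."""
    mean_y = sum(y_train) / len(y_train)
    K = [[rbf_kernel(a, b, length_scale) + (noise_var if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    alpha = chol_solve(cholesky(K), [y - mean_y for y in y_train])
    return [mean_y + sum(rbf_kernel(x, a, length_scale) * w
                         for a, w in zip(X_train, alpha)) for x in X_test]

def rmse(y_true, y_pred):
    """Root mean squared prediction error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Synthetic stand-in data (assumption): two standardised predictors and a
# made-up nonlinear response; none of this comes from the study.
rng = random.Random(0)

def make_data(n):
    X = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(n)]
    y = [60 + 10 * math.sin(2 * x1) - 3 * x2 + rng.gauss(0, 0.5) for x1, x2 in X]
    return X, y

X_train, y_train = make_data(60)
X_test, y_test = make_data(20)
for name, model in (("MLR", mlr_fit_predict), ("GPR", gpr_fit_predict)):
    print(name, "test RMSE:", round(rmse(y_test, model(X_train, y_train, X_test)), 3))
```

Which model scores lower here depends entirely on the synthetic response chosen and says nothing about the Ping River results; in practice the kernel hyperparameters would be estimated from the data rather than fixed.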
Citation: Kamonrat Suphawan, Kuntalee Chaisee. Gaussian process regression for predicting water quality index: A case study on Ping River basin, Thailand[J]. AIMS Environmental Science, 2021, 8(3): 268-282. doi: 10.3934/environsci.2021018