Graph neural networks (GNNs) have been applied successfully to many graph tasks, but a key limitation remains: most GNN models do not quantify the uncertainty of their output predictions. Existing uncertainty quantification methods fall mainly into two families, frequentist and Bayesian, and both rely on sampling to progressively approximate the true predictive distribution. In contrast, evidential deep learning (EDL) formulates learning as an evidence-acquisition process: it places evidential priors over the Gaussian likelihood function and trains the neural network to infer the hyperparameters of the resulting evidential distribution, yielding uncertainty estimates without sampling. EDL therefore has a distinct advantage in measuring uncertainty. We combine EDL with the diffusion convolutional recurrent neural network (DCRNN) and evaluate it on a spatiotemporal forecasting task using a real-world traffic dataset, adopting the mean interval score (MIS), a well-suited metric for uncertainty quantification. We summarize the advantages of each method.
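To make the abstract's two key quantities concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the deep evidential regression setup of Amini et al., in which the network outputs the four Normal-Inverse-Gamma (NIG) hyperparameters (gamma, nu, alpha, beta) per prediction; the function names and the rho = 0.05 default are illustrative choices, not taken from the paper.

```python
import numpy as np

def evidential_summary(gamma, nu, alpha, beta):
    """Point prediction and uncertainty split under an NIG evidential prior.

    Assumes alpha > 1 and nu > 0, which deep evidential regression
    enforces through the network's output activations.
    """
    prediction = gamma                       # E[mu]: predicted mean
    aleatoric = beta / (alpha - 1.0)         # E[sigma^2]: data noise
    epistemic = beta / (nu * (alpha - 1.0))  # Var[mu]: model uncertainty
    return prediction, aleatoric, epistemic

def mean_interval_score(lower, upper, y, rho=0.05):
    """Mean interval score (MIS) for central (1 - rho) prediction intervals.

    Charges the interval width plus a 2/rho penalty for each observation
    falling outside [lower, upper]; lower scores are better.
    """
    lower, upper, y = map(np.asarray, (lower, upper, y))
    width = upper - lower
    below = (2.0 / rho) * np.maximum(lower - y, 0.0)
    above = (2.0 / rho) * np.maximum(y - upper, 0.0)
    return float(np.mean(width + below + above))
```

With rho = 0.05, mean_interval_score evaluates 95% prediction intervals, which is why MIS rewards intervals that are both tight and well calibrated.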
Citation: Zhiyuan Feng, Kai Qi, Bin Shi, Hao Mei, Qinghua Zheng, Hua Wei. Deep evidential learning in diffusion convolutional recurrent neural network[J]. Electronic Research Archive, 2023, 31(4): 2252-2264. doi: 10.3934/era.2023115