Stock prices are affected by external factors such as political influences, specific events and market sentiment, so they exhibit randomness, high volatility and non-linearity, which makes it difficult to predict future prices accurately from historical price data alone. Consequently, data fusion methods have been increasingly applied to stock price prediction: they integrate multi-source heterogeneous stock data and fuse multiple decision results to extract comprehensive stock-related information. Although data fusion plays a crucial role in stock price prediction, its application in this field has not been comprehensively and systematically summarized. This paper therefore examines the theoretical models used at each level of data fusion (data-level, feature-level and decision-level fusion), reviewing the development of stock price prediction from a data fusion perspective and providing an overall view. The review indicates that data fusion methods have been widely and effectively used in stock price prediction. Future directions are also proposed: to improve the performance of data fusion in this field, future work can broaden the range of stock-related data types used and explore newer techniques, such as natural language processing (NLP) and generative adversarial networks (GANs), for processing text information.
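To make the three fusion levels named in the abstract concrete, the following is a minimal illustrative sketch and is not taken from the paper. The function names, the dictionary-based inputs and the choice of majority voting for decision-level fusion are all assumptions made for illustration; the models surveyed in the paper are considerably more elaborate.

```python
from collections import Counter


def data_level_fusion(prices, sentiment):
    """Data-level fusion (illustrative): align raw records from two
    sources on the dates they share, giving one combined raw table."""
    shared_days = sorted(prices.keys() & sentiment.keys())
    return [(day, prices[day], sentiment[day]) for day in shared_days]


def feature_level_fusion(price_features, text_features):
    """Feature-level fusion (illustrative): features are extracted from
    each source separately, then concatenated into one vector per day."""
    shared_days = sorted(price_features.keys() & text_features.keys())
    return {day: price_features[day] + text_features[day] for day in shared_days}


def decision_level_fusion(votes):
    """Decision-level fusion (illustrative): combine the 'up'/'down'
    calls of several independent models by majority vote."""
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy inputs: closing prices and a daily sentiment score.
    prices = {"2023-01-02": 100.0, "2023-01-03": 102.0}
    sentiment = {"2023-01-03": 0.4, "2023-01-04": -0.1}
    print(data_level_fusion(prices, sentiment))

    # Hypothetical per-day feature vectors from each source.
    print(feature_level_fusion({"2023-01-03": [0.02]}, {"2023-01-03": [0.4]}))

    # Hypothetical predictions from three separate models.
    print(decision_level_fusion(["up", "down", "up"]))
```

The three functions differ only in where the combination happens: before any modelling (raw records), after per-source feature extraction (feature vectors), or after each model has produced its own prediction (decisions). An odd number of voters is used in the example so that the majority vote has no tie.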
Citation: Aihua Li, Qinyan Wei, Yong Shi, Zhidong Liu. Research on stock price prediction from a data fusion perspective[J]. Data Science in Finance and Economics, 2023, 3(3): 230-250. doi: 10.3934/DSFE.2023014