Accurate prediction of stock price direction is important for investors and policymakers. We aim to predict the direction of daily stock returns for five major South African banks using ensemble machine learning techniques. Financial ratios were used as predictors in both single-classifier and ensemble models. Among the single classifiers, the support vector machine performed best: it achieved the highest accuracy for four of the five banks, with accuracy ranging from 54% to 99%, and produced fewer misclassifications than the other single classifiers. More importantly, the heterogeneous ensemble classifier, which combines support vector machines, decision trees and k-nearest neighbors (KNN), achieved average accuracy rates above 95% and outperformed all other models. This result supports the view that ensemble methods combining multiple models can generate more accurate predictions than single classifiers, and it suggests that the heterogeneous ensemble is a suitable approach for predicting stock price direction in the South African banking sector. The findings imply that investing in banks may be a good decision and that such models can assist investors. Overall, we demonstrate the value of ensemble learning for a complex forecasting problem. Future research could extend the models to incorporate macroeconomic and other external factors that influence stock prices, and could examine policy implications.
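The abstract does not state how the heterogeneous ensemble combines its three base classifiers. As a minimal sketch, assuming a hard majority vote over up/down labels (an assumption, not necessarily the rule used in the study), the combination step could look as follows. The function names and the toy labels are hypothetical and are not data from the paper; the labels are constructed so that each base model errs on a different day, which is the favorable case for voting.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model direction labels (1 = up, 0 = down) by hard majority vote.

    Assumption: a simple majority rule; the study may combine models differently.
    """
    n_obs = len(predictions_per_model[0])
    if any(len(p) != n_obs for p in predictions_per_model):
        raise ValueError("all models must predict the same number of observations")
    combined = []
    for votes in zip(*predictions_per_model):
        # With three models and two classes there is always a strict majority.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of observations whose predicted direction matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy labels invented for illustration only (not results from the study).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
svm_pred = [1, 0, 1, 0, 0, 0, 1, 0]   # hypothetical SVM output, wrong on day 4
tree_pred = [1, 1, 1, 1, 0, 0, 1, 0]  # hypothetical decision tree output, wrong on day 2
knn_pred = [1, 0, 1, 1, 0, 1, 1, 0]   # hypothetical KNN output, wrong on day 6

ensemble_pred = majority_vote([svm_pred, tree_pred, knn_pred])

for name, pred in [("SVM", svm_pred), ("Tree", tree_pred),
                   ("KNN", knn_pred), ("Ensemble", ensemble_pred)]:
    print(f"{name}: accuracy = {accuracy(pred, actual):.3f}")
```

In this constructed case each base model misclassifies one day out of eight, while the vote corrects every isolated error. When the base models tend to err on the same observations, voting offers much less benefit, which is one reason for combining classifiers of different types.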
Citation: Angelica Mcwera, Jules Clement Mba. Predicting stock market direction in South African banking sector using ensemble machine learning techniques. Data Science in Finance and Economics, 2023, 3(4): 401-426. doi: 10.3934/DSFE.2023023