In the ever-evolving landscape of data science and predictive analytics, one of the most pervasive challenges is the intricacy posed by class imbalance within datasets. As organizations delve deeper into leveraging machine learning algorithms to gather insights and drive informed decision-making, understanding how varying class distributions impact model performance becomes paramount.
Consider a decision-maker at a bank who is trying to keep customers from leaving. The team has built a prediction tool, using a model such as random forest, to identify who might leave. But as they work through the data, they keep hitting the same problem: class imbalance. This phenomenon, where one class significantly outnumbers the other(s) in a classification problem (Dube and Verster, 2023), can skew model predictions, leading to biased outcomes and suboptimal decision-making. Motivated by this real-world scenario, our study examines the interplay between class imbalance and predictive modeling performance. We investigate the underlying patterns using statistical tools and large amounts of data. Our goal is to determine how different levels of class balance affect the performance of prediction models, focusing on the widely used random forest method.
Our investigation reflects the dynamic nature of predictive modeling. It echoes Verster and Fourie (2023), who examined the future of predictive modeling by considering the influence of machine learning, financial crises, and financial technology. By unpacking the complexities of class imbalance, we contribute to the broader conversation on predictive modeling and support collaboration between academia and industry partners. We further examine the random forest model using interpretability techniques such as Shapley values and partial dependence plots (PDPs). These tools help open up the black box: they show how individual features influence the model's predictions and how these influences shift with changes in class balance. The interpretability of ML models has gained a lot of attention over the past few decades, with researchers such as Du Toit et al. (2023), Nohara et al. (2022), and Ribeiro et al. (2016) applying it successfully in their research. Jafari et al. (2023), Guliyev and Tatoğlu (2021), Dumitrache et al. (2020), and many others have shown how model interpretability can be used in modeling customer churn.
The analysis of churn and fraud datasets has revealed significant insights in existing literature. Notably, prior studies have demonstrated that addressing class imbalance can lead to substantial improvements in model performance, particularly in the context of precision, recall, F1-score, and accuracy. In our previous work (Dube and Verster, 2024), we have shown how machine learning models for default prediction are affected by missing data and class imbalance, further underscoring the importance of dataset balance in predictive modeling. This study is crucial as it delves into the explainability of model predictions across different levels of class imbalance. By investigating Shapley values and feature importance, the study identifies consistent patterns and significant relationships between features and model predictions. Moreover, PDPs and breakdown plots provide a deeper understanding of how class imbalance affects individual predictions and baseline predictions, highlighting the stability of fundamental relationships between input variables and predicted outcomes as datasets approach balance. Overall, these analyses underscore the importance of addressing class imbalance for enhancing the performance and reliability of predictive models in identifying rare instances.
The structure of our paper is as follows. Section 1 is this Introduction, which provides an overview of the research problem and emphasizes its significance. We outline our data in Section 2, comprising imbalanced churn and fraud datasets, and describe the random forest classifier in Section 3. We review related work on class imbalance and interpretability in Section 4 and introduce various interpretability techniques such as Shapley values, PDPs, and breakdown plots. Section 5 discusses the random forest model and how it can be adopted for the adjustment of class weights. We define evaluation metrics in Section 6 and present results in Section 7, indicating improved model performance with decreased class imbalance. In the discussion in Section 8, we explore the practical implications and underline the importance of considering class distribution for robust model interpretation. Section 9 concludes with insights for developing reliable predictive models in real-world applications.
In this analysis, we employed two datasets. The first is a churn dataset sourced from Kaggle, comprising 10,000 observations with 10 predictor variables and a binary (0/1) response variable. Table 1 describes the churn data and Table 2 shows the different sample sizes that were used for the analysis. The second is a fraud dataset, also sourced from Kaggle, with 110,106 observations and eight (8) predictors. Table 3 describes the fraud dataset and Table 4 shows the different sample sizes. To generate samples of varying class balance, a random over-sampling technique, as described in Dube and Verster (2023), was applied to the minority class. Originally, the churn and fraud datasets had a 20% churn rate and a 1% fraud rate, respectively. These original rates are marked with asterisks in Tables 2 and 4.
Table 1. Description of the churn dataset.
Feature | Description |
Customer ID | Unused |
Credit score | Input |
Country | Input |
Gender | Input |
Age | Input |
Tenure | Input |
Balance | Input |
Products number | Input |
Credit card | Input |
Active member | Input |
Estimated | Input |
Churn | Target |
Table 2. Churn sample sizes by churn percentage (* original class distribution).
Churn % | Yes | No |
20* | 2,037 | 7,963 |
30 | 3,583 | 7,963 |
40 | 5,574 | 7,963 |
50 | 7,963 | 7,963 |
Table 3. Description of the fraud dataset.
Feature | Description |
Fraud | Fraud transaction, indicator variable |
Type | Type of online transaction |
Amount | The amount of the transaction |
OldbalanceOrg | Balance before the transaction |
NewbalanceOrig | Balance after the transaction |
OldbalanceDest | Initial balance of recipient before the transaction |
NewbalanceDest | The new balance of recipient after the transaction |
Table 4. Fraud sample sizes by fraud percentage (* original class distribution).
Fraud % | Yes | No |
1* | 1,059 | 109,047 |
5 | 5,452 | 109,047 |
10 | 10,905 | 109,047 |
15 | 16,357 | 109,047 |
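The random over-sampling step used to build the samples in Tables 2 and 4 can be illustrated with a minimal sketch. It assumes that over-sampling means drawing minority rows with replacement until a chosen minority count is reached. The function name `oversample_minority`, the toy rows, and the target count are hypothetical and are not taken from the original analysis.

```python
import random

def oversample_minority(rows, is_minority, target_minority_count, seed=0):
    """Duplicate randomly chosen minority rows (sampling with replacement)
    until the minority class has target_minority_count rows.
    Majority rows are left untouched."""
    rng = random.Random(seed)
    minority = [r for r in rows if is_minority(r)]
    majority = [r for r in rows if not is_minority(r)]
    if not minority:
        raise ValueError("no minority rows to sample from")
    extra_needed = max(0, target_minority_count - len(minority))
    extra = [rng.choice(minority) for _ in range(extra_needed)]
    return majority + minority + extra

# Toy data (hypothetical): 8 majority rows (label 0) and 2 minority rows (label 1).
toy_rows = [{"id": i, "label": 0} for i in range(8)]
toy_rows += [{"id": 100 + i, "label": 1} for i in range(2)]

# Over-sample the minority class up to 8 rows, i.e. a balanced sample.
balanced = oversample_minority(toy_rows, lambda r: r["label"] == 1, target_minority_count=8)
minority_share = sum(r["label"] for r in balanced) / len(balanced)
```

In this sketch the majority count stays fixed while the minority count grows, which mirrors the constant "No" column in Tables 2 and 4.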
A random forest (RF) is a classifier made up of a set of tree-structured classifiers (Breiman, 2001), $h(x, \Theta_k)$, where $k = 1, 2, \ldots$. Each tree is built from a random vector of parameters, $\Theta_k$, and casts a single vote for the most popular class for a given input $x$ (subsample), as indicated in Figure 1. This ensemble technique generates diverse classifiers through randomization, resulting in efficient classification, similar to bagging or random subspace methods. The algorithm grows numerous decision trees, and to classify a new object, it passes the object through each tree in the forest, with the final classification determined by the majority vote across all trees.
Each decision tree is constructed by sampling, with replacement, from the original dataset to form a training set (Liaw et al., 2002). At each node, a subset of input variables is randomly chosen for splitting. In our case, at most 2 features were considered when searching for the best split at each node. Because this value is less than the total number of features in the dataset, each split sees only a random subset of the features, which introduces diversity among the trees in the ensemble. The design parameters include the number of features considered at each split, the number of trees in the forest, and the minimum number of samples in a leaf node. Notably, the selection of features significantly impacts the RF's performance. An important aspect of RF is the use of out-of-bag (OOB) data, which consists of approximately one-third of the original dataset not included in the bootstrap sample (Gislason et al., 2006). This OOB data facilitates unbiased estimation of classification error, eliminating the need for separate validation sets or cross-validation. The accuracy of RF is characterized by its generalization error, which is determined by the margin function. This function measures the difference between the average number of votes for the correct class and the maximum average vote for any other class. The strength of RF, in terms of the margin function, reflects its ability to reduce variance through averaging and randomization, thereby decreasing correlation among the trees in the forest (Abd Algani et al., 2022; Liaw et al., 2002). In this analysis, the forest contained 100 decision trees, each trained on a bootstrap sample of the training data drawn with replacement.
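The "approximately one-third" out-of-bag share can be checked with a short simulation. The sketch below assumes a bootstrap sample of the same size as the dataset; the function name `oob_fraction` and the dataset size are illustrative only. For large $n$ the expected share of rows never drawn is $(1 - 1/n)^n$, which approaches $1/e \approx 0.368$.

```python
import random

def oob_fraction(n_rows, seed=0):
    """Draw one bootstrap sample of size n_rows (with replacement) and return
    the fraction of row indices that were never selected (out-of-bag rows)."""
    rng = random.Random(seed)
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    return 1 - len(in_bag) / n_rows

n = 10_000  # illustrative dataset size
simulated = oob_fraction(n)
theoretical = (1 - 1 / n) ** n  # expected OOB share for a single tree
```

Each tree therefore has its own held-out rows, which is what allows the OOB error to be estimated without a separate validation set.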
Breiman (2001) highlights several strengths of random forest, including its efficiency on large databases, robustness to datasets with thousands of input variables, estimation of important variables, handling of missing data, and ability to balance class errors in imbalanced datasets. Mathematically, the generalization error of the ensemble classifier is bounded above by a function of the mean correlation between base classifiers and their average strength (Hastie et al., 2009). If $\rho$ represents the mean correlation, the upper bound for the generalization error is given by $\rho(1 - S^2)/S^2$, where $S$ is the expected value of the strength of the random forest.
In our study, we extended the application of RF to address class imbalance, a common challenge in binary classification tasks. As highlighted by Dube and Verster (2023), RF demonstrates superior performance in handling class imbalance compared to other machine learning models. To further enhance the interpretability and effectiveness of the RF model in our analysis, we employed the technique of RF with class weights (Shahhosseini and Hu, 2021). This approach involves modifying the weighting strategy of the standard RF model, assigning higher weights to the minority class instances during training. By incorporating class weights, the RF model can effectively correct for oversampling and make more accurate predictions, as demonstrated by Winham et al. (2013). Through this adaptation, RF with class weights aims to mitigate the bias toward the majority class and improve the overall balance and performance of the classifier, ensuring fair treatment of both classes in the binary classification setting.
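To make the idea of weighting the minority class concrete, the following sketch computes inverse-frequency class weights. This is an assumption for illustration: the text does not state which weights were used, and the formula $w_c = n / (K \cdot n_c)$ (with $n$ observations, $K$ classes, and $n_c$ observations in class $c$) is only one common heuristic. The function name is hypothetical; the class counts are those of the original churn data in Table 2.

```python
def inverse_frequency_weights(class_counts):
    """Return the weight n / (K * n_c) for each class c, so that every class
    carries the same total weight regardless of its size."""
    n_total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_total / (n_classes * count)
            for label, count in class_counts.items()}

# Original churn data (Table 2): 7,963 non-churners and 2,037 churners.
churn_counts = {"no": 7963, "yes": 2037}
weights = inverse_frequency_weights(churn_counts)
```

Under this heuristic the minority (churn) class receives the larger weight, so misclassifying a churner costs more during training than misclassifying a non-churner.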
In accordance with the guidelines outlined by Nationalbank Oesterreichische (2004), it is imperative to adjust the probabilities obtained from oversampled samples to align with the average probabilities of the original dataset. This adjustment is achieved indirectly using relative default frequencies (RDFs), as specified in the following procedure:
1. Compute the average sample default rate derived from the random forest model and transform it into $RDF_{\mathrm{sample}}$.
2. Determine or estimate the average default rate in the original dataset and convert it into $RDF_{\mathrm{original}}$.
3. Convert each default probability generated by the random forest model into $RDF_{\mathrm{unscaled}}$.
4. Multiply $RDF_{\mathrm{unscaled}}$ by the scaling factor specific to the corresponding model, $RDF_{\mathrm{original}} / RDF_{\mathrm{sample}}$.
5. Convert the resulting scaled RDF into a scaled default probability.
The scaled RDF, $RDF_{\mathrm{scaled}}$, is computed as follows:
$$RDF_{\mathrm{scaled}} = RDF_{\mathrm{unscaled}} \times \frac{RDF_{\mathrm{original}}}{RDF_{\mathrm{sample}}}$$
Here, RDF denotes the probability of default (PD) divided by $1 - PD$, that is, $RDF = PD / (1 - PD)$, or equivalently $PD = RDF / (1 + RDF)$. $RDF_{\mathrm{sample}}$ is derived from the average predicted probability of default within our implementation sample, while $RDF_{\mathrm{original}}$ reflects the true default rate in the original dataset prior to oversampling. Lastly, $RDF_{\mathrm{unscaled}}$ is computed from the individual default probabilities generated by the random forest model. This procedure calibrates the PDs so that they reflect the characteristics of the original dataset while accounting for the effects of oversampling.
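The five-step rescaling above can be sketched in a few lines. The function names and the example rates are hypothetical; only the RDF formulas come from the text.

```python
def pd_to_rdf(pd):
    """Relative default frequency: RDF = PD / (1 - PD). Requires 0 <= PD < 1."""
    if not 0 <= pd < 1:
        raise ValueError("PD must lie in [0, 1)")
    return pd / (1 - pd)

def rdf_to_pd(rdf):
    """Inverse transformation: PD = RDF / (1 + RDF)."""
    return rdf / (1 + rdf)

def rescale_pd(pd_unscaled, sample_rate, original_rate):
    """Rescale a PD predicted on an oversampled sample back to the original
    population using RDF_scaled = RDF_unscaled * RDF_original / RDF_sample."""
    rdf_scaled = (pd_to_rdf(pd_unscaled)
                  * pd_to_rdf(original_rate) / pd_to_rdf(sample_rate))
    return rdf_to_pd(rdf_scaled)

# Illustrative values: a balanced sample (50%) drawn from data with a 20% event rate.
scaled = rescale_pd(pd_unscaled=0.5, sample_rate=0.5, original_rate=0.2)
```

In this example a prediction equal to the sample average (0.5) maps back to the original average rate (0.2), which is the calibration property the procedure is designed to deliver.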
Our methodology (outlined in Figure 2) initiates by acquiring the dataset and meticulously cleaning it to ensure data integrity. Samples of varying class imbalance were generated in order to assess the impact on the performance of an RF model. These samples were then divided into distinct training and testing subsets, facilitating both model training and evaluation. During the training phase, the random forest classifier is trained using the training subset, while the testing subset is reserved for assessing the model's performance. After generating predicted probabilities, we adopted the approach proposed by Nationalbank Oesterreichische (2004) to transform these probabilities, ensuring they accurately reflect the characteristics of the true population. Subsequently, we meticulously reported on the performance measures outlined in Section 6.
Interpretability in machine learning ensures trustworthiness and comprehension of model decisions, particularly in domains where such decisions carry significant implications. Across various studies, the importance of interpretability resonates as researchers navigate the complexities of diverse applications.
In the context of customer churn prediction, Jafari et al. (2023) proposed a comprehensive framework aimed at enhancing both predictive performance and interpretability. Their approach, spanning preprocessing techniques, novel classification algorithms, and rigorous evaluation criteria, addresses the dual challenge of accurate prediction and transparent decision-making, catering to the needs of managerial stakeholders. Similarly, Tekouabou et al. (2022) tackled the intricacies of customer relationship management systems, recognizing the challenges posed by heterogeneous data and class imbalances. Through the adept application of ensemble methods and data balancing techniques, they constructed predictive models that not only mitigate these challenges but also offer interpretable insights, facilitating informed decision-making within CRM contexts. In the banking sector, Peng et al. (2023) delved into the pressing issue of customer churn, leveraging advanced modeling techniques augmented by interpretability analyses. By employing genetic algorithm-enhanced XGBoost and elucidating feature contributions through Shapley values, they provided actionable insights for banking institutions, empowering them to proactively address customer retention challenges.
Building upon the insights gleaned from existing research, Zhu et al. (2023) and Davis et al. (2022) offered valuable contributions by employing a range of algorithms such as LightGBM, XGBoost, logistic regression, and decision trees to forecast loan defaults. These models not only exhibited high predictive performance, as evidenced by metrics like accuracy and area under the curve, but also prioritized interpretability through methods like local interpretable model-agnostic explanations (LIME) and generated simple rules understandable to various stakeholders. Similarly, Ariza-Garzón et al. (2020) and Tran et al. (2022) underscored the significance of explainable credit risk models in peer-to-peer lending and financial markets. By utilizing advanced techniques like SHAP values, they demonstrated how machine learning algorithms can not only achieve superior predictive accuracy but also offer transparency and comprehensibility which was deemed crucial for fostering trust among stakeholders including industry players, regulators, and investors.
In nanoparticle studies, Yu et al. (2021) navigated the complexities of highly heterogeneous data, developing a framework that combines tree-based random forest analysis with feature interaction networks. Their approach not only facilitates accurate prediction of immune responses and lung burden but also enhances model interpretability, thereby offering valuable guidance for nanoparticle design and application. Meanwhile, Uddin et al. (2022) focused on credit default prediction, employing random forest methodology to discern patterns within micro-enterprise credit data. Through rigorous analysis and consideration of both traditional financial variables and non-traditional predictors, they underscore the importance of interpretability in credit risk assessment, offering insights that are invaluable for financial market participants. Lastly, Moraffah et al. (2020) provided a comprehensive survey on causal interpretable models, shedding light on the evolving landscape of interpretability methodologies. By exploring the nuances of causal explanations and evaluation metrics, they equip practitioners with a deeper understanding of interpretability concepts, thereby fostering greater transparency and trust in machine learning systems.
Collectively, these studies and more underscore the critical role of interpretability in enhancing the utility and reliability of machine learning models across diverse domains, offering insights that are indispensable for informed decision-making and stakeholder trust.
This section explores several key methods for understanding model behavior and feature importance. We cover permutation feature importance, Shapley values, partial dependence plots, and breakdown plots, each providing a distinct perspective on model interpretability. Permutation feature importance uncovers the significance of individual features by assessing the impact of shuffling feature values. Shapley values, rooted in cooperative game theory, assign values to features based on their contribution to predictions for specific instances. Partial dependence plots offer insights into the relationship between features and predictions by visualizing how the prediction changes with varying feature values. Finally, breakdown plots provide a granular view of feature contributions to individual predictions, aiding in model debugging and transparency. These techniques collectively enhance our understanding of machine learning models and promote trust, transparency, and fairness in decision-making processes. In the following subsections, we discuss these interpretability techniques in detail.
Researchers often need to identify the most influential predictors in a predictive model and ascertain their comparative impact on model outcomes. Permutation importance, proposed by Breiman (2001), is a commonly used method to assess feature significance. It involves randomly shuffling the values of a feature and observing the resulting change in model predictions to discern which features influence predictions most significantly. Importance weights are determined from the difference in predictive performance between the original and perturbed feature values (Fisher et al., 2019). Feature importance, inferred from these weights, can be evaluated for all features, providing insight into their respective impacts on model outputs (Gregorutti et al., 2017). Permutation importance for features can be expressed as:
$$I(j) = \exp\left(f(x_{+j})\right) - \exp\left(f\left(x_{+j} + \pi(x_j)\right)\right). \tag{1}$$
Here, $j$ indicates the $j$th feature that needs explanation, $x_j$ denotes the value of the $j$th feature, and $x_{+j}$ indicates the value of sample $x$ with the $j$th feature. $\pi(x_j)$ denotes the disturbance added to $x_j$. $f$ is the prediction of a complex model on $x$, and the expression $\exp(\cdot)$ denotes the predicted accuracy of $f$.
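Equation 1 can be read as "accuracy on the original data minus accuracy after disturbing feature $j$". The following sketch implements that reading with random shuffling as the disturbance. The toy model, the data, and the function names are hypothetical and serve only to show the mechanics.

```python
import random

def accuracy(model, X, y):
    """Share of rows for which the model's predicted label equals the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, j, n_repeats=10, seed=0):
    """Average drop in accuracy when column j is randomly shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        column = [row[j] for row in X]
        rng.shuffle(column)  # the disturbance applied to feature j
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        drops.append(baseline - accuracy(model, X_perm, y))
    return sum(drops) / n_repeats

def toy_model(row):
    """Hypothetical classifier that uses feature 0 only and ignores feature 1."""
    return int(row[0] > 0.5)

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

importances = [permutation_importance(toy_model, X, y, j) for j in range(2)]
```

Shuffling the ignored feature leaves accuracy unchanged, so its importance is zero, while shuffling the feature the model relies on produces a clear drop.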
According to Shapley (2020) and Lundberg and Lee (2017), Shapley values are a concept from cooperative game theory. In machine learning, they are used to assign a value to each feature that represents its contribution to the prediction for a specific instance. The concept aims to distribute the total gain or payoff among players based on their relative contributions to the final outcome of a game. Shapley values offer a method to fairly allocate rewards to each player, characterized by natural properties such as local accuracy (additivity), consistency (symmetry), and nonexistence (null effect) (Shapley, 2020). In the context of activity predictions, Shapley values can also be interpreted as a fair allocation of feature importance given a specific model output (Rodríguez-Pérez and Bajorath, 2019). Features contribute differently to the model's output, which is captured by Shapley values, representing both the magnitude and direction of the contribution. Features with positive values contribute to activity prediction, while those with negative values contribute to inactivity prediction.
The importance of a feature $j$ is quantified by its Shapley value, as defined in Equation 2:
$$\phi_j = \frac{1}{|N|!} \sum_{S \subseteq N \setminus \{j\}} |S|! \, (|N| - |S| - 1)! \, \left[f(S \cup \{j\}) - f(S)\right] \tag{2}$$
where $f(S)$ is the model output with a feature set $S$, and $N$ is the complete set of features. The Shapley value of feature $j$ ($\phi_j$) is computed as the average of its contributions across all possible permutations of feature sets. This approach accounts for feature orderings, which is crucial for understanding changes in model output due to correlated features.
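For a small number of features, Equation 2 can be evaluated exactly by enumerating all subsets. The sketch below does this for a hypothetical value function `toy_value` over three illustrative feature names; in practice $f(S)$ would be obtained from the fitted model, and exact enumeration becomes infeasible as the number of features grows.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every subset S of N \\ {j} (Equation 2)."""
    n = len(features)
    phi = {}
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {j}) - value_fn(s))
        phi[j] = total / factorial(n)
    return phi

def toy_value(subset):
    """Hypothetical model output for a feature set: additive effects plus
    one interaction between 'age' and 'active'."""
    effects = {"age": 0.3, "balance": 0.1, "active": -0.1}
    value = 0.2 + sum(effects[f] for f in subset)
    if "age" in subset and "active" in subset:
        value -= 0.2
    return value

features = ["age", "balance", "active"]
phi = shapley_values(toy_value, features)
```

The interaction term is split equally between the two features involved, and the Shapley values sum to the difference between the full-model output and the empty-set output (local accuracy).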
The partial dependence profile (PDP) is described by Greenwell et al. (2017). Let $j$ denote any $j$th feature in the dataset, and let $x^{j|=z}$ denote an observation $x$ whose $j$th feature is set to the value $z$. Then, the PDP can be defined as a function of the value $z$ for a model $f$ and a variable $j$ as follows:
$$PDP(f, j, z) = E_{-j}\left[f\left(x^{j|=z}\right)\right]. \tag{3}$$
In simpler terms, the PDP value for the $j$th column at $z$ is the average prediction of model $f$ when the values in the $j$th column are set to $z$. However, in practice (Biecek and Burzykowski, 2021a), the distribution of the remaining features, $-j$, is often unknown. Therefore, it is estimated by averaging over the $n$ observations $x_i$ in the data:
$$\widehat{PDP}(f, j, z) = \frac{1}{n} \sum_{i=1}^{n} f\left(x_i^{j|=z}\right). \tag{4}$$
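The estimator in Equation 4 amounts to overwriting one column with a fixed value and averaging the predictions. A minimal sketch follows, using a hypothetical linear scoring function and three toy observations.

```python
def partial_dependence(model, X, j, z):
    """Estimate PDP(f, j, z): set feature j to z in every row, then average f."""
    total = 0.0
    for row in X:
        modified = list(row)
        modified[j] = z  # overwrite feature j, keep all other features as observed
        total += model(modified)
    return total / len(X)

def toy_model(x):
    """Hypothetical scoring function, linear in both features."""
    return 0.1 * x[0] + 0.5 * x[1]

X = [[1.0, 0.0], [3.0, 1.0], [5.0, 2.0]]

# Profile of feature 0 over a small grid of values.
grid = [0.0, 2.0, 4.0]
profile = [partial_dependence(toy_model, X, 0, z) for z in grid]
```

For this additive toy model the profile is a straight line in $z$ with slope 0.1, shifted by the average contribution of the other feature.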
A breakdown (BD) plot (Biecek and Burzykowski, 2021b) shows the contribution of each feature to the final prediction for a single instance. It visually breaks down the prediction into the impact of individual features. This approach offers a model-agnostic method for interpreting predictions, allowing for the explanation of both additive and non-additive models. While it may lead to some loss of information regarding the model's structure, it proves useful for various models. The core idea behind the ag-break approach is to identify elements of $x^{new}$ that, if altered significantly, would result in a notable change in the prediction $f(x^{new})$. This approach uses the concept of a relaxed model prediction (Staniak and Biecek, 2018). Let $f^{IndSet}(x^{new})$ denote the expected model prediction for $x^{new}$ relaxed on the set of indices $IndSet \subseteq \{1, \ldots, p\}$:
$$f^{IndSet}(x^{new}) = E\left[f(x) \mid x_{IndSet} = x^{new}_{IndSet}\right].$$
The relaxed prediction represents an average model response for observations that match $x^{new}$ on the features in $IndSet$ and follow the population distribution on the remaining features, $IndSet^{C}$.
Since the joint distribution of $x$ is unknown, an estimate is used instead:
$$\hat{f}^{IndSet}(x^{new}) = \frac{1}{n} \sum_{i=1}^{n} f\left(x^{i}_{-IndSet}, x^{new}_{IndSet}\right).$$
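The estimated relaxed prediction can be turned into breakdown contributions by fixing features one at a time and recording the change in the average prediction. The sketch below uses a fixed, user-supplied feature order, which is a simplification: the published method chooses the order from the data. The toy model and observations are hypothetical.

```python
def relaxed_prediction(model, X, x_new, ind_set):
    """Average prediction when the features in ind_set are fixed at their x_new
    values and all other features keep their observed values."""
    total = 0.0
    for row in X:
        mixed = [x_new[k] if k in ind_set else row[k] for k in range(len(row))]
        total += model(mixed)
    return total / len(X)

def break_down(model, X, x_new, order):
    """Sequential contributions: change in relaxed prediction as each feature
    in `order` is added to the fixed set."""
    fixed = set()
    baseline = relaxed_prediction(model, X, x_new, fixed)  # mean prediction
    previous = baseline
    contributions = {}
    for k in order:
        fixed.add(k)
        current = relaxed_prediction(model, X, x_new, fixed)
        contributions[k] = current - previous
        previous = current
    return baseline, contributions

def toy_model(x):
    """Hypothetical non-additive model with an interaction term."""
    return x[0] + 2 * x[1] + x[0] * x[1]

X = [[0, 0], [1, 0], [0, 1], [1, 1]]
x_new = [1, 1]
baseline, contributions = break_down(toy_model, X, x_new, order=[0, 1])
```

By construction, the baseline plus all contributions equals the model's prediction for $x^{new}$; with non-additive models the individual contributions depend on the chosen order.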
Individual prediction explanations explain why a specific prediction was made and which features had the most influence. In our case, individual explanations are used to explain the impact of oversampling the minority cases, in particular how individual predictions are affected.
In this paper, we adopted widely used measures to assess the performance of a random forest model in handling class imbalance, namely precision, recall, and F1-score as outlined by Goutte and Gaussier (2005), alongside overall accuracy. These metrics play a critical role in assessing the performance of classification models and are essential for determining their effectiveness in real-world applications.
Accuracy measures the overall correctness of the model's predictions across all classes (Jiao and Du, 2016). It is calculated as the ratio of correctly predicted instances to the total number of instances in the dataset, as shown in Equation 5:
$$\text{Accuracy} = \frac{\text{True Positive} + \text{True Negative}}{\text{Total Instances}}. \tag{5}$$
A high accuracy indicates that the model is making correct predictions across all classes. However, accuracy alone may not be sufficient for evaluating the performance of a model, especially in the presence of imbalanced datasets where one class dominates the others.
Precision, also known as positive predictive value, measures the accuracy of positive predictions made by the model. It is calculated as the ratio of true positive predictions to the total number of positive predictions, as shown in Equation 6:
$$\text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}. \tag{6}$$
A high precision indicates that the model is proficient at correctly identifying positive instances while minimizing false positives.
Recall, also referred to as sensitivity, measures the ability of the model to capture all positive instances in the dataset. It is calculated as the ratio of true positive predictions to the total number of actual positive instances, as shown in Equation 7:
$$\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}. \tag{7}$$
A high recall indicates that the model can successfully identify most positive instances, minimizing false negatives.
F1-score is the harmonic mean of precision and recall, providing a balanced assessment of a model's performance. It is calculated using Equation 8:
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \tag{8}$$
Because the F1-score accounts for both false positives and false negatives, it is a useful metric for evaluating models on imbalanced datasets. In short, precision focuses on the accuracy of positive predictions, recall emphasizes the model's ability to capture all positive instances, and the F1-score balances the two. These measures depend on the number of positive and negative cases; to accommodate the rare cases, we adopt the methodology specified in Section 3.
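As a compact restatement of Equations 5 to 8, the sketch below computes all four metrics from confusion-matrix counts. The helper name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no positive predictions or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=80, fp=20, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The zero-denominator guards are a design choice of this sketch; some libraries raise a warning or return an undefined value in those cases instead.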
To understand the influence of class imbalance on model performance, a random forest model was trained and evaluated on a churn dataset (with churn shares of 20%, 30%, 40%, and 50%) and a fraud dataset (with fraud shares of 1%, 5%, 10%, and 15%). Both datasets underwent an 80/20 split into training and testing sets. For each original dataset, the random forest model was trained on four samples with different class balance proportions. The testing results across these churn and fraud percentages are detailed in Table 5 below. The precision, recall, and F1 scores in the table refer strictly to the positive class.
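The general shape of such a setup can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the resampling method here is plain random over- or under-sampling of the positive class (the study's actual method is the one specified in Section 3), the split is stratified and done before resampling, and the helper names `stratified_split` and `resample_to_share` are hypothetical.

```python
import random

def stratified_split(rows, label_of, test_fraction=0.2, seed=0):
    """Split each class separately so train and test keep the same class mix."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in (0, 1):
        group = [r for r in rows if label_of(r) == cls]
        rng.shuffle(group)
        cut = round(len(group) * test_fraction)
        test.extend(group[:cut])
        train.extend(group[cut:])
    return train, test

def resample_to_share(rows, label_of, target_share, seed=0):
    """Keep every negative row and draw positive rows until positives make up
    `target_share` of the sample (random over- or under-sampling)."""
    rng = random.Random(seed)
    positives = [r for r in rows if label_of(r) == 1]
    negatives = [r for r in rows if label_of(r) == 0]
    n_pos = round(target_share / (1 - target_share) * len(negatives))
    if n_pos <= len(positives):
        drawn = rng.sample(positives, n_pos)  # under-sample
    else:
        extra = rng.choices(positives, k=n_pos - len(positives))
        drawn = positives + extra             # over-sample with duplicates
    return negatives + drawn

# Toy data: 800 negatives and 200 positives stored as (id, label) pairs.
rows = [(i, 0) for i in range(800)] + [(i, 1) for i in range(800, 1000)]
label_of = lambda r: r[1]

train, test = stratified_split(rows, label_of)  # 80/20 split
for share in (0.2, 0.3, 0.4, 0.5):
    train_bal = resample_to_share(train, label_of, share)
    pos = sum(label_of(r) for r in train_bal)
    print(f"target {share:.0%}: {len(train_bal)} rows, {pos / len(train_bal):.1%} positive")
```

Splitting before resampling keeps duplicated rows out of the test partition; whether the study resampled before or after splitting is not stated in this section, so that ordering is an assumption of the sketch.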
Dataset | Class % | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
Churn | 20∗ | 45 | 78 | 57 | 76 |
Churn | 30 | 64 | 84 | 73 | 81 |
Churn | 40 | 76 | 83 | 80 | 82 |
Churn | 50 | 83 | 80 | 82 | 83 |
Fraud | 1∗ | 17 | 49 | 29 | 94 |
Fraud | 5 | 53 | 52 | 52 | 95 |
Fraud | 10 | 66 | 54 | 59 | 97 |
Fraud | 15 | 69 | 60 | 63 | 98 |
In the churn dataset analysis, precision, F1-score, and accuracy improved steadily as the class imbalance decreased, while recall rose overall but not monotonically. Precision, which measures the proportion of true positive predictions among all positive predictions, increased from 45% to 83% as the churn share rose from 20% to 50%. This suggests that with a more balanced dataset, the model becomes more precise in identifying churn cases. Recall, the proportion of true positive predictions among all actual positives, was 78% at the 20% churn share and 80% at the 50% share, peaking at 84% at the 30% share. This indicates that the model is somewhat better at capturing actual churn cases when the dataset is less imbalanced, although the gain is small. F1-score, the harmonic mean of precision and recall, rose from 57% to 82%, which implies that the model's balance between precision and recall improved with a more balanced dataset. Accuracy, the proportion of correctly classified cases among all cases, increased from 76% to 83%, indicating that the model becomes better at correctly classifying both churn and non-churn cases as class imbalance is reduced.
In the fraud dataset analysis, we observed a consistent improvement in precision, recall, F1-score, and accuracy as the class imbalance decreased. Precision increased from 17% to 69% as the fraud share rose from 1% to 15%. This suggests that with a more balanced dataset, the model becomes more precise in identifying fraud cases. Recall improved from 49% to 60%, indicating that the model captured a higher proportion of actual fraud cases when the dataset was less imbalanced. F1-score followed a similar trend, rising from 29% to 63%, which implies an overall improvement in the model's ability to balance precision and recall. Accuracy increased from 94% to 98%, indicating an overall improvement in the model's predictive accuracy with a reduction in class imbalance.
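One mechanism that can contribute to precision rising with the positive share is purely arithmetic: for a fixed true-positive rate (TPR) and false-positive rate (FPR), precision equals TPR·π / (TPR·π + FPR·(1 − π)), where π is the positive share. The sketch below evaluates this identity at a hypothetical operating point. The values of `TPR` and `FPR` are assumptions and are not estimated from the study; since recall also changes across the samples in Table 5, this is at most a partial explanation of the reported figures.

```python
def precision_at(prevalence, tpr, fpr):
    """Precision implied by a fixed true-positive and false-positive rate."""
    true_pos = tpr * prevalence          # share of all cases that are true positives
    false_pos = fpr * (1 - prevalence)   # share of all cases that are false positives
    return true_pos / (true_pos + false_pos)

# Hypothetical operating point (assumed, not taken from the study).
TPR, FPR = 0.80, 0.10
for share in (0.01, 0.05, 0.10, 0.15, 0.50):
    print(f"positive share {share:.0%}: precision {precision_at(share, TPR, FPR):.2f}")
```

Under these assumptions, precision increases with the positive share even though the classifier's per-class error rates do not change.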
The next part of the experiment investigated the effect of class imbalance on explaining this sophisticated model. First, Shapley values were examined across the four samples, as shown in Figures 3–17. Figures 3, 5, 7, and 9 display Shapley values for each feature and instance in the churn dataset. The vertical position indicates the feature, and the horizontal position shows the Shapley value. The color shows the feature value, ranging from low to high. Where points overlap, they are shifted slightly in the vertical direction to show the spread of Shapley values for each feature. Features are ordered by importance. Figures 11, 13, 15, and 17 display the same information but with the focus on feature importance. Notably, the features age and balance had a positive relationship with the Shapley values throughout the four samples, whereas products number, active member, and credit card showed negative relationships. Some features, such as country, did not show the same relationship throughout the samples. We also noted a reordering of features between the 20% and 40% churn samples, after which the ordering stayed the same in the balanced (50%) sample. Moreover, we observed an overall decrease in Shapley values as the dataset became more balanced, alongside an improvement in feature importance. In the fraud dataset, we likewise observed changes in the ordering of features by importance and an overall decrease in Shapley values across the samples.
Shapley values for churn and fraud datasets
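For readers unfamiliar with how a Shapley value is obtained for a single instance, the following minimal sketch computes exact Shapley values by averaging marginal contributions over every feature ordering. It is an illustration under stated assumptions: `toy_model` is a hypothetical linear scoring function standing in for a trained model, the background rows are invented, and features outside a coalition are replaced by background values. It does not reproduce the estimator used to produce the figures.

```python
from itertools import permutations

def shapley_values(model, instance, background):
    """Exact Shapley values for one instance, averaged over all feature orderings.
    Features outside the coalition take their values from each background row."""
    n = len(instance)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [instance[j] if j in coalition else row[j] for j in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        coalition = set()
        before = value(coalition)
        for j in order:
            coalition.add(j)
            after = value(coalition)
            phi[j] += after - before  # marginal contribution of feature j
            before = after
    return [p / len(orderings) for p in phi]

# Hypothetical scoring function over (age, balance, active member).
def toy_model(x):
    age, balance, active = x
    return 0.02 * age + 0.001 * balance - 0.3 * active

background = [[30, 100, 1], [40, 200, 0], [50, 300, 1]]  # invented rows
instance = [42, 250, 0]
phi = shapley_values(toy_model, instance, background)
baseline = sum(toy_model(r) for r in background) / len(background)
print("contributions:", [round(p, 3) for p in phi])
print("baseline + contributions:", round(baseline + sum(phi), 3))
```

The contributions sum to the difference between the instance's score and the mean score on the background data, so a change in the background (for example, a different class balance) can shift every contribution even when the model's functional form is unchanged. Exhaustive enumeration is feasible only for a handful of features.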
Next, we examined partial dependence plots (PDPs) for selected variables in each dataset, using the random forest model, to see how varying class balance influences the shape of the plots. Visual inspection of the PDPs showed a consistent overall upward shift in estimated values as both datasets approached a more balanced distribution. A noteworthy observation was the consistent increase in the baseline from 0.12 to 0.5 as the churn dataset became more balanced across the four samples, for the variable age (see Figures 19, 21, 23, and 25). In the fraud dataset (Figures 20, 22, 24, and 26), the baseline for the variable OldbalanceDest was below 0.008 at a 1% fraud rate but rose to 0.08 at a 15% fraud rate. Crucially, the overall shape of the partial dependence plots remained stable throughout. This implies that while the baseline predictions of the model increased with improved class balance, the fundamental relationships between the input variable and the predicted outcome retained their intrinsic characteristics.
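The computation behind a partial dependence curve, and the kind of level shift described above, can be sketched as follows. Everything in this block is hypothetical: `make_model` builds a logistic scoring function rather than a random forest, the data rows are invented, and representing a change in class balance as a change in the intercept is an assumption made purely for illustration.

```python
import math

def partial_dependence(model, data, feature, grid):
    """For each grid value, overwrite `feature` in every row and average the predictions."""
    curve = []
    for v in grid:
        preds = [model(row[:feature] + [v] + row[feature + 1:]) for row in data]
        curve.append(sum(preds) / len(preds))
    return curve

def make_model(intercept):
    """Hypothetical logistic scorer over (age, balance); only the intercept varies."""
    def model(row):
        age, balance = row
        z = intercept + 0.08 * (age - 40) + 0.00001 * balance
        return 1 / (1 + math.exp(-z))
    return model

data = [[25, 0.0], [38, 60_000.0], [47, 120_000.0], [59, 90_000.0]]  # invented rows
grid = [20, 30, 40, 50, 60, 70]  # ages at which the curve is evaluated

pdp_rare = partial_dependence(make_model(-3.0), data, feature=0, grid=grid)
pdp_balanced = partial_dependence(make_model(-1.0), data, feature=0, grid=grid)
for age, low, high in zip(grid, pdp_rare, pdp_balanced):
    print(f"age {age}: rare-class model {low:.3f}, balanced model {high:.3f}")
```

In this toy setting both curves increase with age, while the curve for the higher intercept lies above the other at every grid point, which mirrors the pattern of a shifted baseline with a stable shape.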
We also examined how individual predictions are affected by class imbalance. Breakdown plots show how the contributions attributed to specific explanatory variables move the model's mean prediction toward the actual prediction for a particular instance. In Figures 27–34, green bars signify positive changes and red bars negative changes in the mean prediction, reflecting the contributions attributed to the explanatory variables. In Figures 35–42, red dots highlight the mean predictions for the full dataset. In particular, we were interested in the probability of churn for a male customer aged 42 with a credit score of 619 who earned 65,000 in the churn dataset. To evaluate how class imbalance affects the contributions of individual explanatory variables to this single-instance prediction, we examined the changes in the model's predictions when fixing the values of the variables and noted how these changed as the dataset became more balanced. The two breakdown plots revealed a substantial change in the prediction as the data became more balanced: the predicted value was below 0.5 at a 20% churn share but rose to as high as 0.9 with more balanced data. The same analysis was carried out for the fraud data, where the predicted probability was as low as 3.5% at a 1% fraud rate and as high as 97% at a 15% fraud rate. In a classification setting, this means that a model trained on a dataset with an inappropriate class balance risks misclassifying some observations. The same trend held at the average level for the whole dataset.
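The sequential logic behind a breakdown plot can be sketched in a few lines: starting from the mean prediction, each variable is fixed to the instance's value in turn, and the resulting change in the mean prediction is recorded as that variable's contribution. The block below is a minimal illustration only. `toy_model`, its coefficients, and the data rows are hypothetical; the instance reuses the age (42), credit score (619), and gender (male, coded 1) mentioned above, and the variable order is an arbitrary assumption.

```python
def break_down(model, instance, data, order):
    """Fix features to the instance's values one at a time (in `order`) and
    record how the mean prediction over `data` moves at each step."""
    current = [list(row) for row in data]  # work on copies
    mean_pred = sum(model(r) for r in current) / len(current)
    steps = [("intercept", mean_pred)]
    for j in order:
        for row in current:
            row[j] = instance[j]
        new_mean = sum(model(r) for r in current) / len(current)
        steps.append((j, new_mean - mean_pred))  # contribution of feature j
        mean_pred = new_mean
    return steps, mean_pred

# Hypothetical scoring function over (age, credit score, is_male).
def toy_model(x):
    age, credit_score, is_male = x
    return 0.1 + 0.01 * (age - 30) - 0.0002 * (credit_score - 600) - 0.05 * is_male

data = [[30, 700, 0], [50, 650, 1], [35, 580, 0], [61, 720, 1]]  # invented rows
instance = [42, 619, 1]
steps, final = break_down(toy_model, instance, data, order=[0, 1, 2])
for name, contribution in steps:
    print(name, round(contribution, 4))
print("prediction for the instance:", round(final, 4))
```

The intercept and the contributions add up to the instance's prediction, so a change in the mean prediction (for example, from training on a sample with a different class balance) carries through directly to the starting point of the plot.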
The results of the experiment provided nuanced insights into the intricate relationship between class balance, model performance, and interpretability, particularly in the context of random forest models for churn and fraud detection. One of the most significant findings is the consistent improvement in model performance metrics as class imbalance decreased. This observation aligns with Dube and Verster (2023) and other existing literature on the challenges posed by imbalanced datasets, where the rarity of minority class instances can lead to biased model predictions favoring the majority class. By addressing class imbalance, the experiment demonstrates the potential to mitigate these biases and improve the model's ability to accurately identify rare events such as churn or fraud.
The analysis of Shapley values and feature importance adds depth to our understanding of how individual features contribute to model predictions across varying levels of class imbalance. The observation that certain features maintain consistent relationships with model predictions regardless of class distribution highlights the importance of these features in capturing meaningful patterns within the data. Conversely, the variability observed in the relationship between other features and model predictions underscores the complexity of feature interactions and their sensitivity to changes in class balance. This insight underscores the importance of considering feature importance in the context of class distribution, as the relevance of features may vary depending on the rarity of the target event. In a similar study done by Chen et al. (2024), it was established that interpretations generated from Shapley values are less stable as the class imbalance increases in a dataset.
Furthermore, the examination of partial dependence plots (PDPs) provided valuable insights into the overall trends in model predictions as class balance improves. Despite variations in baseline predictions, the stability of the underlying relationships between input variables and predicted outcomes suggests robustness in the model's understanding of feature interactions. This finding is particularly significant as it indicates that while class imbalance may influence baseline predictions, it does not necessarily alter the fundamental relationships between features and the target variable. This stability in feature relationships enhances the interpretability of the model and facilitates more informed decision-making.
The analysis of individual predictions through breakdown plots further elucidates the impact of class imbalance on model predictions at the individual level. The observed changes in predicted probabilities highlight the importance of considering class distribution when interpreting individual predictions, as variations in dataset balance can significantly affect the confidence and reliability of model predictions. This insight has practical implications for decision-making in real-world scenarios, where accurate predictions are essential for mitigating risks associated with churn or fraud.
This study represents a pioneering effort in utilizing a comprehensive suite of interpretability tools, including Shapley values, partial dependence plots (PDPs), feature importance analysis, and breakdown plots, to investigate the impact of class imbalance across datasets of varying natures. By integrating these advanced techniques, we bridge a significant gap in the existing literature by offering a holistic understanding of how class imbalance affects model performance and interpretability. This research not only fills a critical void in the current understanding of imbalanced data scenarios but also offers practical insights that can inform the development of more effective and interpretable machine learning models in real-world applications. By closing this gap, our study provides researchers and practitioners with valuable guidance for mitigating the challenges posed by class imbalance and leveraging its potential benefits to enhance predictive accuracy and model interpretability.
In conclusion, the experiment provides valuable insights into the complex interplay between class balance, model performance, and interpretability in random forest models for churn and fraud detection. By elucidating these dynamics, this research contributes to advancing our understanding of effective model development and deployment in scenarios characterized by imbalanced data distributions. These insights have practical implications for improving the reliability and interpretability of machine learning models in real-world applications, particularly in domains where accurate predictions of rare events are critical for decision-making.
Our experiment explored the impact of class balance on random forest model performance in churn and fraud detection scenarios and provided valuable insights into the intricate relationship between data distribution, model performance, and interpretability.
The findings underscore the critical importance of addressing class imbalance in training datasets to enhance the model's ability to accurately identify rare events. The consistent improvement in performance metrics such as precision, recall, F1-score, and accuracy as class imbalance decreases highlights the necessity of balancing the representation of minority and majority classes to achieve optimal predictive performance. Moreover, the analysis of Shapley values and feature importance revealed nuanced insights into the contribution of individual features to model predictions across varying class distributions. While some features exhibited consistent relationships with model predictions, others displayed more variability, emphasizing the complex interplay between feature importance and class distribution. Additionally, the examination of partial dependence plots (PDPs) demonstrated stable trends in estimated values as class balance improved, indicating that fundamental relationships between input variables and predicted outcomes remained unchanged despite variations in baseline predictions. Furthermore, the analysis of individual predictions through breakdown plots emphasized the significant impact of class imbalance on model predictions at the individual level, highlighting the importance of considering class distribution when interpreting model outputs in real-world applications.
Furthermore, while this study provides valuable insights, there are important avenues for future research to explore. Additional methodologies for addressing class imbalance, such as advanced sampling techniques or algorithmic adjustments, warrant investigation to further improve model performance in imbalanced datasets. Moreover, validating the generalizability of these findings across diverse datasets and application domains is essential to ensure the robustness and applicability of the proposed approaches. Additionally, considering the limitations of this study, including the specific characteristics of the datasets used and the choice of machine learning algorithms, future research could benefit from examining alternative models and datasets to provide a more comprehensive understanding of the impact of class imbalance on model performance and interpretability. By addressing these future research directions and considering the study limitations, we can continue to advance the field of imbalanced data analysis and contribute to the development of more effective and reliable predictive models in real-world settings.
Overall, this research contributes to advancing our understanding of the challenges and opportunities associated with imbalanced data in machine learning applications, particularly in domains such as churn and fraud detection. By elucidating the complex interplay between class balance, model performance, and interpretability, this study provides a foundation for developing more robust and reliable predictive models in scenarios characterized by imbalanced data distributions. Moving forward, further research is warranted to explore additional methodologies for addressing class imbalance and to validate the generalizability of these findings across diverse datasets and application domains.
This work is based on the research supported wholly/in part by the National Research Foundation of South Africa (Grant Number 126885).
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
The authors would like to express their deepest gratitude to their supervisor, Prof. Tanja Verster, for the unwavering, invaluable guidance and exceptional mentorship throughout the course of this research. They would also like to extend their gratitude to the NWU Centre for BMI for making its resources available to our wonderful staff.
All authors declare no conflicts of interest in this paper.
Feature | Description |
Customer ID | Unused |
Credit score | Input |
Country | Input |
Gender | Input |
Age | Input |
Tenure | Input |
Balance | Input |
Products number | Input |
Credit card | Input |
Active member | Input |
Estimated | Input |
Churn | Target |
Churn % | Yes | No |
20∗ | 2,037 | 7,963 |
30 | 3,583 | 7,963 |
40 | 5,574 | 7,963 |
50 | 7,963 | 7,963 |
Feature | Description |
Fraud | Fraud transaction, indicator variable |
Type | Type of online transaction |
Amount | The amount of the transaction |
OldbalanceOrg | Balance before the transaction |
NewbalanceOrig | Balance after the transaction |
OldbalanceDest | Initial balance of recipient before the transaction |
NewbalanceDest | The new balance of recipient after the transaction |
Fraud % | Yes | No |
1∗ | 1,059 | 109,047 |
5 | 5,452 | 109,047 |
10 | 10,905 | 109,047 |
15 | 16,357 | 109,047 |
Dataset | Class % | Precision | Recall | F1-score | Accuracy |
Churn | 20∗ | 45 | 78 | 57 | 76 |
Churn | 30 | 64 | 84 | 73 | 81 |
Churn | 40 | 76 | 83 | 80 | 82 |
Churn | 50 | 83 | 80 | 82 | 83 |
Fraud | 1∗ | 17 | 49 | 29 | 94 |
Fraud | 5 | 53 | 52 | 52 | 95 |
Fraud | 10 | 66 | 54 | 59 | 97 |
Fraud | 15 | 69 | 60 | 63 | 98 |