Classical and Bayesian inference for the discrete Poisson Ramos-Louzada distribution with application to COVID-19 data

Ibrahim Alkhairy

doi:10.3934/mbe.2023628

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 8: 14061-14080. doi: 10.3934/mbe.2023628


Research article


Department of Mathematics, Al-Qunfudah University College, Umm Al-Qura University, Mecca, Saudi Arabia

Academic Editor: Yang Kuang

Received: 27 February 2023 Revised: 27 May 2023 Accepted: 12 June 2023 Published: 25 June 2023

The present study is based on the derivation of a new extension of the Poisson distribution using the Ramos-Louzada distribution. Several statistical properties of the new distribution are derived including, factorial moments, moment-generating function, probability moments, skewness, kurtosis, and dispersion index. Some reliability properties are also derived. The model parameter is estimated using different classical estimation techniques. A comprehensive simulation study was used to identify the best estimation method. Bayesian estimation with a gamma prior is also utilized to estimate the parameter. Three examples were used to demonstrate the utility of the proposed model. These applications revealed that the PRL-based model outperforms certain existing competing one-parameter discrete models such as the discrete Rayleigh, Poisson, discrete inverted Topp-Leone, discrete Pareto and discrete Burr-Hatke distributions.


Citation: Ibrahim Alkhairy. Classical and Bayesian inference for the discrete Poisson Ramos-Louzada distribution with application to COVID-19 data[J]. Mathematical Biosciences and Engineering, 2023, 20(8): 14061-14080. doi: 10.3934/mbe.2023628



1. Introduction

The rapid advancement of internet technology continues to generate large volumes of data from sources such as media, the cloud, the Web, the Internet of Things and databases ^[1]. The aggregation of these sources is referred to as big data, and companies are looking to process and analyze these huge data sets to extract benefits ^[2]. Numerous studies have shown the benefits of big data analytics (BDA) in organizations. Specifically, the use of BDA enhances the prediction of future product development trends, which improves the decision-making process ^[2,3,4,5], and enhances supply chain systems ^[6]. BDA is also important for promoting firm performance ^[7,8,9], improving marketing efficiency ^[5,10] and predicting market trends ^[5,11]. Ultimately, BDA adoption supports the development of a sustainable, dynamic economic system that takes advantage of current contextual demands ^[12]. Evidence shows that firms that succeed in implementing BDA tend to grow into major multinational corporations. Examples of such corporations include Google, Apple, Twitter, Uber, Walmart, Amazon, IBM Watson, Rolls-Royce, Toyota and others ^[13].

Despite the benefits of BDA for firms and economic performance, many companies, especially small and medium-sized enterprises (SMEs), still encounter barriers that inhibit its adoption ^[14,15,16]. For most developing economies, SMEs are pivotal in economic development and validation of BDA implementation. However, Coleman et al. ^[17] indicated that SMEs are still slow in implementing BDA, as they face several barriers in the application of big data ^[17,18]. Del Vecchio et al. ^[19] pointed out the challenges and benefits of big data for SMEs. Noonpakdee et al. ^[20] presented the barriers that Thai SMEs faced when adopting big data. Similarly, Chuah and Thurusamry ^[21] described the challenges faced by SMEs in Malaysia when using BDA. In addition, Mangla et al. ^[22] demonstrated the performance effects of BDA adoption by SMEs in India. Park and Kim ^[23] and Maroufkhani et al. ^[9] identified drivers of big data adoption among Korean and Iranian SMEs.

However, the majority of these studies concentrate on the advantages and efficiency of BDA adoption, as well as the challenges that SMEs face when performing BDA. Research examining the factors that influence the use of BDA by SMEs is still scarce. Because studies on BDA application by SMEs are limited, for example in Vietnam, SMEs have little guidance to draw on when adopting BDA. The Technology-Organization-Environment (TOE) framework is composed of technology, organization and environment pillars ^[24]. It is considered to be the most comprehensive and flexible approach for examining company decisions on the adoption and implementation of information technology-based innovations ^[25]. Therefore, this study applies the TOE framework and four data mining algorithms (CHAID, Bayesian networks, neural networks and C5.0) to identify the predictors of readiness to adopt BDA by SMEs. The study was guided by the following objectives:

1) To identify the best model for predicting the factors that influence the readiness to adopt BDA among SMEs, and

2) To predict the key factors that affect the readiness to adopt BDA in SMEs.

The findings will help managers, policymakers and providers understand what influences BDA adoption readiness. Managers can, therefore, build competitive strategies to enhance company performance through the use of BDA. Additionally, the study demonstrates new techniques that can be used to predict the factors influencing enterprise readiness to adopt BDA.

2. Literature review

2.1. Big data analytics adoption among SMEs

Big data includes both structured and unstructured large volumes of data, and their analysis requires specific processing. The key features of the big data process are categorized into 3 Vs: (i) volume, (ii) velocity and (iii) variety. In this case, volume depicts the amount of information in the dataset, while velocity refers to the rate at which data are created. Variety indicates the different forms of data that are created. Zhong et al. ^[26] added two more Vs, verification and value, to characterize big data as a "5Vs" data source. In this case, verification concerns bad data that need to be verified, whereas value addresses the economic and social costs of application. On the other hand, Saggi and Jain ^[27] classified big data features into volume, velocity, variety, valence, veracity, variability and value to produce the "7Vs" classification. The valence is related to the complexity of the data, and veracity reflects accuracy within the dataset, while the inconsistencies in all data are mostly responsible for variability.

Ideally, BDA involves two components, big data and business analytics ^[5]. The former provides the informational and technological foundation for analyzing business activities, whereas the latter provides the insights needed to improve the decision process in the business unit. This combination has multidisciplinary benefits that promote firm performance ^[28]. For example, big data has been adopted in the manufacturing sector ^[9], the health care sector ^[29], the service sector ^[26] and the hospitality industry ^[30]. Dubey et al. ^[31] argued that BDA has a clear and fundamental impact on supply chain agility and competitive advantage. Previous studies have presented the benefits, challenges and performance of big data applications in SMEs ^[17,19,20,22,32,33]. For example, Park and Kim ^[23] used the analytic hierarchy process and regression analysis and found that benefits received, technological abilities, financial abilities and data quality are the major factors predicting the intention to apply big data among Korean companies. Mangla et al. ^[22] applied structural equation modeling (SEM) to show that BDA increased project performance in Indian SMEs. Similarly, Maroufkhani et al. ^[9] and Lutfi et al. ^[34] used partial least squares structural equation modeling (PLS-SEM) to identify the elements affecting the intentions of Iranian and Jordanian SMEs to use BDA. In addition, Sun et al. ^[35], Maroufkhani et al. ^[36] and Baig et al. ^[37] reviewed related articles to identify the drivers of an organization's inclination to use big data for business purposes. Clearly, most previous studies on the factors affecting BDA adoption intentions used latent variables. This limits the range of independent factors considered ^[38]. Observed variables (e.g., demographic variables, sector, firm size) are rarely included in the research model. This research works to bridge this gap.

2.2. Theoretical background

2.2.1. Technology-Organization-Environment framework

The TOE framework is useful in revealing the drivers of decisions to embrace new information technology ^[24]. It is a threefold framework consisting of technology, organization and environment. The technology pillar covers factors associated with tools, software, IT infrastructure, etc., which affect decisions by individuals and/or organizations to apply big data. The organization pillar covers the capacity of a firm to acquire competence in deploying the multiple resources required to operate information systems. The environment pillar consists of industry features, e.g., competitors and vendor support, that directly or indirectly affect the operations of enterprises. The TOE framework is considered to be flexible and is widely used in studies of technology application among companies ^[39]. Some previous studies on BDA adoption have applied the TOE framework. Sun et al. ^[35] and Baig et al. ^[37] gave an overview of the determinants of big data adoption using the TOE framework. Park et al. ^[40] and Park and Kim ^[23] applied the TOE framework to ascertain the drivers of big data adoption among Korean companies. Similarly, Lai et al. ^[41] used the TOE framework to identify the determinants of BDA adoption by Chinese firms. Maroufkhani et al. ^[9] applied the framework to identify the determinants of BDA application among Iranian SMEs. However, previous studies evaluating the factors affecting BDA mostly refer to latent variables without considering observed variables. Therefore, the present study extends the TOE framework to understand the drivers of BDA adoption. The research model of this study is shown in Figure 1.

Figure 1. Research model.


2.2.2. Technology dimension

The technology pillar involves intra- and inter-organizational drivers that influence company decisions to embrace new information technology ^[42]. In this dimension, the first factor mentioned is the relative advantage, which outlines the level to which the new proposed technology provides greater benefit for firms ^[43]. According to Ghobakhloo et al. ^[44], SMEs are only willing to embrace new technology if the said advantages outweigh the performance of existing technology. IT infrastructure is salient for organizational competitiveness ^[30], reflecting a firm's ability to operationalize information systems. However, SMEs often lack IT resources, undercutting their abilities for data collection and analysis ^[19]. According to Wang and Wang ^[32], the lack of IT specialists is a major drawback for most SMEs in attaining flexibility in IT infrastructure usage. Data quality is an important factor leading to the success of enterprises' BDA adoption.

Big data stores can be structured, semi-structured or unstructured. Organizations must choose specific software to ensure the quality of the data as well as the efficiency of BDA ^[14]. Park and Kim ^[23] mentioned that data quality has a great influence on big data adoption decisions among Korean firms. Security is also critical for firms' decisions to adopt BDA. Third parties may be privy to personal and company information, thus exposing individuals and companies to cybercrime ^[45]. Therefore, data security is a key factor affecting the decisions of enterprises to adopt BDA ^[35]. Technical competence refers to the expertise that employees need in order to analyze big data. Yadegaridehkordi et al. ^[30] indicated that sufficient staff knowledge of information technology is an important factor affecting the application of innovation in organizations. Alharthi et al. ^[14] concluded that a lack of BDA skills among staff is a barrier when companies adopt BDA.

2.2.3. Organization dimension

The organizational dimension represents the organizational conditions that affect readiness to adopt BDA. The first element is management support, which is vital in the adoption of an innovation ^[46]. If managers realize the benefits of BDA adoption, they can allocate the resources needed for implementation. By contrast, if management does not see the benefits of BDA adoption, they will oppose its application ^[47]. Second, the adoption of BDA carries costs for maintaining and developing big data applications ^[35]. Company development-related costs are usually funded through support from financial institutions. Such support tends to be limited for SMEs compared to larger firms, thereby undermining the adoption of BDA by small companies ^[17]. Hence, firm size is considered an essential driver for the adoption of technological innovations ^[24]. The type of industry is another driver believed to influence the intention to apply new technology in enterprises. Gangwar ^[48] pointed out that there was a significant difference between the manufacturing and the service sectors regarding BDA. Finally, decision-making culture is another factor that influences the adoption of BDA. Organizations with an evidence-based decision-making culture tend to embrace big data analytics to develop evidence that enhances managers' competence in strategic decision-making, thereby improving enterprise profitability ^[35].

2.2.4. Environment dimension

Environmental factors include external factors that the organization may encounter ^[49]. Factors such as competitive pressure, partner pressure and government support are perceived as external drivers of big data adoption by SMEs ^[23]. Competitive pressure describes the extent to which competitors affect organizational decisions on the adoption of new technologies ^[24]. The role of competitive pressure is widely acknowledged in the literature on IT adoption ^[50,51]. Zhu et al. ^[52] revealed the importance of pressure from trading partners in influencing company decisions to adopt and utilize new information technology. In addition, the government plays a fundamental role in influencing the adoption of information technology. If the government shows strong political will and ensures an enabling institutional environment for the uptake of big data technology, firms are often encouraged to develop internal policies for the adoption and implementation of BDA. Such a positive relationship has been confirmed by numerous studies ^[35,41]. Government support and policy include the provision of public data, the fostering of experts, the protection of intellectual property and regulation of privacy and security, all of which affect the use of big data by firms ^[53].

2.2.5. Manager's characteristics dimension

Rojas-Méndez et al. ^[54] demonstrated that demographic variables (gender, age, education level) are important factors for predicting people's willingness to adopt technology. In this regard, the manager's level of education is the most important demographic characteristic affecting the application of technology ^[54]. Parasuraman and Colby ^[55] pointed out that there is a need for studies focusing on factors such as age, education level, occupation and demographic characteristics to assess each person's readiness to use new technology. For this reason, the manager's characteristics dimension is included to predict the determinants of big data adoption by SMEs.

2.3. Data mining

Data mining includes many different algorithms used mainly for classification purposes. CHAID (chi-squared automatic interaction detection) is an algorithm that builds a predictive model by selecting and merging the predictors that best explain the response variable ^[56]. A Bayesian network is a probability-based graphical model that represents knowledge about an uncertain domain, where each node corresponds to a random variable and each edge represents a conditional dependency between the corresponding random variables ^[56,57]. Neural networks are sets of connected input/output units in which each connection has an associated weight ^[56]. One of the most often used decision tree inducers is the C5.0 model, which divides the sample according to the field that provides the greatest information gain at each level.

The four algorithms have some differences. Neural networks are widely used because of their ability to produce results quickly, although their capacity for problem-solving is limited. The CHAID model uses simple predictions based on the frequency distribution of potential problems. The C5.0 model is considered an algorithm with outstanding performance and high accuracy ^[58].

Data mining techniques have been applied in research to analyze data collected from questionnaires and to predict the factors affecting a research problem. For instance, Cortez and Silva ^[59] collected questionnaire data from 788 students in a public school in Portugal. The questionnaire included 37 items covering demographic, social and school information. Four algorithms (decision trees, random tree, neural networks and support vector machines) were used to predict students' mathematics and Portuguese grades in that study. Yukselturk et al. ^[60] predicted student dropout using four algorithms: k-nearest neighbor, decision tree, naive Bayes and neural network. In that study, data were collected from 189 students in Turkey, and the questionnaire included ten variables for predicting which students drop out of courses. By applying data mining techniques, researchers can readily discover unexpected factors ^[61]. However, studies using data mining techniques to predict the factors affecting the adoption of BDA have not yet been found.

3. Methodology

3.1. Data collection and sample

The questionnaire was developed from the literature and incorporated comments from professionals and managers of SMEs. It was partitioned into three sections. Section A used thirty-five items to collect data on the determinants of readiness to implement big data among SMEs. Section B consisted of nine items assessing the readiness to apply BDA. The first two sections used a seven-point Likert agreement scale, ranging from 1 for "Strongly Disagree" to 7 for "Strongly Agree." Section C collected data on the respondents' socio-economic characteristics.

The subjects of this study are SMEs involved in manufacturing and service provision. The manufacturing and service sectors are two areas that have important roles in the economies of each country ^[62]. Manufacturing refers to the activities of people using tools and machines to convert raw materials into finished products, transport them to suppliers and recycle used products ^[26,63]. Services include areas such as retail, finance, tourism, health, accommodation services, restaurants, etc., whereby the service sector provides services to consumers. The questionnaire was emailed to Vietnamese managers of SMEs that met the eligibility criterion of the study. A total sample of 240 managers of manufacturing and service provider companies participated in the study. The data were collected during the period from September to December 2020.

Table 1 shows the respondents' demographic profile. Most respondents were male (72.5%), and 27.5% were female. The majority of the respondents were aged 30 to 45 (57.9%), while those aged ≥ 46 accounted for 29.2% and those aged < 30 for 12.9%. In terms of education, 46.7% of managers held bachelor's degrees, 39.2% held post-graduate degrees, and 14.2% had college or vocational training. Most participating firms were small enterprises (82.5%), and medium enterprises accounted for 17.5%. Among these firms, 50.8% were manufacturing firms, and 49.2% were service firms.

Table 1. Demographics of respondents (n = 240).

Variable	Type	Frequency	Percentage (%)
Gender	Male	174	72.5
Gender	Female	66	27.5
Age	< 30	31	12.9
Age	30–45	139	57.9
Age	≥ 46	70	29.2
Education level	College education	34	14.2
Education level	Bachelor's degree	112	46.7
Education level	Master's degree or above	94	39.2
Role of respondent	Chief Executive Officer	85	35.4
Role of respondent	Executive management	91	37.9
Role of respondent	IT management	64	26.7
Sector	Manufacturing	122	50.8
Sector	Service	118	49.2
Firm size	Small enterprise	198	82.5
Firm size	Medium enterprise	42	17.5


3.2. Reliability, validity analysis and coding of the readiness to apply big data analysis

In this study, each variable is measured by at least three items based on references. To be more specific, the variables are relative advantage (four items) ^[51], IT infrastructure (three items) ^[20], data quality (three items) ^[41], data security (three items) ^[64], technical competence (four items) ^[65], management support (three items) ^[66], cost (three items) ^[51], decision-making culture (three items) ^[35], competitive pressure (three items) ^[67], partner pressure (three items) ^[67], government support (three items) ^[26] and readiness to apply BDA in SMEs (nine items) ^[37,55,68].

To assess the reliability and validity of the latent variables, Cronbach's α, composite reliability (CR) and average variance extracted (AVE) for all constructs, together with the factor loadings of the items, are reported in Table A1. A preliminary exploratory factor analysis (EFA) of the dataset was carried out. The Kaiser-Meyer-Olkin (KMO) value was 0.814, greater than the critical value (0.7) ^[69], and Bartlett's test of sphericity was significant (p < 0.001), indicating that factor analysis is suitable for the original dataset. Cronbach's α was computed to assess the reliability of the questionnaire. The reliability test indicated that Cronbach's α for the latent variables ranged between 0.626 and 0.867. According to Hair et al. ^[70], a Cronbach's α value greater than 0.700 (with 0.600 acceptable) indicates good internal consistency. Therefore, the questionnaire for this study was found to be consistent and reliable.
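As an illustration of the reliability statistic discussed above, the following is a minimal sketch of Cronbach's α for one construct, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total score). The function name `cronbach_alpha` and the response values are hypothetical and are not the study's data; the study itself reports values computed in SPSS.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for one construct.

    items: list of respondents, each a list of k item scores.
    """
    k = len(items[0])
    # Sample variance of each item (column) across respondents
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in items])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 7-point Likert responses (5 respondents, 3 items)
responses = [
    [6, 7, 6],
    [5, 5, 6],
    [7, 7, 7],
    [4, 5, 4],
    [6, 6, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

A value computed this way can then be compared against the 0.700 (or 0.600) threshold cited from Hair et al.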

All factor loadings (from 0.520 to 0.865) were higher than the acceptable limit (0.5) ^[69]. The CR of all constructs was higher than 0.7, indicating good internal consistency ^[71]. All constructs, except for data quality (0.457) and management support (0.471), had AVE values higher than 0.5, indicating good convergent validity. Fornell and Larcker ^[72] proposed that an AVE value of 0.4 can be acceptable if the CR value is greater than 0.6; because data quality and management support both had CR values higher than 0.7, they were retained in this study. Thus, all latent variables in this study have acceptable convergent validity.
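The relationship between AVE and CR used in this argument can be sketched with the usual formulas based on standardized loadings λ: AVE is the mean of λ², and CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The loadings below are hypothetical, chosen only to show how a construct can have an AVE below 0.5 while its CR still exceeds 0.7; they are not taken from Table A1.

```python
def ave(loadings):
    """Average variance extracted from standardized factor loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings."""
    squared_sum = sum(loadings) ** 2
    # Error variance of each item is 1 - loading^2 for standardized loadings
    error_var = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_var)

# Hypothetical loadings for a three-item construct
loadings = [0.70, 0.68, 0.65]
print(round(ave(loadings), 3), round(composite_reliability(loadings), 3))
```

With these illustrative loadings the AVE falls between 0.4 and 0.5 while the CR is above 0.7, which is the situation the Fornell and Larcker criterion addresses.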

To predict the factors influencing BDA adoption readiness, the dependent variable (readiness to apply big data in SMEs) was dichotomized based on the mean of the nine items that measure the readiness to apply BDA among SMEs. Cases with a mean value below 6.0 were coded "1 = Low readiness," and cases with a mean value of 6.0 or above were coded "2 = High readiness." Table 2 and Table 3 present the sixteen independent (input) variables and the dependent (target) variable.
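The coding rule for the target variable can be written as a short sketch. The function name `code_readiness` and the example scores are hypothetical; only the threshold of 6.0 and the codes 1 and 2 come from the text.

```python
from statistics import mean

LOW_READINESS, HIGH_READINESS = 1, 2

def code_readiness(item_scores, threshold=6.0):
    """Code a respondent as low (1) or high (2) readiness.

    item_scores: the nine readiness item scores on the 7-point scale.
    """
    return HIGH_READINESS if mean(item_scores) >= threshold else LOW_READINESS

# Hypothetical respondents
print(code_readiness([6, 6, 6, 6, 6, 6, 6, 6, 6]))  # mean is exactly 6.0
print(code_readiness([5, 6, 6, 6, 6, 6, 6, 6, 6]))  # mean is below 6.0
```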

Table 2. The description of the independent variables.

No.	Variable	Data type	Description
Technology dimension
1	Relative advantage	Continuous	Mean value
2	IT infrastructure	Continuous	Mean value
3	Data quality	Continuous	Mean value
4	Data security	Continuous	Mean value
5	Technical competence	Continuous	Mean value
Organization dimension
6	Management support	Continuous	Mean value
7	Cost	Continuous	Mean value
8	Firm size	Nominal	1 = "Small", 2 = "Medium"
9	Sector	Nominal	1 = "Manufacturing", 2 = "Service"
10	Decision-making culture	Continuous	Mean value
Environment dimension
11	Competitive pressure	Continuous	Mean value
12	Partner pressure	Continuous	Mean value
13	Government support	Continuous	Mean value
Manager's characteristics dimension
14	Gender	Nominal	1 = "Male", 2 = "Female"
15	Age	Nominal	1 = "< 30", 2 = "30–45", 3 = "≥ 46"
16	Education level	Nominal	1 = "High school, College/Vocational education", 2 = "Bachelor's degree", 3 = "Master's degree, or above"


Table 3. The description of the dependent variable.

Category	Frequency	Percentage (%)
1 (Low readiness)	119	49.6
2 (High readiness)	121	50.4
Total	240	100.0


3.3. Data analysis

This study used four data mining algorithms that were run in the Statistical Package for the Social Sciences (SPSS) 18 software (IBM, Armonk, NY, USA). The algorithms used to predict the factors influencing BDA adoption readiness are CHAID, Bayesian networks, neural networks and C5.0. These algorithms are commonly applied in studies that analyze data collected from questionnaires.

CHAID algorithm

CHAID is one of the earliest algorithms for partitioning data into multiple subgroups ^[73]. However, this method does not allow for pruning. CHAID applies the chi-square test of independence to identify the splitting rule for each node. Continuous input variables are first binned into categories so that they can be handled in the same way as categorical inputs. Super-classes are then produced by merging the categories of each input variable that are statistically similar, while categories that are statistically dissimilar are kept separate. The dependency between the super-classes and the target variable is then assessed using the chi-square test of independence. The input whose super-classes show the highest significance is selected as the splitting criterion for the node.
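The split-selection idea can be sketched with a hand-computed Pearson chi-square statistic. This is a simplified illustration, not the CHAID procedure as implemented in SPSS: it skips category merging and p-value adjustment, and it compares raw statistics, which is only equivalent to comparing significance when the candidate tables have the same degrees of freedom (both are 2 × 2 here). The cross-tabulated counts are hypothetical; only their row and column totals match Tables 1 and 3.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and target
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = predictor categories, columns = (low, high) readiness
candidates = {
    "firm_size": [[100, 98], [19, 23]],
    "sector": [[80, 42], [39, 79]],
}
# Pick the predictor most strongly associated with the target
best = max(candidates, key=lambda name: chi_square_statistic(candidates[name]))
print(best)
```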

Bayesian networks algorithm

The Bayesian network is widely used in multiple research fields ^[74]. This method combines qualitative and quantitative components. A Bayesian network is a directed graph with an associated set of probability distributions. Here, the graph represents the qualitative aspect, whereas the probability distributions represent the quantitative part. In the graph, the nodes denote uncertain variables, while the arcs represent causal connections between pairs of variables. Bayesian networks are very effective in predictive studies. Their structure makes inference robust, reduces the variance of the estimated parameters and guards against overfitting.

Neural network algorithm

Neural networks are modeled on brain functionality. They use numerous connected units that accept messages from other units, process them and convey new messages to other units. However, the output of a neural network is difficult to trace back to its inputs; hence, interpretation becomes hard. This disadvantage is offset by the complexity and flexibility of the algorithm, which make it a robust and comprehensive discriminator that can be applied to a wider variety of problems than other methods ^[56].

C5.0 algorithm

The C5.0 algorithm evolved from the C4.5 algorithm formulated by Ross Quinlan ^[75]. The algorithm can segment data into multiple subgroups. C5.0 supports pruning and selects splitting rules through an impurity measure ^[56]. Its advantages include robustness in handling missing data points and large numbers of input columns. In addition, the method requires shorter training times and uses boosting to improve the accuracy of classification.
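The impurity-based choice of a splitting field can be illustrated with entropy and information gain. This is a minimal sketch of the general idea only; the actual C5.0 criterion, pruning and boosting are more involved and are not reproduced here. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Reduction in entropy obtained by splitting the labels on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subgroups after the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: one informative and one uninformative field
readiness = ["low", "low", "high", "high"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
print(information_gain(informative, readiness))
print(information_gain(uninformative, readiness))
```

A tree inducer of this kind would split on the field with the larger gain at each level.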

3.4. Measures for performance evaluation

The response variable was dichotomized into two classes (Low readiness and High readiness), and a partition node was then inserted to split the data into training (70%) and testing (30%) sets. The performance of the models was assessed through the confusion matrix (Table 4), from which accuracy, precision, recall, specificity and F-measure were derived, together with the area under the receiver operating characteristic (ROC) curve (AUC) and k-fold cross-validation.
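The partitioning step can be sketched as follows. This is an illustrative stand-in for the partition node, not the software's own procedure; the function names, the random seed and the choice of k = 10 are assumptions, since the text does not state the number of folds.

```python
import random

def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and testing sets."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

# 240 respondents, identified here only by index
train, test = train_test_split(list(range(240)))
print(len(train), len(test))
folds = list(k_fold_indices(240, 10))  # k = 10 is an assumed value
```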

Table 4. Form of confusion matrix.

Confusion matrix of readiness (low or high)		Predicted: Low readiness	Predicted: High readiness
Observed value	Low readiness	True Negative (TN)	False Positive (FP)
Observed value	High readiness	False Negative (FN)	True Positive (TP)


In Table 4, true positive (TP) and true negative (TN) denote the numbers of positive and negative samples that the model predicts correctly. False positive (FP) and false negative (FN) denote the numbers of samples that the model incorrectly predicts as positive and negative, respectively ^[76,77].

Accuracy is the overall rate of correct predictions, that is, the proportion of samples whose predicted category matches the actual category ^[76,77].

$\mathrm{Accuracy} = \frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN}}$ (1)

Precision is the proportion of cases predicted as positive that are actually positive ^[76].

${\rm{Precision}} = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$ (2)

Recall, also called the true positive rate (TPR), is the proportion of true positives among all actual positives, that is, among the true positives and false negatives together ^[76,77].

${\rm{Recall}} = \mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$ (3)

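The false positive rate (FPR) is the proportion of actual negatives that are wrongly predicted as positive; it equals one minus the specificity defined in Eq (5).

$\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP}+\mathrm{TN}} = 1 - {\rm{Specificity}}$ (4)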

Specificity is the proportion of actual negatives that are correctly predicted as negative ^[76].

${\rm{Specificity}} = \frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}$ (5)

F-measure: The F-measure is the harmonic mean of precision and recall. A high F-measure indicates that the classification quality is high ^[76].

${\rm{F\text{-}measure}} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}$ (6)
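As an illustration, the measures in Eqs (1) to (6) can be computed directly from the four cells of the confusion matrix in Table 4. The following is a minimal Python sketch, not the software used in the study; the class name `ConfusionMatrix` and its method names are hypothetical. The example cell counts are those reported for the C5.0 model on the testing data in Table 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Cells of a 2x2 confusion matrix laid out as in Table 4 (hypothetical helper)."""
    tn: int  # observed low readiness, predicted low readiness
    fp: int  # observed low readiness, predicted high readiness
    fn: int  # observed high readiness, predicted low readiness
    tp: int  # observed high readiness, predicted high readiness

    def accuracy(self) -> float:             # Eq (1)
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:            # Eq (2)
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:               # Eq (3), also the TPR
        return self.tp / (self.tp + self.fn)

    def false_positive_rate(self) -> float:  # Eq (4), equals 1 - specificity
        return self.fp / (self.fp + self.tn)

    def specificity(self) -> float:          # Eq (5)
        return self.tn / (self.tn + self.fp)

    def f_measure(self) -> float:            # Eq (6)
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r)


# C5.0 on the testing data, cell counts taken from Table 6.
c50_test = ConfusionMatrix(tn=36, fp=5, fn=3, tp=32)
print(round(c50_test.accuracy(), 4), round(c50_test.f_measure(), 4))
```

Rounded to four decimals, these values agree with the C5.0 testing row of Table 6.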

AUC: The ROC curve is a two-dimensional plot of the false positive rate (FPR) on the horizontal axis against the true positive rate (TPR) on the vertical axis. Using Eqs (3) and (4), the TPR and FPR are calculated at each cut-off point between 0 and 1, and the curve is drawn by joining these points. The area under the curve (AUC) summarizes the curve in a single number: 0.5 corresponds to chance-level discrimination and 1 to perfect discrimination. The AUC is a standard measure for evaluating model performance ^[78]. More specifically, discrimination is rated acceptable (0.7 ≤ AUC < 0.8), good (0.8 ≤ AUC < 0.9) or outstanding (AUC ≥ 0.9) ^[79].
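The construction of the ROC curve and its area can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure of the modeling software: the function names are hypothetical, the six labels and scores are made-up toy data, the area is obtained with the trapezoidal rule, a score of exactly 0.8 is treated as "good", and the label "below acceptable" for AUC < 0.7 is our own addition.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the cut-off over every distinct score.

    labels: 1 for the positive class (High readiness), 0 for the negative class.
    scores: predicted probability of the positive class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for cut in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cut and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cut and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def discrimination(auc_value):
    """Map an AUC value to the categories quoted in the text."""
    if auc_value >= 0.9:
        return "outstanding"
    if auc_value >= 0.8:
        return "good"
    if auc_value >= 0.7:
        return "acceptable"
    return "below acceptable"  # label assumed, not given in the text


# Toy data (hypothetical): three positives and three negatives.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(roc_points(toy_labels, toy_scores))
print(round(toy_auc, 4), discrimination(toy_auc))
```

In the toy data, eight of the nine positive-negative pairs are ranked correctly, so the area is 8/9.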

k-fold cross-validation: When forecasting models are compared, the full data set is commonly divided into training and testing subsets, and the competing models are compared on their accuracy on the test set. The division can be made once (a single split) or repeated several times; the latter is known as k-fold cross-validation. To estimate the performance of the classifiers, a stratified 10-fold cross-validation approach was used. Empirical studies have shown that 10 folds tends to be an optimal number ^[80]. In this study, each fold included 24 cases (240 cases/10 = 24 cases) and was used once to test the performance of the classifier.
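A stratified split of this kind can be sketched in a few lines of Python. The sketch below is only an illustration of the idea, not the procedure of the modeling software; the function name `stratified_k_fold` and the seed are hypothetical. The class counts (119 low-readiness and 121 high-readiness cases) are obtained by adding the training and testing row totals in Table 6, while the ordering of the cases is assumed.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Split case indices into k folds with near-equal class proportions.

    Cases of each class are shuffled and dealt to the folds in turn; the
    dealing position carries over from one class to the next so that the
    fold sizes stay balanced.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:
            folds[position % k].append(idx)
            position += 1
    return folds


# 240 cases; class counts derived from Table 6, ordering hypothetical.
labels = ["low"] * 119 + ["high"] * 121
folds = stratified_k_fold(labels, k=10)

for fold_no, test_idx in enumerate(folds, start=1):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
    print(fold_no, len(train_idx), len(test_idx))
```

Each of the ten folds then holds 24 cases and serves exactly once as the test set, with the remaining 216 cases used for training.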

The research process of this study is shown in Figure 2.

Figure 2. The study research process.


Data were collected from 240 managers of Vietnamese SMEs. A total of sixteen input variables (eleven latent variables and five observed variables) were analyzed with data mining techniques. Four classification models were built and their predictive performance was compared. The C5.0 model performed best, predicting readiness to apply big data most accurately. Therefore, C5.0 was employed to predict the impacts of the five observed variables (firm size, sector, gender, age and education level) on the readiness to apply BDA. Finally, the C5.0 result was illustrated as a decision tree.

4. Results

4.1. Prediction accuracy of models

As shown in Table 5, the proportions of correct predictions of CHAID, Bayesian networks, neural network and C5.0 on the training data were 82.32, 82.93, 81.10 and 87.20%, respectively. The corresponding proportions on the testing data were 68.42, 85.53, 71.05 and 89.47%. Hence, all four models predicted the training data well, while on the testing data Bayesian networks and C5.0 kept an accuracy above 85% and CHAID and the neural network dropped to about 70%.

Table 5. Evaluating the measurement results of four models.

Model type	Prediction	Training (n)	Training (%)	Testing (n)	Testing (%)	AUC (training)	AUC (testing)
CHAID	Correct	135	82.32%	52	68.42%	0.891	0.747
	Wrong	29	17.68%	24	31.58%
	Total	164		76
Bayesian networks	Correct	136	82.93%	65	85.53%	0.941	0.910
	Wrong	28	17.07%	11	14.47%
	Total	164		76
Neural network	Correct	133	81.10%	54	71.05%	0.861	0.815
	Wrong	31	18.90%	22	28.95%
	Total	164		76
C5.0	Correct	143	87.20%	68	89.47%	0.893	0.939
	Wrong	21	12.80%	8	10.53%
	Total	164		76


Moreover, the AUC values ranged from 0.861 to 0.941 on the training data and from 0.747 to 0.939 on the testing data. Hence, all four models showed at least acceptable discrimination, and most showed good or outstanding discrimination ^[79]. The stream of the four models is shown in Figure 3.

Figure 3. Stream of the four models with sixteen input variables.


The ROC curve is also used to evaluate the classification algorithms. It plots the true positive rate against the false positive rate. Both rates change with the classification threshold, and the best classification model can be selected according to the area under the ROC curve: a larger area means better classification. Figure 4 shows that Bayesian networks are the best model on the training data and C5.0 is the best model on the testing data.

Figure 4. Graph of the ROC values of the four models.


The coincidence matrices and evaluation results of the four models are shown in Table 6. Accuracy, precision, recall, specificity and F-measure exceeded 0.7 in all four models, with two exceptions on the testing data: all five measures of the CHAID model (0.65 to 0.69) and the precision and specificity of the neural network model (both 0.66). This indicates that the four models used in this study have good classification quality overall. Specifically, C5.0 performed best, followed by Bayesian networks, the neural network and, lowest, the CHAID model.

Table 6. Coincidence matrix and the evaluation results of the four models.

Model type	Partition	Observed value	Predicted low readiness	Predicted high readiness	Accuracy	Precision	Recall	Specificity	F-measure
CHAID	Training	Low readiness	69	9	0.8232	0.8800	0.7674	0.8846	0.8199
		High readiness	20	66
		Total	89	75
	Testing	Low readiness	28	13	0.6842	0.6486	0.6857	0.6829	0.6667
		High readiness	11	24
		Total	39	37
Bayesian networks	Training	Low readiness	70	8	0.8293	0.8919	0.7674	0.8974	0.8250
		High readiness	20	66
		Total	90	74
	Testing	Low readiness	37	4	0.8553	0.8750	0.8000	0.9024	0.8358
		High readiness	7	28
		Total	44	32
Neural network	Training	Low readiness	65	13	0.8110	0.8395	0.7907	0.8333	0.8144
		High readiness	18	68
		Total	83	81
	Testing	Low readiness	27	14	0.7105	0.6585	0.7714	0.6585	0.7105
		High readiness	8	27
		Total	35	41
C5.0	Training	Low readiness	67	11	0.8720	0.8736	0.8837	0.8590	0.8786
		High readiness	10	76
		Total	77	87
	Testing	Low readiness	36	5	0.8947	0.8649	0.9143	0.8780	0.8889
		High readiness	3	32
		Total	39	37


Similarly, the results of the 10-fold cross-validation for the four models are shown in Table A2, which indicates that C5.0 has the highest average accuracy among the four models tested. The average accuracy over the 10 folds is 0.885 for the C5.0 model, followed by Bayesian networks (0.833) and the neural network (0.722), with the CHAID model lowest (0.679). A possible explanation is that C5.0 uses memory more efficiently than the other algorithms and can therefore generate more precise rules. CHAID applies the chi-square test of independence, which is suited to categorical data. However, the input variables of this study are mostly continuous, so the algorithm must first group their values into categories. This grouping step is the likely reason why CHAID gives the least accurate predictions.

4.2. Predictor importance of the input variables

Predictor importance is a sensitivity analysis technique. It is used to identify the more important variables and/or omit the least important variables in the forecasting model ^[76].

The important drivers of readiness to adopt BDA are presented in Table 7. In CHAID and Bayesian networks, the most important variable was management support. In the neural network, by contrast, cost was the most important variable, and in the C5.0 model it was data quality. Table 7 ranks the predictors from most to least important by total value (the sum of the relative importance values of each variable across the four algorithms).

Table 7. The most important factors impacting the readiness for BDA.

Variable	CHAID	Bayesian networks	Neural network	C5.0	Total value
Management support	0.3566	0.3627	0.1179	0.2029	1.0401
Data quality	0.2079	0.2007	0.0973	0.2073	0.7132
Firm size	0.1485	0.0675	0.0000	0.1099	0.3259
Data security	0.0954	0.0455	0.1398	0.0000	0.2807
Cost	0.0000	0.0000	0.1613	0.0779	0.2392
Sector	0.0759	0.0293	0.0516	0.0685	0.2253
Competitive pressure	0.0028	0.0936	0.0610	0.0297	0.1871
Partner pressure	0.0000	0.0728	0.0775	0.0000	0.1503
Gender	0.0000	0.0000	0.0410	0.0958	0.1368
Government support	0.0254	0.0000	0.0720	0.0000	0.0974
Technical competence	0.0000	0.0507	0.0443	0.0000	0.0950
IT infrastructure	0.0000	0.0000	0.0000	0.0747	0.0747
Age	0.0028	0.0000	0.0000	0.0541	0.0569
Decision-making culture	0.0000	0.0349	0.0000	0.0000	0.0349
Education level	0.0343	0.0000	0.0000	0.0000	0.0343
Relative advantage	0.0000	0.0288	0.0000	0.0000	0.0288


To obtain an overview of the results of the four models, we combined their importance values. Combining predictive models in this way is known as aggregation-based sensitivity analysis and is recommended because it produces robust, accurate models ^[76,81]. The sixteen input variables were categorized into four dimensions that have an impact on the readiness for BDA adoption: the technology dimension (relative advantage, IT infrastructure, data quality, data security, technical competence), the organization dimension (management support, cost, sector, firm size, decision-making culture), the environment dimension (competitive pressure, partner pressure, government support) and the manager's characteristics dimension (gender, age, education level). The major predictors of BDA adoption among Vietnamese SMEs were identified as management support, data quality, firm size, data security and cost.
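The aggregation step amounts to summing, for each variable, its relative importance across the four models and ranking the totals. The following Python sketch reproduces the "Total value" column of Table 7 from the per-model values reported there; the variable names in the code (`importance`, `totals`, `ranking`, `top_five`) are hypothetical, and the sketch is an illustration rather than the software procedure used in the study.

```python
# Relative importance per model, in the order
# (CHAID, Bayesian networks, neural network, C5.0), copied from Table 7.
importance = {
    "Management support":      (0.3566, 0.3627, 0.1179, 0.2029),
    "Data quality":            (0.2079, 0.2007, 0.0973, 0.2073),
    "Firm size":               (0.1485, 0.0675, 0.0000, 0.1099),
    "Data security":           (0.0954, 0.0455, 0.1398, 0.0000),
    "Cost":                    (0.0000, 0.0000, 0.1613, 0.0779),
    "Sector":                  (0.0759, 0.0293, 0.0516, 0.0685),
    "Competitive pressure":    (0.0028, 0.0936, 0.0610, 0.0297),
    "Partner pressure":        (0.0000, 0.0728, 0.0775, 0.0000),
    "Gender":                  (0.0000, 0.0000, 0.0410, 0.0958),
    "Government support":      (0.0254, 0.0000, 0.0720, 0.0000),
    "Technical competence":    (0.0000, 0.0507, 0.0443, 0.0000),
    "IT infrastructure":       (0.0000, 0.0000, 0.0000, 0.0747),
    "Age":                     (0.0028, 0.0000, 0.0000, 0.0541),
    "Decision-making culture": (0.0000, 0.0349, 0.0000, 0.0000),
    "Education level":         (0.0343, 0.0000, 0.0000, 0.0000),
    "Relative advantage":      (0.0000, 0.0288, 0.0000, 0.0000),
}

# Total value: sum of the four relative importance values of each variable.
totals = {name: round(sum(values), 4) for name, values in importance.items()}

# Rank the variables from most to least important by total value.
ranking = sorted(totals, key=totals.get, reverse=True)
top_five = ranking[:5]

for name in top_five:
    print(f"{name}: {totals[name]:.4f}")
```

The five variables with the largest totals are the major predictors named in the text.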

4.3. Predicting the effects of observed variables on the readiness to adopt big data in SMEs

Based on the results of the evaluation of the four forecasting models, the C5.0 is the model with the highest predictive accuracy. Therefore, the authors used the C5.0 model to evaluate in detail the observed variables affecting the readiness to use BDA in SMEs. The output variable was the readiness to apply BDA among SMEs (Low readiness and High readiness), and input variables were firm size, sector, gender, age and education level. The stream of the C5.0 model is presented in Figure 5.

Figure 5. Stream of C5.0 model with five observed variables.


The C5.0 model was built with the five observed variables as inputs and used the whole dataset. It achieved a correct prediction rate of 73.75% and an AUC of 0.758, which corresponds to acceptable discrimination. The resulting tree split on three variables (firm size, sector and age).

Figure 6 illustrates the decision tree of the C5.0 model. The first split of readiness to apply BDA in SMEs was on firm size. In node 1, the proportion of small companies that are not ready to adopt BDA is 57.58%, while the proportion of small companies with high readiness is lower (42.42%). Node 1 then diverged into nodes 2 and 3. In node 2, 69.83% of manufacturing companies were not ready to adopt BDA, and only 30.17% had high readiness. In node 3, 59.76% of service companies had high readiness to apply BDA, and 40.24% had low readiness. Node 3 in turn diverged into nodes 4 and 5. In node 4, 70.97% of service companies with leaders under 46 had a high level of readiness to adopt BDA, whereas only 29.03% had low readiness. In node 5, by contrast, with leaders aged 46 and over, the percentage of companies ready to adopt BDA (25.00%) was lower than the percentage that were not ready (75.00%). Finally, in node 6, the majority of medium companies had high readiness to adopt BDA (88.09%), whereas only 11.91% had low readiness.
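Read as a set of rules, the tree described above assigns each company to the majority class of the leaf it falls into. The Python sketch below encodes the four leaves (nodes 2, 4, 5 and 6) with the majority-class shares quoted in the text. It is an illustration only: the function name `predict_readiness`, the string encodings of firm size and sector, and the assumption that every small company is either a manufacturing or a service company are ours and are not taken from the study's software output.

```python
def predict_readiness(firm_size, sector, manager_age):
    """Return (predicted class, share of that class in the leaf).

    firm_size: "small" or "medium" (encoding assumed)
    sector: "manufacturing" or "services" (encoding assumed)
    manager_age: age of the manager in years
    """
    if firm_size == "medium":
        return ("High readiness", 0.8809)  # node 6
    # Small companies (node 1) are split by sector.
    if sector == "manufacturing":
        return ("Low readiness", 0.6983)   # node 2
    # Small service companies (node 3) are split by the manager's age.
    if manager_age < 46:
        return ("High readiness", 0.7097)  # node 4
    return ("Low readiness", 0.7500)       # node 5


example = predict_readiness("small", "services", 40)
print(example)
```

Gender and education level do not appear in the rules because the tree did not split on them.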

Figure 6. Prediction readiness to apply BDA by observed variables (C5.0 model).


5. Discussion

The findings of the current study demonstrated that sixteen factors in four dimensions (technology, organization, environment and manager's characteristics) have impacts on the readiness to adopt BDA. Management support, data quality, firm size, data security and cost were revealed as the major predictors of the readiness to apply BDA among Vietnamese SMEs. In addition, the results of the C5.0 model indicated that firm size, sector and the manager's age have an impact on BDA adoption readiness, and that medium-sized companies and companies in the service sector have higher readiness to apply BDA than other SMEs.

The results of the study show that management support is the strongest decisive factor in the readiness to apply BDA among Vietnamese SMEs. The result is similar to findings from previous studies such as Sun et al. ^[35], Maroufkhani et al. ^[9], Lai et al. ^[41], Asiaei and Rahim ^[82]. The support of managers will create favorable conditions for the company in maintaining and using technology ^[82]. Realizing the benefits of big data, management can allocate the resources needed for adoption and implementation. By contrast, if the management does not see the benefits of big data for businesses, they will oppose its adoption ^[47].

Data are an essential input when companies adopt BDA, and data quality is extremely important for performing BDA successfully. Firms that have abundant and highly accurate data sources are better placed to apply big data. In this study, data quality is a strong predictor of BDA adoption, which is consistent with the findings of Park and Kim ^[23].

Not surprisingly, firm size affected the readiness of BDA adoption. This is consistent with the results of Sohaib et al. ^[83] and Alshamaila et al. ^[84]. To be more specific, medium enterprises have higher readiness to adopt BDA than small enterprises. This can be explained by medium-sized companies having larger revenue and more employees than small companies. Therefore, they have many advantages when investing in BDA applications.

Data security was also predicted as a strong influencing factor in this study. Big data includes a lot of personal information ^[14]; hence, it is of serious concern among firms when deciding to adopt BDA. The influence of data security in technology adoption was also found in many previous studies, such as in software-as-a-service adoption ^[85], cloud computing ^[51,83] and big data adoption ^[23,35,37].

Cost is one of the five factors predicted to have an important influence on the readiness of SMEs to adopt BDA. This finding is similar to those of Park and Kim ^[23] and Sun et al. ^[35], who found that cost is an important factor in maintaining and developing the analysis of big data in enterprises. In addition, the cost of big data adoption can be a barrier to companies implementing big data ^[17,86].

The classification results of the C5.0 model with five observed variables show that the service sector has a higher readiness to apply BDA than the manufacturing sector. This result is consistent with Gangwar ^[48], who identified factors influencing big data adoption in Indian companies. This is because service organizations such as wholesalers, retailers and lodging providers have early access to information technology systems and high-quality human resources to analyze large amounts of data. Moreover, during the COVID-19 pandemic, wholesale and retail companies in Vietnam shifted rapidly from traditional shopping to online shopping. As a result, these organizations must develop recommendation systems and find ways to respond to client information as quickly as possible. Hence, service SMEs are better prepared to adopt BDA. Manufacturing companies, by contrast, are reported to encounter numerous obstacles, such as a lack of infrastructure and BDA tools, when using BDA to optimize supply chains ^[86].

The findings show that small service firms with managers under the age of 46 have a higher readiness to adopt BDA than those firms with older managers. This can be explained by young managers being bolder in adopting new technology, while older managers consider more carefully the necessary conditions when applying BDA, such as information technology, high-quality human resources and finance. In addition, in the implementation of new technologies, some of the older leaders have a lagging mindset, fear of risk and fear of change. This is consistent with the findings of Badri et al. ^[87], who mentioned that elderly teachers are thought to show less technology readiness than younger teachers.

6. Conclusions and implications

Applying BDA plays an important role in helping organizations improve competitiveness, enhance supply chains, optimize logistics and improve business performance. Based on the data mining technique, the findings of the study show that the C5.0 model is the best model to predict factors affecting BDA adoption readiness in SMEs. Five factors have the greatest influence on the readiness to adopt BDA: management support, data quality, firm size, data security and cost. Moreover, an important finding of this study is that the age of managers also affects the readiness to adopt BDA.

This study is useful to managers of SMEs, providers and policymakers in developing better policies and strategies for the adoption of BDA. For managers, the volume of data generated in organizations is growing exponentially, so how to analyze big data effectively is a matter of great interest to organizations today. The proposed model can assist businesses in determining their readiness to adopt BDA. Furthermore, the findings of the study help managers become more aware of the elements affecting the enterprise's readiness to use big data. For example, this research shows that management support is the most important factor influencing BDA adoption readiness. As a result, before deciding to embrace BDA, SME management should proactively increase their knowledge of the technology and develop a clear strategy. For service providers, the outcomes of this study reveal that SMEs prioritize data quality, data security and cost when preparing to embrace BDA. SMEs, however, face financial challenges. As a result, providers should plan to develop BDA tools, hardware, software and other products that meet the needs of their clients in emerging and underdeveloped countries. In addition, when implementing BDA, suppliers must improve their services to support SMEs. For policymakers, the survey revealed that the service sector is more prepared to use big data than the manufacturing sector and that medium-sized businesses are more prepared to use big data than small businesses. As a result, the government should have policies in place to assist each type of business.

Thanks to the great benefits that BDA contributes to business development, a huge number of businesses are interested in BDA. This study has made significant contributions that help practitioners and researchers understand the importance of influencing factors on the readiness to apply big data in SMEs. First, instead of using traditional analytical methods to perform information-based sensitivity analysis, as shown in previous studies, well-known data mining algorithms were used to develop predictive models in this study. Second, this study explored factors that have strong impacts on the readiness to adopt BDA among SMEs. From these findings, the research model is expected to be a useful reference for practitioners in developing countries and the scientific community for doing future related research.

This study also has some limitations. First, the sample size is limited for data mining techniques; future studies should therefore be conducted with larger samples. Second, the numbers of input variables and prediction algorithms are limited. Future investigations should include more input variables and may use different forecasting algorithms to evaluate the findings of the predictive model.

Acknowledgments

The authors would like to thank all respondents who spent valuable time answering questionnaires and the insightful comments of the reviewers.

Conflict of interest

The authors declare there is no conflict of interest.

Appendix A

Table A1. Reliability and validity assessment.

Variable	Item number	Factor loadings	Cronbach α	CR	AVE
Relative advantage	4	0.530–0.865	0.807	0.805	0.519
IT infrastructure	3	0.617–0.848	0.798	0.809	0.590
Data quality	3	0.569–0.728	0.691	0.714	0.457
Data security	3	0.705–0.756	0.716	0.764	0.520
Technical competence	4	0.744–0.816	0.867	0.867	0.620
Management support	3	0.520–0.781	0.707	0.722	0.471
Cost	3	0.755–0.849	0.798	0.843	0.643
Decision-making culture	3	0.615–0.846	0.746	0.768	0.528
Competitive pressure	3	0.787–0.842	0.856	0.856	0.664
Partner pressure	3	0.690–0.717	0.626	0.751	0.501
Government support	3	0.703–0.727	0.719	0.757	0.509
Readiness to adopt big data	9	0.684–0.810	0.773	0.909	0.526
*Note: CR: Composite Reliability, AVE: Average Variance Extracted


Table A2. The results of the 10-fold cross-validation for the four model types.

Fold No.	CHAID			Bayesian networks			Neural network			C5.0
Fold No.	Confusion matrix		Accuracy	Confusion matrix		Accuracy	Confusion matrix		Accuracy	Confusion matrix		Accuracy
1	12	5	0.649	13	4	0.784	12	5	0.703	14	3	0.892
1	11	9	0.649	4	16	0.784	6	14	0.703	1	19	0.892
2	21	11	0.729	28	4	0.847	26	6	0.780	27	5	0.881
2	5	22	0.729	5	22	0.847	7	20	0.780	2	25	0.881
3	28	13	0.684	37	4	0.855	27	14	0.711	36	5	0.895
3	11	24	0.684	7	28	0.855	8	27	0.711	3	32	0.895
4	32	21	0.615	49	4	0.846	36	17	0.692	45	8	0.875
4	25	26	0.615	12	39	0.846	15	36	0.692	5	46	0.875
5	53	10	0.736	58	5	0.840	46	17	0.720	55	8	0.888
5	23	39	0.736	15	47	0.840	18	44	0.720	6	56	0.888
6	46	25	0.697	64	7	0.828	51	20	0.752	62	9	0.890
6	19	55	0.697	18	56	0.828	16	58	0.752	7	67	0.890
7	66	13	0.669	71	20	0.828	65	14	0.761	69	9	0.883
7	41	43	0.669	8	64	0.828	25	59	0.761	10	75	0.883
8	84	12	0.728	87	23	0.836	62	34	0.692	83	13	0.887
8	41	58	0.728	9	76	0.836	26	73	0.692	9	90	0.887
9	93	14	0.657	96	11	0.833	86	21	0.681	93	14	0.884
9	60	49	0.657	25	84	0.833	48	61	0.681	11	98	0.884
10	62	55	0.627	105	12	0.835	44	18	0.732	101	16	0.877
10	33	86	0.627	27	92	0.835	22	65	0.732	13	106	0.877
Average			0.679			0.833			0.722			0.885
The confusion matrix illustrates the classification of the cases in the test dataset. In the confusion matrix, the columns represent the actual cases, and the rows represent the predicted. Accuracy = (TP + TN)/(TP + FP + TN + FN).


References

[1]	M. Shoukri, M. H. Asyali, R. VanDorp, D. Kelton, The Poisson inverse Gaussian regression model in the analysis of clustered counts data, J. Data Sci., 2 (2004), 17–32. https://doi.org/10.6339/JDS.2004.02(1).135
[2]	G. Shmueli, T. P. Minka, J. B. Kadane, S. Borle, P. Boatwright, A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution, J. R. Stat. Soc. Ser. C., 54 (2005), 127–142. https://doi.org/10.1111/j.1467-9876.2005.00474.x
[3]	E. Mahmoudi, H. Zakerzadeh, Generalized Poisson–Lindley distribution, Commun. Stat. Methods, 39 (2010), 1785–1798. https://doi.org/10.1080/03610920902898514
[4]	L. Cheng, S. R. Geedipally, D. Lord, The Poisson–Weibull generalized linear model for analyzing motor vehicle crash data, Saf. Sci., 54 (2013), 38–42. https://doi.org/10.1016/j.ssci.2012.11.002
[5]	H. Hassan, S. A. Dar, P. B. Ahmad, Poisson Ishita distribution: A new compounding probability model, IOSR J. Eng., 9 (2019), 38–46.
[6]	E. Altun, A new model for over-dispersed count data: Poisson quasi-Lindley regression model, Math. Sci., 13 (2019), 241–247. https://doi.org/10.1007/s40096-019-0293-5
[7]	B. A. Para, T. R. Jan, H. S. Bakouch, Poisson Xgamma distribution: A discrete model for count data analysis, Model Assist. Stat. Appl., 15 (2020), 139–151. https://doi.org/10.3233/MAS-200484
[8]	E. Altun, G. M. Cordeiro, M. M. Ristić, An one-parameter compounding discrete distribution, J. Appl. Stat., 49 (2022), 1935–1956. https://doi.org/10.1080/02664763.2021.1884846
[9]	M. Ahsan-ul-Haq, A. Al-bossly, M. El-morshedy, M. S. Eliwa, Poisson XLindley distribution for count data: Statistical and reliability properties with estimation techniques and inference, Comput. Intell. Neurosci., 2022 (2022). https://doi.org/10.1155/2022/6503670
[10]	M. Ahsan-ul-Haq, On Poisson moment exponential distribution with applications, Ann. Data Sci., 2022. https://doi.org/10.1007/s40745-022-00400-0
[11]	P. L. Ramos, F. Louzada, A distribution for instantaneous failures, Stats, 2 (2019), 247–258. https://doi.org/10.3390/stats2020019
[12]	D. Roy, Discrete Rayleigh distribution, IEEE Trans. Reliab., 53 (2004), 255–260. https://doi.org/10.1109/TR.2004.829161
[13]	H. Krishna, P. S. Pundir, Discrete Burr and discrete Pareto distributions, Stat. Methodol., 6 (2009), 177–188. https://doi.org/10.1016/j.stamet.2008.07.001
[14]	M. El-Morshedy, M. S. Eliwa, E. Altun, Discrete Burr-Hatke distribution with properties, estimation methods and regression model, IEEE Access, 8 (2020), 74359–74370. https://doi.org/10.1109/ACCESS.2020.2988431
[15]	A. S. Eldeeb, M. Ahsan-ul-Haq, A. Babar, A discrete analog of inverted Topp-Leone distribution: Properties, estimation and applications, Int. J. Anal. Appl., 19 (2021), 695–708. https://doi.org/10.28924/2291-8639-19-2021-695
[16]	J. F. Lawless, Statistical models and methods for lifetime data, John Wiley & Sons, 2011.

© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)


Figures(8) / Tables(10)
