1. Introduction
Medical diagnosis is essential for doctors to judge a patient's condition correctly. Because illnesses are complex, doctors are often unable to determine a patient's specific illness from the symptoms alone, so their judgment involves uncertainty. Fuzzy sets have a great advantage in representing such uncertain information. In 1965, Zadeh [1] proposed fuzzy set theory, a mathematical method for describing fuzzy phenomena. The fuzzy set is a vital concept that models uncertain, imprecise and vague information in a quantitative form. In a fuzzy set, each element is assigned a membership value in the interval [0, 1] [2]. However, the non-membership value (NMV) is not directly represented in fuzzy sets [3,4,5]; it is taken to be 1 minus the membership value, so the hesitant degree is ignored. To accurately describe the positive degree, hesitant degree and negative degree of fuzzy information, Atanassov [6] introduced the concept of intuitionistic fuzzy sets (IFSs). In an IFS, the membership value (MV) μP and the NMV ϑP satisfy the following conditions: μP∈[0,1], ϑP∈[0,1] and μP+ϑP∈[0,1] [7,8]. Wang et al. [9] pointed out that IFSs cannot explicitly express indeterminate and inconsistent information. Neutrosophic sets (NSs) [10,11] are a general framework for modeling fuzzy information that generalizes fuzzy sets [11], IFSs, picture fuzzy sets [12,13,14], etc. In each NS, the truth-membership value (TV), indeterminacy-membership value (IV) and falsity-membership value (FV) are non-standard subsets of ]⁻0, 1⁺[. However, NSs must be specified further before they can be applied directly in real applications. For this purpose, Wang et al. [9] proposed a special instance of neutrosophic sets called the single-valued neutrosophic set (SVNS).
At present, SVNSs are widely used in medical diagnostics. Luo et al. [15] proposed a novel distance between single-valued neutrosophic sets based on the matrix norm and applied it to pattern recognition and medical diagnosis. Hanna et al. [16] proposed a risk classification model for cardiac patients based on the theory of neutrosophic sets and compared it with other commonly used models. Hassan et al. [17] used neutrosophic sets to deal with the system deadlock problem in the use of electronic medical records in hospitals and achieved good results.
Similar to NSs, each single-valued neutrosophic number (SVNN) consists of three components, TV, IV and FV, which are independent of each other [18]. Unlike in NSs, the TV, IV and FV in SVNSs belong to the unit interval [0, 1] [19,20,21]. Because of the complex information representation of SVNSs, various measures and decision-making methods have been proposed for handling them [22,23]. The correlation coefficient is an important metric for quantifying the correlation between two objects [24]. It has been widely applied in pattern recognition [25,26], decision making [27,28,29] and medical diagnosis [30,31]. It has been used to measure the correlation between fuzzy variables [32] and has been extended to measure the correlation between IFSs [33]. However, the information representation of SVNSs differs from that of fuzzy sets and IFSs. Ye [34] first developed correlation coefficient formulas for SVNSs and applied them to decision problems. Ye [35] further improved these formulas and used them to improve decision-making methods. Meng et al. [36] used SVNSs to represent fuzzy information, calculated the attribute weights by the DEMATEL-based analytic network process (DANP) method, and then ranked the alternatives with the evaluation based on distance from average solution (EDAS) method. Song et al. [37] exploited the ability of neutrosophic sets to describe uncertainty in an image segmentation algorithm. Gou et al. [38] used single-valued neutrosophic sets to represent trapezoidal fuzzy numbers and combined them with the TOPSIS method for fuzzy risk analysis.
However, numerical case analysis shows that the correlation coefficient formulas for SVNSs in [34,35] take values only in [0, 1]. Our review of the literature also indicates that most of the existing correlation coefficients of SVNSs ignore the negative correlation between two SVNSs.
To overcome the drawbacks of these existing studies, a novel SVNS correlation coefficient is proposed in this paper. The contributions of this paper are summarized as follows:
(1) Several counter-examples are used to analyze the drawbacks of the existing correlation coefficients of SVNSs. To overcome these drawbacks, a novel SVNSs correlation coefficient is proposed.
(2) Two application cases related to pattern recognition are used to show that the unknown pattern can be properly classified into the known pattern when the proposed SVNSs correlation coefficient is used.
(3) A TCM medical diagnosis case is introduced to compare the proposed correlation coefficient with distance and similarity measures of SVNSs. The results show that the proposed correlation coefficient formula can accurately judge the patient's condition.
The rest of this paper is organized as follows. In Section 2, we present the basic knowledge about IFSs and SVNSs and analyze the drawbacks of the existing correlation coefficients in the SVNS environment. In Section 3, a novel correlation coefficient formula of SVNSs is proposed and its characteristics are discussed. Section 4 shows the application of the proposed correlation coefficient in pattern recognition (Section 4.1) and in a TCM medical diagnosis case (Section 4.2). Finally, conclusions are summarized in Section 5.
2. Preliminaries
In this section, we briefly describe the basic concepts and mathematical forms of IFSs and SVNSs. Then we analyze two existing correlation coefficient formulas for SVNSs. Finally, two examples are given to illustrate their limitations.
2.1. Intuitionistic fuzzy sets
IFSs [6] use pairs of MV and NMV to describe the uncertain information. The mathematical form of an IFS is described as follows.
Definition 2.1 [6]. Let U={u1,u2,...,un} be a finite universe of discourse, then an IFS P on U is defined as
$$P=\{\langle u,\mu_P(u),\vartheta_P(u)\rangle \mid u\in U\},$$
where μP(u):U→[0,1] and ϑP(u):U→[0,1] are the MV and NMV of each element u∈U belonging to the set P, respectively. They satisfy 0⩽μP(u)⩽1, 0⩽ϑP(u)⩽1 and 0⩽μP(u)+ϑP(u)⩽1. The hesitant degree or indeterminacy degree is calculated as πP(u)=1−μP(u)−ϑP(u).
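As a minimal sketch of Definition 2.1, the constraints on a single pair (MV, NMV) and the resulting hesitant degree can be checked numerically. The function name `hesitancy`, the variable names `mu` and `nu`, and the tolerance `eps` are illustrative assumptions and are not part of the original definition.

```python
def hesitancy(mu: float, nu: float, eps: float = 1e-12) -> float:
    """Return the hesitant degree pi = 1 - mu - nu of one intuitionistic fuzzy value.

    mu is the membership value (MV) and nu the non-membership value (NMV).
    A ValueError is raised if the pair violates 0 <= mu, nu <= 1 or mu + nu <= 1.
    """
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0):
        raise ValueError("mu and nu must lie in [0, 1]")
    if mu + nu > 1.0 + eps:  # eps absorbs floating-point rounding only
        raise ValueError("mu + nu must not exceed 1")
    return max(0.0, 1.0 - mu - nu)


print(round(hesitancy(0.5, 0.3), 4))
```

In an ordinary fuzzy set the NMV is fixed at 1 − MV, so this quantity is always zero there; a nonzero value is exactly what the IFS adds.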
2.2. Single-valued neutrosophic sets
In SVNSs, three parameters are used together to express uncertain information: the truth-membership value (TV), the indeterminacy-membership value (IV) and the falsity-membership value (FV).
Definition 2.2 [34]. A single-valued neutrosophic set P on a finite universe of discourse U={u1,u2,...,un} is defined as
$$P=\{\langle u,\mu_P(u),\eta_P(u),\vartheta_P(u)\rangle \mid u\in U\},$$
where μP(u), ηP(u) and ϑP(u) denote the TV, IV and FV of each element u∈U belonging to the set P, respectively. They satisfy the conditions: 0⩽μP(u),ηP(u),ϑP(u)⩽1 and 0⩽μP(u)+ηP(u)+ϑP(u)⩽3.
The triplet of TV, IV and FV of each element, denoted as (μP(u),ηP(u),ϑP(u)), is usually called a single-valued neutrosophic number (SVNN) [34,35]. For simplicity, it is usually denoted as (μP,ηP,ϑP).
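The constraints of Definition 2.2 can be sketched as a small data type. The class name `SVNN` and the field names `mu`, `eta` and `nu` (standing for μ, η and ϑ) are assumptions chosen for illustration; only the range conditions come from the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVNN:
    """A single-valued neutrosophic number in the order (TV, IV, FV)."""

    mu: float   # truth-membership value (TV)
    eta: float  # indeterminacy-membership value (IV)
    nu: float   # falsity-membership value (FV)

    def __post_init__(self) -> None:
        # Each component must lie in the unit interval [0, 1].
        for name in ("mu", "eta", "nu"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0, 1]")


# The components are independent, so their sum may exceed 1 but never 3.
x = SVNN(0.9, 0.8, 0.7)
print(x.mu + x.eta + x.nu <= 3.0)
```

Because each component is bounded by 1 separately, the condition on the sum needs no check of its own.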
2.3. The existing correlation coefficients for SVNS
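Definition 2.3 [34]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μSi,ηSi,ϑSi)|ui∈U} be any two SVNSs on the universe of discourse U={u1,u2,...,un}, then the correlation coefficient between P and S was defined by Ye [34] as the following form:
$$K(P,S)=\frac{\sum_{i=1}^{n}\left(\mu_{Pi}\mu_{Si}+\eta_{Pi}\eta_{Si}+\vartheta_{Pi}\vartheta_{Si}\right)}{\sqrt{\sum_{i=1}^{n}\left(\mu_{Pi}^{2}+\eta_{Pi}^{2}+\vartheta_{Pi}^{2}\right)}\,\sqrt{\sum_{i=1}^{n}\left(\mu_{Si}^{2}+\eta_{Si}^{2}+\vartheta_{Si}^{2}\right)}}. \tag{3}$$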
Ye [34] proved that the formula (3) satisfies the following properties:
(1) K(P,S)=K(S,P);
(2) K(P,S)=1 if P=S;
(3) 0⩽K(P,S)⩽1.
By observing the formula (3) and its properties, it can be easily found that the formula (3) proposed by Ye [34] cannot measure the negative correlation between two SVNSs. To show the implementation process of the formula, an example about the calculation of the correlation coefficient between two SVNSs is given below:
Example 2.1. Let P={(0.4,0.3,0.1),(0.5,0.3,0.2),(0.4,0.3,0.0)} and S={(0.1,0.3,0.4),(0.2,0.3,0.5),(0.0,0.3,0.4)} be two SVNSs defined in the set U={u1,u2,u3}. If the formula (3) proposed by Ye [34] is used to measure the correlation coefficient between P and S, then we can obtain the result K(P,S)=0.6180.
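The value in Example 2.1 can be checked with a short sketch. It assumes that formula (3) is the ratio of the componentwise inner product of P and S to the geometric mean of their sums of squares; this assumption reproduces the reported K(P,S)=0.6180. The function name `ye_correlation` is illustrative.

```python
from math import sqrt


def ye_correlation(P, S):
    """Cosine-type correlation coefficient K(P, S) between two SVNSs.

    P and S are equal-length lists of (mu, eta, nu) triplets, i.e. (TV, IV, FV).
    """
    if len(P) != len(S) or not P:
        raise ValueError("P and S must be non-empty and of equal length")
    # Inner product over all elements and all three components.
    cross = sum(a * b for p, s in zip(P, S) for a, b in zip(p, s))
    # Sums of squares of P and of S.
    t_p = sum(a * a for p in P for a in p)
    t_s = sum(b * b for s in S for b in s)
    return cross / sqrt(t_p * t_s)


P = [(0.4, 0.3, 0.1), (0.5, 0.3, 0.2), (0.4, 0.3, 0.0)]
S = [(0.1, 0.3, 0.4), (0.2, 0.3, 0.5), (0.0, 0.3, 0.4)]
print(f"{ye_correlation(P, S):.4f}")  # 0.6180
```

Here the inner product is 0.55 and both sums of squares equal 0.89. Since every term is non-negative, a ratio of this form cannot fall below zero, even though S is P with its TV and FV swapped.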
Definition 2.4 [35]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U} be any two SVNSs on the universe of discourse U={u1,u2,...,un}, then the correlation coefficient between the SVNSs P and S proposed by Ye [35] was defined as the following form:
where
Ye [35] proved that the formula (4) also meets the following properties:
(1) M(P,S)=M(S,P);
(2) M(P,S)=1 if P=S;
(3) 0⩽M(P,S)⩽1.
By observing the formula (4), it can be easily seen that the formula (4) proposed by Ye [35] also cannot measure the negative correlation between two SVNSs.
Example 2.2. Let P={(0.4,0.3,0.1),(0.5,0.3,0.2),(0.4,0.3,0.0)} and S={(0.1,0.3,0.4),(0.2,0.3,0.5),(0.0,0.3,0.4)} be two SVNSs defined in the set U={u1,u2,u3}. If the formula (4) proposed by Ye [35] is used to measure the correlation coefficient between P and S, then we can obtain the result M(P,S)=0.7504.
In summary, these two correlation coefficient formulas between SVNSs proposed by Ye [34] and Ye [35] cannot measure the negative correlation between two SVNSs. To solve this problem, we will propose a novel formula of correlation coefficient between two SVNSs, the value of which falls into the range of −1 to 1.
3. Statistical correlation coefficients between two SVNSs
In this section, the statistical theory is used to develop some novel correlation coefficients for SVNSs, and the proposed statistical correlation coefficients are compared with the existing correlation coefficients to show the usefulness of the proposed statistical correlation coefficients. Before proposing the novel formulas of correlation coefficient between two SVNSs, it is necessary to develop some basic concepts based on the statistical knowledge for SVNSs.
First, the variance of an SVNS P is developed by introducing the variance concept from statistics.
Definition 3.1. Let P={(μPi,ηPi,ϑPi)|ui∈U} be an SVNS on the universe of discourse U={u1,u2,...,un}, then the variance of P is statistically defined as:
$$D(P)=\frac{1}{n}\sum_{i=1}^{n}d_i^{2}(P),$$
where the term di(P) is calculated as:
with
Next, the covariance formula between two SVNSs is developed based on the conventional statistical viewpoint of covariance of two sets.
Definition 3.2. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μSi,ηSi,ϑSi)|ui∈U} be any two SVNSs on the universe of discourse U={u1,u2,...,un}, then the covariance of SVNSs P and S is statistically defined as:
$$\mathrm{COV}(P,S)=\frac{1}{n}\sum_{i=1}^{n}d_i(P)\,d_i(S),$$
where the terms di(P) and di(S) are calculated as:
Now, the novel formulas of correlation coefficient between SVNSs are developed based on the Pearson correlation coefficient in statistics.
Definition 3.3. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μSi,ηSi,ϑSi)|ui∈U} be any two SVNSs on the universe of discourse U={u1,u2,...,un}, then the statistical correlation coefficient between SVNSs P and S is defined as:
$$\rho(P,S)=\frac{\mathrm{COV}(P,S)}{[D(P)D(S)]^{1/2}}. \tag{7}$$
Remark 3.1. The value of the proposed statistical correlation coefficient formula falls into the range [−1,1], while the value range of the existing correlation coefficient formulas is [0,1]. The correlation coefficient is used to measure the linear relationship between any two variables or subsets. Both positive and negative correlation have practical significance, and it is unreasonable to discard either of them. Thus, the proposed statistical correlation coefficient formula obtains more reasonable values.
Properties 3.1. Given SVNSs P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U}, then we have
(1) ρ(P,S)=ρ(S,P);
(2) −1⩽ρ(P,S)⩽1.
Proof. (1) Straightforward.
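(2) According to the Cauchy-Schwarz inequality $\left(\sum_{k=1}^{m}u_kv_k\right)^{2}\leqslant\left(\sum_{k=1}^{m}u_k^{2}\right)\left(\sum_{k=1}^{m}v_k^{2}\right)$ with $u=(u_1,u_2,\ldots,u_m)$ and $v=(v_1,v_2,\ldots,v_m)$, we have
$$(\mathrm{COV}(P,S))^{2}=\frac{1}{n^{2}}\left\{\sum_{i=1}^{n}d_i(P)\,d_i(S)\right\}^{2}\leqslant\frac{1}{n}\sum_{i=1}^{n}d_i^{2}(P)\times\frac{1}{n}\sum_{i=1}^{n}d_i^{2}(S)=D(P)D(S).$$
Then, we can obtain that:
$$|\mathrm{COV}(P,S)|\leqslant[D(P)D(S)]^{1/2}.$$
Finally, we have
$$-1\leqslant\rho(P,S)=\frac{\mathrm{COV}(P,S)}{[D(P)D(S)]^{1/2}}\leqslant 1.$$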
Example 3.1. Let P={(0.4,0.3,0.1),(0.5,0.3,0.2),(0.4,0.3,0.0)} and S={(0.1,0.3,0.4),(0.2,0.3,0.5),(0.0,0.3,0.4)} be two SVNSs. Using formula (7) to calculate the correlation coefficient between P and S, we obtain ρ(P,S)=−1.
By comparing the results of Example 2.1, Example 2.2 and Example 3.1, it can be found that the result of the proposed statistical correlation coefficient is more reasonable, and the proposed statistical correlation coefficient formula can obtain negative correlation between two SVNSs.
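The Pearson-type structure of the proposed coefficient can be sketched as follows. The sketch takes the deviation terms di(P) and di(S) as precomputed inputs, because their defining expression is not reproduced here. It uses only the relations that appear in the proof of Properties 3.1: the variance is the mean of the squared deviations, the covariance is the mean of their products, and ρ is the covariance divided by the square root of the product of the variances. The function names and the sample deviation values are hypothetical.

```python
from math import sqrt


def variance(d):
    """Mean of squared deviation terms, D = (1/n) * sum(d_i^2)."""
    return sum(x * x for x in d) / len(d)


def covariance(d_p, d_s):
    """Mean of products of deviation terms, COV = (1/n) * sum(d_i(P) * d_i(S))."""
    return sum(x * y for x, y in zip(d_p, d_s)) / len(d_p)


def rho(d_p, d_s):
    """Pearson-type correlation coefficient computed from deviation terms."""
    if len(d_p) != len(d_s) or not d_p:
        raise ValueError("deviation sequences must be non-empty and of equal length")
    denom = sqrt(variance(d_p) * variance(d_s))
    if denom == 0.0:
        raise ValueError("rho is undefined when one of the variances is zero")
    return covariance(d_p, d_s) / denom


# Hypothetical deviation values, for illustration only (not taken from the paper).
d_p = [0.1, -0.3, 0.2]
d_s = [-0.2, 0.6, -0.4]  # d_s = -2 * d_p: a perfectly negative linear relation
print(round(rho(d_p, d_s), 6))
```

Unlike a ratio of non-negative terms, this form changes sign with the deviations, which is what allows negative correlation to be detected. The zero-variance case is rejected explicitly because the quotient is undefined there.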
In some practical problems, the elements in the set U={u1,u2,...,un} have different importance. For example, the criteria in multicriteria decision-making problems usually have different importance. To apply the proposed formula (7) to such problems, it is extended by considering the importance of the elements, as defined below.
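Definition 3.4. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μSi,ηSi,ϑSi)|ui∈U} be any two SVNSs on the universe of discourse U={u1,u2,...,un}, then the weighted statistical correlation coefficient value between SVNSs P and S is calculated as:
$$\rho_{\omega}(P,S)=\frac{\mathrm{COV}_{\omega}(P,S)}{[D_{\omega}(P)D_{\omega}(S)]^{1/2}}, \tag{8}$$
where the term ωi denotes the importance of the element ui in U, and
$$\mathrm{COV}_{\omega}(P,S)=\frac{1}{n}\sum_{i=1}^{n}\omega_i\,d_i(P)\,d_i(S),\quad D_{\omega}(P)=\frac{1}{n}\sum_{i=1}^{n}\omega_i\,d_i^{2}(P),\quad D_{\omega}(S)=\frac{1}{n}\sum_{i=1}^{n}\omega_i\,d_i^{2}(S).$$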
Properties 3.2. Given SVNSs P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U}, then we have
(1) ρω(P,S)=ρω(S,P);
(2) −1⩽ρω(P,S)⩽1.
Proof. (1) Straightforward.
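(2) According to the Cauchy-Schwarz inequality and the proposed formula (8), with weights ωi⩾0, the following inequality can be obtained:
$$(\mathrm{COV}_{\omega}(P,S))^{2}=\frac{1}{n^{2}}\left\{\sum_{i=1}^{n}\sqrt{\omega_i}\,d_i(P)\cdot\sqrt{\omega_i}\,d_i(S)\right\}^{2}\leqslant\frac{1}{n}\sum_{i=1}^{n}\omega_i\,d_i^{2}(P)\times\frac{1}{n}\sum_{i=1}^{n}\omega_i\,d_i^{2}(S)=D_{\omega}(P)D_{\omega}(S).$$
Then, we can obtain that:
$$|\mathrm{COV}_{\omega}(P,S)|\leqslant[D_{\omega}(P)D_{\omega}(S)]^{1/2}.$$
Hence, we have
$$-1\leqslant\rho_{\omega}(P,S)=\frac{\mathrm{COV}_{\omega}(P,S)}{[D_{\omega}(P)D_{\omega}(S)]^{1/2}}\leqslant 1.$$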
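The weighted coefficient can be sketched in the same way. The sketch follows the expressions given with formula (8), taking the weighted covariance as (1/n)Σωi·di(P)·di(S) and the weighted variances as (1/n)Σωi·di²(·). The deviation terms are again supplied as inputs, and the function name, the non-negativity check on the weights and the sample values are assumptions made for illustration.

```python
from math import sqrt


def weighted_rho(d_p, d_s, weights):
    """Weighted Pearson-type correlation coefficient from deviation terms."""
    if not (len(d_p) == len(d_s) == len(weights)) or not d_p:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(d_p)
    cov = sum(w * x * y for w, x, y in zip(weights, d_p, d_s)) / n
    var_p = sum(w * x * x for w, x in zip(weights, d_p)) / n
    var_s = sum(w * y * y for w, y in zip(weights, d_s)) / n
    denom = sqrt(var_p * var_s)
    if denom == 0.0:
        raise ValueError("weighted rho is undefined when a weighted variance is zero")
    return cov / denom


# Hypothetical deviation values and weights, for illustration only.
d_p = [0.1, -0.3, 0.2]
d_s = [0.3, 0.1, -0.2]
r = weighted_rho(d_p, d_s, [0.5, 0.3, 0.2])
print(-1.0 <= r <= 1.0)
```

Two properties follow directly from the quotient form. With equal weights the result coincides with the unweighted coefficient, and multiplying all weights by a positive constant leaves the result unchanged, so the weights need not be normalized.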
In the next section, we will apply the proposed statistical correlation coefficients in the form of formulas (7) and (8) to solve the problems of pattern recognition and medical diagnosis.
4. Applications of the proposed statistical correlation coefficients
To show the usefulness and superiority of the proposed statistical correlation coefficients in terms of the formulas (7) and (8), the proposed statistical correlation coefficient formulas are applied to solve two practical problems, which are pattern recognition and medical diagnosis, respectively.
4.1. Application of the statistical correlation coefficient in pattern recognition
In the practical pattern recognition problem, an unknown pattern is classified into some known patterns with the help of various information measures such as the distance measure, divergence measure, similarity measure, correlation coefficient, etc. Here, the proposed statistical correlation coefficient formula is used for solving the pattern recognition problem. To verify the usefulness and superiority of the proposed statistical correlation coefficient formula, the proposed statistical correlation coefficient is compared with the existing information measures.
First, a classical pattern recognition problem under the SVNS environment is formulated as follows:
Problem formulation: Suppose that {P1,P2,...,Pn} are a series of known patterns that are characterized by the following SVNS Pj={(μPij,ηPij,ϑPij)|ui∈U}. Let R={(μRi,ηRi,ϑRi)|ui∈U} be an unknown pattern, then this pattern recognition problem is how to classify the unknown pattern R to one of the known patterns Pj(j=1,2,...,n).
There exist various information measures that can be used to solve this pattern recognition problem. In this section, we use three kinds of information measures to solve the pattern recognition problem, which are correlation coefficient, distance measure and similarity measure, respectively. The methods based on these information measures are given as follows:
(1) Correlation coefficient method
Let C(Pj,R) be the correlation coefficient value between SVNS R and Pj(j=1,2,...,n), then R is considered to be in the same pattern of Pj∗, where j∗=argmax{C(Pj,R)}, j=1,2,...,n.
(2) Distance measure method
Let d(Pj,R) be the distance value between SVNS R and Pj(j=1,2,...,n), then R is considered to be in the same pattern of Pj∗, where j∗=argmin{d(Pj,R)}, j=1,2,...,n.
(3) Similarity measure method
Let S(Pj,R) be the similarity value between SVNS R and Pj(j=1,2,...,n), then R is considered to be in the same pattern of Pj∗, where j∗=argmax{S(Pj,R)}, j=1,2,...,n.
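The three decision rules above share one structure: score every known pattern against R, then take the argmax for a correlation or similarity measure and the argmin for a distance. A minimal sketch under that reading is given below. The function `classify`, the flag `larger_is_closer`, the stand-in `toy_distance` (an unnormalized sum of absolute differences) and the sample patterns are all hypothetical and do not reproduce the paper's measures or data.

```python
from typing import Callable, List, Sequence, Tuple

Triplet = Tuple[float, float, float]  # (TV, IV, FV)
SVNS = Sequence[Triplet]


def classify(
    patterns: Sequence[SVNS],
    unknown: SVNS,
    measure: Callable[[SVNS, SVNS], float],
    larger_is_closer: bool,
) -> Tuple[int, List[float]]:
    """Return the 0-based index j* of the best-matching pattern and all scores.

    Use larger_is_closer=True for correlation or similarity (argmax)
    and larger_is_closer=False for a distance (argmin).
    """
    scores = [measure(p, unknown) for p in patterns]
    choose = max if larger_is_closer else min
    best = choose(range(len(scores)), key=lambda j: scores[j])
    return best, scores


def toy_distance(a: SVNS, b: SVNS) -> float:
    """Stand-in distance: sum of absolute componentwise differences."""
    return sum(abs(x - y) for s, t in zip(a, b) for x, y in zip(s, t))


# Hypothetical patterns, for illustration only.
known = [
    [(0.9, 0.1, 0.1), (0.8, 0.2, 0.1)],
    [(0.2, 0.5, 0.7), (0.1, 0.6, 0.8)],
]
unknown = [(0.8, 0.2, 0.1), (0.7, 0.2, 0.2)]
best, scores = classify(known, unknown, toy_distance, larger_is_closer=False)
print(best)
```

Keeping the measure as a parameter lets the same routine be reused for each information measure under comparison, with only the direction flag changing.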
To compare the proposed statistical correlation coefficient with the existing information measures in the pattern recognition problem, we introduce some information measures for SVNSs as follows:
Definition 4.1 [39]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U} be two SVNSs on the universe of discourse U={u1,u2,...,un}, then the Hamming distance between SVNSs P and S is defined as:
Definition 4.2 [40]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U} be two SVNSs on the universe of discourse U={u1,u2,...,un}, then the Euclidean distance between SVNSs P and S is defined as:
Definition 4.3 [41]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U} be two SVNSs on the universe of discourse U={u1,u2,...,un}, then the Dice similarity between SVNSs P and S is defined as:
Definition 4.4 [41]. Let P={(μPi,ηPi,ϑPi)|ui∈U} and S={(μsi,ηsi,ϑsi)|ui∈U} be two SVNSs on the universe of discourse U={u1,u2,...,un}, then the cosine similarity between SVNSs P and S is defined as:
Here, we give a specific example about solving the pattern recognition problem as follows:
Example 4.1. There exist three known patterns P1, P2 and P3 that are modelled as the following SVNSs:
Let R be an unknown pattern that is modelled as the following SVNS:
We use the formulas (3), (4), (7), (9), (10), (11) and (12) to calculate the relationship value between the unknown pattern and each known pattern, and then we can classify the unknown pattern R into one of the known patterns Pj(j=1,2,3). The results of these information measures are given in Table 1.
In Table 1, it can be easily found that all these information measures including our proposed statistical correlation coefficient achieve the same results and then classify the unknown pattern R to P1. It shows that the classification result that is obtained by our proposed statistical correlation coefficient is consistent with those that were obtained by the existing information measures.
Next, we consider another example about the pattern recognition problem to show the efficiency of our proposed statistical correlation coefficient in the counter-intuitive situation.
Example 4.2. There exist three known patterns P1, P2 and P3 that are modeled as the following SVNSs:
Let R be an unknown pattern given in terms of an SVNS as
We use the formulas (3), (4), (7), (9), (10), (11), (12) to calculate the relationship value between the unknown pattern and each known pattern, and then we can classify the unknown pattern R into one of the known patterns Pj(j=1,2,3). The results of these information measures are given in Table 2.
From Table 2, we can see that the classification result obtained by formula (4) differs from the results obtained by the other measures, including formula (7). Formulas (3), (7), (9), (10), (11) and (12) accurately classify the unknown pattern R to P1. By observing the data in Example 4.2 and the classification results in Table 2, it is easy to see that formula (4) makes an erroneous judgement. Formula (7), the proposed statistical correlation coefficient, not only can calculate the negative correlation value of two patterns but also yields a more robust classification result.
4.2. Application of the statistical correlation coefficient in medical diagnosis
Medical diagnosis is a process of finding the diseased part, judging the degree of the disease and then determining the disease according to the symptoms of the patient when their body is in an abnormal state.
This section attempts to validate the rationality and the feasibility of the proposed statistical correlation coefficient during the diagnosis process of patients with COVID-19 by Chinese medicine. We have collected diagnosis and treatment data of suspected patients with COVID-19 from the relevant medical departments. In the following part, we first give the problem background. Next, we will use various information measures to diagnose the suspected cases, and also compare these information measures to show the effectiveness and practicability of the proposed statistical correlation coefficient.
4.2.1. Problem background
Traditional Chinese Medicine (TCM) is a complete medical system that has been used to diagnose, treat and prevent illnesses for more than 2000 years. Today, TCM is an indispensable part of Chinese culture and has made great contributions to the prosperity of China. Both TCM and western medicine are used to treat people all around the world. With its unique diagnostic methods, long history and remarkable effects, TCM has been used to treat cancer and other serious diseases. Compared with western medicine, TCM has several advantages, one of the most significant being fewer side effects. According to survey data, TCM is used in 75% of the areas in China and has been very effective in treating diseases such as diabetes, liver cancer, tumors and bone fractures. TCM treatment has achieved great success in many areas. For an acute abdomen, for example, an operation is not needed: the patient drinks a cup of Chinese herbs, whereas the western approach takes more time and money and carries the risk of infection after the operation.
In particular, TCM has achieved gratifying results in the diagnosis and treatment of COVID-19, which has drawn more and more attention to TCM. Since ancient times, China has had theories of yin-yang philosophy. TCM is based on a belief in yin and yang, which are defined as opposing energies, such as earth and heaven, winter and summer, and happiness and sadness. These opposites can be described by the truth-membership and falsity-membership values. In terms of diagnosis, TCM aims to grasp the human body as a whole, and the information obtained through the four diagnostic methods, such as looking and smelling, characterizes the state of the body as a whole. The diagnosis process of TCM therefore produces uncertain and vague information, which we describe with the indeterminacy-membership value. Thus, the concepts of single-valued neutrosophic sets can be fully introduced into TCM diagnosis and treatment.
As the symptoms of patients with COVID-19 are similar to those of traditional pneumonia and viral influenza, a large number of suspected cases makes diagnosis extremely difficult for doctors. A misjudgment delays treatment and may even endanger the patient's life. Similarity measures, divergence measures, correlation measures and distance measures in the single-valued neutrosophic set environment have played a key role in medical diagnosis problems. Here we use the proposed statistical correlation coefficient to deal with the medical diagnosis problem. The symptom values of each disease are assigned by one specialist doctor or by aggregating the opinions of several specialist doctors. The symptom values of a patient are assigned by the attending doctor based on his or her expertise. Then the correlation between the symptoms of the patient and the standardized symptoms of each disease is computed. The patient is diagnosed with the disease that has the highest correlation value.
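The diagnosis rule described above can be sketched as a lookup over disease profiles. Unlike a plain argmax, the sketch reports every disease whose score ties for the maximum, since a tie means that the measure cannot single out one diagnosis. The function `diagnose`, the tolerance `tol`, the stand-in `toy_measure` and the profiles labeled "A", "B" and "C" are hypothetical and do not reproduce the paper's data or formulas.

```python
def diagnose(patient, diseases, measure, tol=1e-9):
    """Return (candidates, scores) for one patient.

    candidates lists every disease whose score is within tol of the maximum,
    so more than one name signals an indeterminate diagnosis.
    """
    scores = {name: measure(patient, profile) for name, profile in diseases.items()}
    best = max(scores.values())
    candidates = [name for name, s in scores.items() if best - s <= tol]
    return candidates, scores


def toy_measure(a, b):
    """Stand-in score (higher is closer): negated sum of absolute differences."""
    return -sum(abs(x - y) for s, t in zip(a, b) for x, y in zip(s, t))


# Hypothetical single-symptom profiles as (TV, IV, FV) triplets, for illustration only.
diseases = {
    "A": [(0.8, 0.1, 0.1)],
    "B": [(0.6, 0.1, 0.1)],
    "C": [(0.1, 0.1, 0.9)],
}
patient = [(0.7, 0.1, 0.1)]
candidates, scores = diagnose(patient, diseases, toy_measure)
print(candidates)  # ['A', 'B']
```

The tolerance matters here: the two leading scores differ only by floating-point rounding, and a strict equality test would silently pick one of them.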
Now, we use our proposed statistical correlation coefficient in the medical diagnosis problem with the help of the following example.
4.2.2. Illustrative example
This application illustrates how the proposed statistical correlation coefficient can be used to address a problem of misdiagnosis in the medical diagnosis of COVID-19. The flowchart of proposed statistical correlation coefficient is shown in Figure 1.
Let us assume that there are four patients P={P1,P2,P3,P4} and five diseases D={D1,D2,D3,D4,D5}={Malaria, SARS, Viral influenza, COVID-19, Typhoid}, along with their symptoms S={S1,S2,S3,S4,S5}={Fever, Pharyngalgia, Dizziness, Rhinorrhea, Dry cough}.
Example 4.3. We have collected the patients' diagnosis and treatment data from the relevant departments and organized them into the following tables. Table 3 is the dataset of symptoms of various diseases and Table 4 is the dataset of symptoms of different patients. We assume that patient P1 is suffering from Typhoid, P2 and P3 are suffering from COVID-19 and P4 is suffering from Viral Influenza. Thus, the symptoms of the patients Pj(j=1,2,3,4) are deduced accordingly and shown in Table 4. We use the information measures for better diagnosis.
By using the formula (3), we calculate the correlation coefficient between the patient's symptoms and disease symptoms. The result values are given in Table 5.
As shown in Table 5, it can be found that the correlation coefficient value between the symptoms of P1 and the symptoms of Typhoid is highest. Thus, we can know that the patient P1 is suffering from Typhoid. Similarly, it is easy to know that P3 is suffering from COVID-19 and P4 is suffering from Viral Influenza. However, the correlation coefficient value between the symptoms of the patient P2 and the symptoms of COVID-19 is equal to the correlation coefficient value between the symptoms of the patient P2 and the symptoms of Typhoid. Thus, it is difficult to diagnose the disease condition of the patient P2.
By using the formula (4), we calculate the correlation coefficient between the patient's symptoms and disease symptoms. The result values are given in Table 6.
As shown in Table 6, it is not difficult to notice that P1 is suffering from Typhoid, while P2, P3 and P4 are suffering from COVID-19. This is not in line with the true state of affairs, in which P4 is suffering from Viral influenza.
By using the formula (7), we calculate the correlation coefficient between the patient's symptoms and disease symptoms. The result values are given in Table 7.
As shown in Table 7, it is not hard to notice that P1 is suffering from Typhoid, both P2 and P3 are suffering from COVID-19 and P4 is suffering from Viral influenza. All the calculated results conform with the actual status.
By using the formula (9), we calculate the distance values between the patient's symptoms and disease symptoms. The result values are given in Table 8.
As shown in Table 8, it is not difficult to notice that P1 is suffering from Typhoid, while P2, P3 and P4 are suffering from COVID-19. This is not in line with the true state of affairs, in which P4 is suffering from Viral influenza.
Similarly, the formula (10) is used to calculate the distance values between the patient's symptoms and disease symptoms. The result values are given in Table 9.
As shown in Table 9, it is easy to notice that the patient P1 is suffering from Typhoid, both P2 and P3 are suffering from COVID-19 and P4 is suffering from Viral influenza. All these results conform with the actual status.
By using the similarity formula (11), we calculate the similarity values between the patient's symptoms and disease symptoms. The result values are given in Table 10.
As shown in Table 10, it is easy to find that the patient P1 is suffering from Typhoid, both P2 and P3 are suffering from COVID-19 and P4 is suffering from Viral influenza. All these results conform with the actual status.
Similarly, the formula (12) is used to compute the similarity values between the patient's symptoms and disease symptoms. The result values are given in Table 11.
As shown in Table 11, it is not difficult to know that the patient P1 is suffering from Typhoid. P3 is suffering from COVID-19 and P4 is suffering from Viral Influenza. However, it is difficult to diagnose the disease condition of the patient P2 since the similarity value between the symptoms of the patient P2 and the symptoms of COVID-19 is equal to the similarity value between the symptoms of the patient P2 and the symptoms of Typhoid.
In order to compare and analyze the diagnosis results of different methods more clearly, all considered calculation results are summarized into Table 12.
Comparing the diagnosis results of the various methods with the actual situation, it is not difficult to find that formulas (4) and (9) produce misdiagnoses. Meanwhile, formulas (3) and (12) cannot determine the disease condition of the patient P2. Using the proposed formula (7), the distance formula (10) and the similarity formula (11), we can correctly diagnose each patient's disease condition.
5. Conclusions
In this paper, we propose a novel statistical correlation coefficient formula to measure the correlation between SVNSs and also develop a weighted form of this formula. The value of the statistical correlation coefficients falls into the interval [−1,1], which accords with the range of the statistical correlation coefficient between two variables or sets. In contrast, most of the existing correlation coefficient formulas of SVNSs only take values in the interval [0,1]. Some examples are given to demonstrate that the statistical correlation coefficient formula is more effective than the existing methods. In the practical application problems of pattern recognition and TCM medical diagnosis, the proposed statistical correlation coefficient formula showed clear advantages. In addition, in comparison with the existing correlation coefficients, distance measures and similarity measures, the proposed statistical correlation coefficient formula can properly determine the patient's disease and reduce misjudgments. The advantages of our study are summarized as follows:
(1) The value of our statistical correlation coefficients falls into the interval [−1,1]. The correlation coefficient formula can be used to calculate both the positive and negative correlation between single-valued neutrosophic sets.
(2) In the application example of pattern recognition, our formula for calculating the correlation coefficient was found to effectively classify the unknown pattern into the known pattern.
(3) In the application example of medical diagnosis, the comparison with the distance measure methods and similarity measure methods shows that the proposed correlation coefficient formula can accurately judge the patient's disease condition.
In future research, we plan to extend our study to interval single-valued neutrosophic sets and to practical applications with more dimensions.
Acknowledgments
This work was supported in part by the Natural Science Foundation of Fujian Province, China under Grant No. 2022J01958.
Conflict of interest
The authors declare that they have no conflict of interest.