Consistency Fuzzy Sets and a Cosine Similarity Measure in Fuzzy Multiset Setting and Application to Medical Diagnosis

The main purpose of this study is to lay the foundation for a new fuzzy set concept, called the consistency fuzzy set (CFS), which expresses multidimensional uncertain data in a compact form. Our motive is to reduce the complexity and difficulty caused by the information contained in the truth sequence in a fuzzy multiset (FMS) and to present the data of the truth sequence in a more understandable and compact manner. Therefore, this paper introduces the concept of CFS, which is characterized by a truth function from a universal set to $[0, 1]^2$. The first component of the truth pair of a CFS is the average value of the truth sequence of a FMS, and the second component is the consistency degree, that is, the fuzzy complement of the standard deviation of the truth sequence of the same FMS. The main contribution of a CFS is that it reflects both the average level of data that may be expressed with different sequence lengths and, via the consistency degree, the degree of reasonable information in the data. To develop this new concept, this paper also presents a correlation coefficient and a cosine similarity measure between CFSs. Furthermore, the proposed correlation coefficient and cosine similarity measure are applied to a multiperiod medical diagnosis problem. Finally, the obtained results are compared with existing results in the literature to show the efficiency and rationality of the proposed correlation coefficient and cosine similarity measure.


Introduction
Fuzzy set theory was introduced by Zadeh [1] in 1965 through the concept of a membership (truth) function, which is an effective tool for handling uncertainty in science, and it has applications in many different fields such as economics, engineering, decision-making, management, and medicine [2][3][4]. There are many generalizations of the concept of the fuzzy set in the literature, and their applications to areas such as decision-making and medical diagnosis are studied to model the uncertain data that is often encountered in science. For example, Akram et al. [5] have proposed a new decision-making method in a complex spherical fuzzy environment, and Das et al. [6] have introduced a medical diagnosis model by using fuzzy logic and intuitionistic fuzzy logic. Moreover, a decision-making method for the selection of an effective sanitizer to reduce COVID-19, one of the most pressing problems of recent times, has been presented in [7]. One of the generalizations of fuzzy sets is the concept of hesitant fuzzy set (HFS) [8], which is characterized by a membership (truth) function whose value is a set of crisp values in [0, 1]. A HFS can model uncertain data better than a fuzzy set, thanks to its flexible structure, so it has been frequently preferred by researchers to solve multicriteria (group) decision-making or multiperiod medical diagnosis problems [9][10][11][12]. However, the concept of HFS ignores repeated information because of the nature of crisp sets. For example, suppose that a doctor evaluates a target patient's symptoms at four different times with membership degrees 0.1, 0.3, 0.7, and 0.3, respectively. If the result of this evaluation is expressed as a HFS, then the repeated assessment 0.3 is lost due to the structure of the HFS. In such a situation, the concept of fuzzy multiset (FMS) is useful for expressing the information that would otherwise be lost.
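The loss of the repeated assessment in the doctor example can be illustrated with a short Python sketch. Modeling the hesitant fuzzy element as a Python set and the fuzzy multiset sequence as a list is only an illustrative assumption; the names `hesitant_element` and `multiset_sequence` are hypothetical and do not come from the formal definitions.

```python
# Four assessments of one symptom at four different times (the example above).
assessments = [0.1, 0.3, 0.7, 0.3]

# Illustrative assumption: a hesitant fuzzy element modeled as a set of crisp values.
hesitant_element = set(assessments)

# Illustrative assumption: a fuzzy multiset sequence modeled as a list (repeats kept).
multiset_sequence = list(assessments)

print(sorted(hesitant_element))  # the repeated 0.3 appears only once
print(multiset_sequence)         # all four assessments are preserved
```

The set keeps three distinct values, whereas the list keeps all four assessments, including both occurrences of 0.3.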
The concept of FMS was proposed by Yager in 1986 [13,14] with the help of a count function. In a fuzzy multiset setting, the membership degrees of the elements of a universal set are presented as sequences that may have different lengths (cardinalities) and may contain the same or different fuzzy values. Therefore, more accurate results can be obtained by preventing the loss of repeated information. Moreover, this kind of fuzzy set is more appropriate for solving multicriteria group decision-making problems and multiperiod medical diagnosis problems. Although FMSs preserve repeated information, the uncertainty increases as the length of the sequences in the FMSs increases. This situation makes it difficult to express reasonable information and complicates the selection of an alternative in a decision-making problem. To make the information carried by a sequence in the FMS more understandable and to reduce the dependence of this information on the length of the sequence, statistical quantities such as the arithmetic mean and the standard deviation of the elements of the sequence can be used. Recently, Ye et al. [15] have used this idea in a neutrosophic environment. Motivated by this, we propose a new concept called the consistency fuzzy set (CFS). This concept is expressed as an ordered pair whose components are the average value and the consistency degree of the sequence, respectively. Later, we propose a correlation coefficient and a cosine similarity measure between CFSs.
Correlation analysis is an important research issue in fuzzy set theory and its generalizations because it measures the relationship between two fuzzy sets. Therefore, correlation coefficients have gained attention from researchers, and their applications in various fields have been widely considered. For instance, Ye [16] has proposed a weighted correlation coefficient between intuitionistic fuzzy sets. Moreover, Guan et al. [17] have put forward a synthetic correlation coefficient between HFSs. Recently, Lin et al. [18] have developed directional correlation coefficient measures for Pythagorean fuzzy information and have applied them to medical diagnosis and cluster analysis. Also, several researchers have proposed correlation coefficients in various fuzzy environments (see, e.g., [19,20]). The concept of similarity measure plays an important role in determining the degree of similarity between two fuzzy sets. There are several types of similarity measures in the literature (see, e.g., [21][22][23][24][25]). The cosine similarity measure is one of them, and it is defined as the inner product of two vectors divided by the product of their lengths, that is, the cosine of the angle between the vector representations of the fuzzy sets [26]. In this paper, we introduce a correlation coefficient and a cosine similarity measure between CFSs, and we give multiperiod medical diagnosis approaches based on the proposed correlation coefficient and cosine similarity measure to show the efficiency of these new concepts.
The important contributions of the paper are listed below:
(i) The concept of CFS reduces the dependence of information on the length of the sequence in a FMS and presents the information carried by the sequence in a more compact form.
(ii) A CFS that is based on the average values and the consistency degrees can give reasonable information about the sequences in a FMS.
(iii) A CFS contains both the average level of data that may be expressed with different sequence lengths and the degree of consistency of the data, via the fuzzy complement of the standard deviation of a sequence in a FMS.
(iv) A CFS facilitates the understanding of the problem, so the decision-making process works with compact information.
(v) The proposed correlation coefficient and cosine similarity measure between CFSs provide a useful ranking method, and they are beneficial mathematical tools for multiperiod medical diagnosis and multicriteria group decision-making problems in the FMS environment.
(vi) The developed medical diagnosis approach not only improves decision-making reliability but also supplies a new and effective way of handling multiperiod medical diagnosis problems in the FMS environment.
The remainder of this paper is set out as follows. In Section 2, we introduce the concept of CFS and we give a correlation coefficient between CFSs. Later, we apply it to a multiperiod medical diagnosis problem to demonstrate the efficiency of the proposed correlation coefficient. In Section 3, we propose a cosine similarity measure between CFSs. Then, we apply it to the same multiperiod medical diagnosis problem. Moreover, we compare the results of the proposed correlation coefficient and the proposed cosine similarity measure with each other and with the existing results in the literature. In Section 4, we give a conclusion with some remarks.

CFSs and a Correlation Coefficient between CFSs
In this section, we recall the concepts of FMS and a correlation coefficient between FMSs. Then, we introduce the concept of CFS and a correlation coefficient between CFSs. Next, we apply it to a multiperiod medical diagnosis problem.

The Concept of CFS
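Definition 1 (see [14]). Let $X = \{x_1, \ldots, x_m\}$ be a finite set. A FMS $A$ in $X$ is characterized by a count membership function that assigns to each element of $X$ a multiset of membership degrees drawn from $[0, 1]$. Therefore, a FMS $A$ is given by the sequence of membership degrees attached to each element $x_j$, where $n_j$ is the length of the sequence for the $j$th element.
Obviously, a FMS reduces to a fuzzy set when $n_j = 1$ for all $j = 1, \ldots, m$.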
Now, we define the concept of CFS, which reduces the dependence of information on the length of the sequence in a FMS and presents the information carried by the sequence in a more compact form.
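Definition 2. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ be a FMS in $X$. For each $j = 1, \ldots, m$, the average value and the consistency degree of the membership (truth) sequence of $x_j$ in $A$ are defined, respectively, as the arithmetic mean of the sequence and as the fuzzy complement of its standard deviation, that is, one minus the standard deviation. Moreover, the consistency fuzzy element (CFE) of $x_j$ in the CFS is the pair formed by its average value and its consistency degree. Then, we construct the CFS $C_A$ corresponding to the FMS $A$ by collecting these pairs for all $x_j \in X$.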
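The construction of a CFS from a FMS can be sketched in Python as follows. This is a minimal sketch under stated assumptions: a FMS is modeled as a dictionary from elements to truth sequences, and the standard deviation is taken in its population form (`statistics.pstdev`), because the text here does not preserve which form the definition uses. The names `consistency_fuzzy_element`, `to_cfs`, `fms_a`, and `cfs_a` are hypothetical.

```python
from statistics import mean, pstdev


def consistency_fuzzy_element(truth_sequence):
    """Return (average value, consistency degree) for one truth sequence.

    Assumption: population standard deviation; the source may use another form.
    """
    average = mean(truth_sequence)
    # Consistency degree: fuzzy complement (one minus) of the standard deviation.
    consistency = 1.0 - pstdev(truth_sequence)
    return average, consistency


def to_cfs(fms):
    """Map a FMS, given as {element: truth sequence}, to its CFS."""
    return {x: consistency_fuzzy_element(seq) for x, seq in fms.items()}


# Hypothetical FMS whose sequences have different lengths.
fms_a = {"x1": [0.1, 0.3, 0.7, 0.3], "x2": [0.5, 0.5], "x3": [0.9]}
cfs_a = to_cfs(fms_a)
for element, (average, consistency) in cfs_a.items():
    print(element, round(average, 4), round(consistency, 4))
```

Whatever the length of a sequence, the result is a single pair, so sequences of different lengths become directly comparable. A constant sequence, or a sequence of length one, has consistency degree 1 under the population form chosen here.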
By using CFSs, we make a statistical inference about the information carried by the truth sequences in a FMS, and we express the information presented in these sequences in a compact and understandable way. Thus, we simplify the decision-making process by reducing the complexity created by the length of the truth sequences in a FMS. We also eliminate the dependence of the information on the length of these truth sequences. Fuzzy set theory has often been preferred by researchers for solving real-life problems such as medical diagnosis and decision-making, since it can model uncertain information well. In solving these problems, after the uncertainty in the environment is modeled with fuzzy sets, the optimal choice is usually determined by using aggregation functions or information measures such as similarity measures, entropy measures, and divergence measures. The correlation coefficient is a crucial measure that determines the relationship between two fuzzy sets. Now, we recall a correlation coefficient for FMSs.
Definition 3 (see [27]). Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ and $B$ be two FMSs in $X$. A correlation coefficient $\rho_{FMS}(A, B)$ between $A$ and $B$ is defined in [27].
Proposition 1 (see [27]). The correlation coefficient $\rho_{FMS}$ satisfies the properties listed in [27].
Now, we propose a correlation coefficient between CFSs, motivated by the definition of the correlation coefficient between FMSs.

Mathematical Problems in Engineering
Definition 4. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ and $B$ be two FMSs in $X$. The correlation coefficient $\rho_{CFS}(A, B)$ between the CFSs $C_A$ and $C_B$ is defined in terms of the average values and the consistency degrees of $A$ and $B$.
Proposition 2. The correlation coefficient $\rho_{CFS}$ satisfies properties (P1)-(P3).
Proof. Property (P1) is obtained by using the Cauchy-Schwarz inequality. The proofs of (P2) and (P3) are straightforward. □
Now, we propose a weighted version of the correlation coefficient $\rho_{CFS}$ for CFSs as follows.
Definition 5. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ and $B$ be two FMSs in $X$. A weighted correlation coefficient $W\rho_{CFS}(A, B)$ between the CFSs $C_A$ and $C_B$ is obtained by weighting the contribution of each element $x_j$ with $w_j$, where $w = (w_1, \ldots, w_m)$ is the weight vector with $w_j \in [0, 1]$, for all $j = 1, \ldots, m$, such that $\sum_{j=1}^{m} w_j = 1$.
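A weighted correlation coefficient of this kind can be sketched in Python. The exact formula is not reproduced in this text, so the form below is an assumption: the weighted inner product of the (average, consistency) pairs divided by the product of the corresponding weighted norms. This form is bounded by 1 through the Cauchy-Schwarz inequality. The function name `weighted_correlation` and the sample values are hypothetical.

```python
from math import sqrt


def weighted_correlation(cfs_a, cfs_b, weights):
    """Assumed form: weighted inner product over the product of weighted norms.

    cfs_a, cfs_b: lists of (average, consistency) pairs, one pair per element.
    weights: non-negative weights summing to 1, one per element.
    Assumption: at least one pair with positive weight is non-zero in each CFS.
    """
    inner = sum(w * (a1 * a2 + c1 * c2)
                for w, (a1, c1), (a2, c2) in zip(weights, cfs_a, cfs_b))
    energy_a = sum(w * (a * a + c * c) for w, (a, c) in zip(weights, cfs_a))
    energy_b = sum(w * (a * a + c * c) for w, (a, c) in zip(weights, cfs_b))
    return inner / sqrt(energy_a * energy_b)


# Hypothetical CFSs over two elements with equal weights.
cfs_p = [(0.35, 0.78), (0.5, 1.0)]
cfs_q = [(0.4, 0.9), (0.2, 0.95)]
weights = [0.5, 0.5]
rho_pq = weighted_correlation(cfs_p, cfs_q, weights)
print(round(rho_pq, 4))
```

Under this assumed form, the value lies in [0, 1] for pairs with non-negative components, is symmetric in its two arguments, and equals 1 when the two CFSs coincide.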

An Application.
A multiperiod medical diagnosis is a decision-making process about the disease of a target patient. In this process, the decision maker evaluates the effect of the symptoms on the target patient at several different times. The most important factor that distinguishes this process from other medical diagnosis processes is that the solution algorithm takes the time variable into account [24]. Therefore, it can be convenient to present the patient's symptoms and the diseases with the help of sequences of fuzzy values. Now, we adopt an illustrative example from [27] to show the applicability and effectiveness of the proposed correlation coefficient under the FMS setting.
Moreover, assume that each disease $Q_i$, for $i = 1, 2, 3, 4$, is given as a FMS with respect to all of the symptoms. Now, we construct CFSs. Firstly, all patients in $P$ are expressed as the CFSs $P_1^*$, $P_2^*$, $P_3^*$, and $P_4^*$, and all diseases in $Q$ are expressed as the CFSs $Q_1^*$, $Q_2^*$, $Q_3^*$, and $Q_4^*$. Let the weight of each symptom be $\omega_j = 0.2$, for $j = 1, 2, 3, 4, 5$. Now, we apply the proposed weighted correlation coefficient $W\rho_{CFS}$ to determine the optimal disease for each patient. New results obtained in this study and some existing results in [27] are given in Table 1. For fixed $k \in \{1, 2, 3, 4\}$, each patient $P_k^*$ is assigned to the disease $Q_i^*$ with the largest value of $W\rho_{CFS}(P_k^*, Q_i^*)$. The numerical results in Table 1 show that the third and fourth patients suffer from throat disease and typhoid, respectively, according to both the correlation coefficient for FMSs [27] and the proposed correlation coefficient for CFSs. The rest of Table 1 differs between the two approaches. The novelty of the approach used in this study may cause this difference.
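The assignment step can be sketched in Python, assuming that each patient is assigned to the disease with the largest score. The scores below are hypothetical placeholders and are not the values of Table 1; the name `assign_diseases` is also hypothetical.

```python
def assign_diseases(scores):
    """For each patient, return the disease label with the largest score."""
    return {patient: max(row, key=row.get) for patient, row in scores.items()}


# Hypothetical scores for two patients; these are NOT the values of Table 1.
scores = {
    "P1": {"Q1": 0.90, "Q2": 0.85, "Q3": 0.97, "Q4": 0.88},
    "P2": {"Q1": 0.80, "Q2": 0.95, "Q3": 0.70, "Q4": 0.93},
}
diagnosis = assign_diseases(scores)
print(diagnosis)
```

Each row is scanned independently, so the diagnosis of one patient does not depend on the scores of another patient.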

A Cosine Similarity Measure.
The cosine similarity measure is defined as the inner product of two vectors divided by the product of their lengths. In other words, a cosine similarity measure is the cosine of the angle between the vector representations of the two fuzzy sets.

Now, motivated by [26], we introduce a cosine similarity measure and its weighted version for CFSs as follows.
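Definition 6. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ and $B$ be two FMSs in $X$. A cosine similarity measure $\delta_{CFS}(A, B)$ between the CFSs $C_A$ and $C_B$ is given by the average, over $j = 1, \ldots, m$, of the cosine of the angle between the CFEs of $x_j$ in $C_A$ and $C_B$, regarded as vectors in $[0, 1]^2$. If we take $m = 1$, the cosine similarity measure $\delta_{CFS}$ reduces to the correlation coefficient $\rho_{CFS}$, i.e., $\delta_{CFS}(A, B) = \rho_{CFS}(A, B)$.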

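Proposition 3. The cosine similarity measure $\delta_{CFS}$ satisfies properties (P1)-(P3).
Proof. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and, for each $j = 1, \ldots, m$, let $\theta_j$ be the radian measure of the angle between the vector representations of the CFEs of $x_j$ in $C_A$ and $C_B$. The properties follow from expressing $\delta_{CFS}$ in terms of $\cos \theta_j$.
Definition 7. Let $X = \{x_1, \ldots, x_m\}$ be a finite set and let $A$ and $B$ be two FMSs in $X$. A weighted cosine similarity measure $W\delta_{CFS}(A, B)$ between the CFSs $C_A$ and $C_B$ is given by the weighted sum of these cosines with weights $w_j$, where $w = (w_1, \ldots, w_m)$ is the weight vector with $w_j \in [0, 1]$, for all $j = 1, \ldots, m$, such that $\sum_{j=1}^{m} w_j = 1$.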
It is clear that if we take $w_j = 1/m$, for all $j = 1, \ldots, m$, then $\delta_{CFS}(A, B) = W\delta_{CFS}(A, B)$. Obviously, the proposed weighted cosine similarity measure $W\delta_{CFS}(A, B)$ also satisfies properties (P1)-(P3).
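The cosine similarity measure and its weighted version can be sketched in Python as a weighted sum of per-element cosines, which agrees with the statement that uniform weights $1/m$ recover the unweighted measure. This is a sketch under assumptions: a CFS is modeled as a list of (average, consistency) pairs, and no pair is the zero vector. The function name `cosine_similarity` and the sample values are hypothetical.

```python
from math import hypot


def cosine_similarity(cfs_a, cfs_b, weights=None):
    """Weighted sum of the cosines between corresponding (average, consistency) pairs.

    With weights=None, uniform weights 1/m are used (the unweighted measure).
    Assumption: no pair equals (0, 0), so every vector length is positive.
    """
    m = len(cfs_a)
    if weights is None:
        weights = [1.0 / m] * m
    total = 0.0
    for w, (a1, c1), (a2, c2) in zip(weights, cfs_a, cfs_b):
        # Inner product divided by the product of the vector lengths.
        total += w * (a1 * a2 + c1 * c2) / (hypot(a1, c1) * hypot(a2, c2))
    return total


# Hypothetical CFSs over two elements.
cfs_p = [(0.35, 0.78), (0.5, 1.0)]
cfs_q = [(0.4, 0.9), (0.2, 0.95)]
delta_pq = cosine_similarity(cfs_p, cfs_q)
print(round(delta_pq, 4))
```

Because each term depends only on the angle between two pairs, rescaling a pair (for example, from (0.3, 0.4) to (0.6, 0.8)) does not change the result.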

An Application.
Now, we examine the same multiperiod medical diagnosis problem, adapted from [27], to illustrate the applicability and effectiveness of the proposed cosine similarity measure for CFSs under the FMS setting. For this aim, we use the CFSs for all of the patients and all of the diseases in Example 2.
Example 3. Let the weight of each symptom be $\omega_j = 0.2$, for each $j = 1, 2, 3, 4, 5$. Now, we apply the proposed weighted cosine similarity measure $W\delta_{CFS}$ to determine the optimal disease for all patients. New results obtained in this study and some existing results in [27] are given in Table 2. For fixed $k \in \{1, 2, 3, 4\}$, each patient $P_k^*$ is assigned to the disease $Q_i^*$ with the largest value of $W\delta_{CFS}(P_k^*, Q_i^*)$. The results in Table 2 show that the second, third, and fourth patients suffer from tuberculosis, throat disease, and typhoid, respectively, according to both the correlation coefficient in [27] and the proposed cosine similarity measure in this study. The rest of Table 2 differs between the two approaches. The novelty of the approach used in this study may cause this difference. The results in Table 3 show that the first and fourth patients suffer from typhoid, whereas the third patient suffers from throat disease, according to both the proposed correlation coefficient and the proposed cosine similarity measure.

Comparison Analysis of the Proposed Two Approaches.
In this section, we first compare the results of the proposed correlation coefficient $\rho_{CFS}$ with the results of the proposed cosine similarity measure $\delta_{CFS}$ by using the standard deviation of the obtained results. Then, we explain the advantages of the two approaches. The numerical results in Table 4 show that the best selections of these two approaches are consistent with each other for patients $P_1^*$, $P_3^*$, and $P_4^*$. Moreover, a larger standard deviation of the calculated values indicates a stronger ability to discriminate between the diseases, because the values differ more from each other, while a smaller standard deviation indicates a weaker ability. Therefore, we look at the standard deviations for patients $P_1^*$, $P_3^*$, and $P_4^*$ in both approaches. The standard deviation of the results of $\rho_{CFS}$ is greater than the standard deviation of the results of $\delta_{CFS}$, except for patient $P_2^*$. In this case, $\rho_{CFS}$ has a higher ability than $\delta_{CFS}$ to determine the disease of patients $P_1^*$, $P_3^*$, and $P_4^*$ under the FMS setting. The (weighted) correlation coefficient and the (weighted) cosine similarity measure given in this paper provide a useful ranking method, and they are beneficial mathematical tools for multiperiod medical diagnosis in the FMS environment because the new concepts simplify the decision-making process. Therefore, the developed medical diagnosis approaches not only improve decision-making reliability but also supply a new and effective way of handling multiperiod medical diagnosis problems in the FMS environment. Figure 1 shows the comparison of the results of the present paper and the results of [27].
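The comparison by standard deviation can be sketched in Python. The rows below are hypothetical placeholders and are not the values of Table 4, and the population form of the standard deviation (`statistics.pstdev`) is an assumption, since the text does not state which form is used.

```python
from statistics import pstdev

# Hypothetical scores of one patient against four diseases under each measure.
# These are NOT the values of Table 4.
rho_row = [0.90, 0.85, 0.97, 0.88]    # assumed correlation coefficient results
delta_row = [0.95, 0.94, 0.98, 0.95]  # assumed cosine similarity results

# Assumption: population standard deviation of each row of results.
rho_spread = pstdev(rho_row)
delta_spread = pstdev(delta_row)

# The measure with the larger spread separates the candidate diseases more clearly.
more_discriminating = "rho" if rho_spread > delta_spread else "delta"
print(round(rho_spread, 4), round(delta_spread, 4), more_discriminating)
```

In this made-up row, the correlation scores are more spread out than the cosine scores, so the best disease stands out more clearly under the correlation coefficient.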

Conclusion
In this paper, we introduce a new fuzzy set called the consistency fuzzy set (CFS). This new concept differs from other existing multivalued fuzzy sets in that it uses not only the information from a fuzzy multiset (FMS) but also the information provided by both the consistency degree and the average of the truth sequences in FMSs. Therefore, CFSs contain more useful information than other multivalued fuzzy sets because they use two statistical measures. The aim of this new fuzzy set is to obtain more reasonable results by facilitating the decision-making process and to offer more understandable methods. Since other methods cannot take the consistency degree and the average into account, their results may be unreasonable in the decision-making process. Moreover, we also propose a correlation coefficient and a cosine similarity measure between CFSs, taking advantage of CFSs to solve a multiperiod medical diagnosis problem. Then, we compare them with some existing methods to show the usefulness of CFSs. These proposed approaches can give more detailed information and more valuable results to decision makers than the existing ones. In the future, we will focus on extending the theory to q-rung fuzzy information, or we shall develop new aggregation operators and some information measure algorithms in the FMS setting.

Data Availability
No data were used to support the findings of the study.