Improved Approach Based on Fuzzy Rough Set and Sine-Cosine Algorithm: A Case Study on Prediction of Osteoporosis

In recent years, osteoporosis prediction has received increasing attention from healthcare experts and the public. Osteoporosis is a silent disease that causes many fractures and complications that impair quality of life; predicting it is therefore important for reducing the risk of fractures. However, many irrelevant descriptors can influence the prediction, so computational methods are needed. In this article, we present a new method to predict osteoporosis, which starts by pre-processing the data to avoid an imbalanced dataset. Then, the sine-cosine algorithm, based on the information gain fuzzy-rough set, is applied to select the most discriminative descriptors. Finally, classifiers are used to predict the deficiency of osteoporosis samples based on the selected descriptors. To evaluate the efficacy of the proposed approach, two experiments were performed using benchmark datasets and real osteoporosis data. The results show that the proposed approach achieved competitive results compared to the other methods in selecting the most appropriate descriptors for predicting osteoporosis. The selected descriptors show a high correlation with osteoporosis.


I. INTRODUCTION
Bones have a complicated internal and external structure, come in a variety of shapes and sizes, are lightweight, hard, and strong, and serve multiple functions. An imbalance between bone resorption and formation can cause various bone diseases, including arthritis, fractures, infections, tumors, and osteoporosis [1]. Osteoporosis (OP) is a systemic skeletal disorder found in adults, characterized by low bone mass and the microarchitectural deterioration of bones. It features reduced bone strength and a heightened risk of fractures, particularly of the wrist, hip, spine, humerus, and pelvis [2]. Brouns and Vermeer [3] reported that a significant number of patients die from complications during the first year of recovery from the fracture; half of those who survive will never be able to move around without the assistance of walking aids and wheelchairs. A 10% loss of bone mass in the vertebrae can double the risk of vertebral fractures, and a 10% loss of bone mass in the hip can result in a 2.5 times greater risk of hip fracture [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Xiao-Sheng Si.
Numerous factors cause bone loss, including age (a major reason for osteoporosis), smoking, alcohol excess, calcium and vitamin D shortage, muscle mass, low weight, anticonvulsants, and corticosteroids. Bone loss can also be a result of comorbid conditions, for instance, rheumatoid arthritis [5], surgical removal of the ovaries, kidney disease, hyperthyroidism, and anorexia [2]. Worldwide, the lifetime risk of osteoporotic fractures is 30%-50% in women and 15%-30% in men [6]. In 2000, nine million new osteoporotic fractures were estimated, with more than 1.5 million of those being hip fractures, more than 1.6 million being forearm fractures, and more than 1.3 million being clinical vertebral fractures. The Americas and Europe accounted for 50% of all these fractures, whilst the rest mostly occurred in Southeast Asia and the Western Pacific region [7]. A 2018 study by Cheung et al. [8] projected that the cost of hip fractures would grow from 9.50 billion USD in 2018 to 15.00 billion USD in 2050. They indicated that a 2%-3% annual decrease in the incidence rate of hip fracture is needed to keep the number of hip fractures constant over time.
Because health is our most basic human right, many computer science researchers have used computational methods and expert systems to predict osteoporosis at an early stage. In this context, the authors of [9] applied ensemble artificial neural networks (ANN) and a genetic algorithm to predict hip bone fracture risk; they used a general dataset (i.e., questionnaire surveys) to classify their samples. Lim et al. [10] utilized a deep neural network model for predicting osteoporosis using statistical medical records of patients. Yu et al. [11] proposed a diagnosis system for osteoporosis based on X-ray images; they applied an ANN to perform this task. In the same vein, Zhou et al. [12] developed a classification model using a convolutional neural network (CNN) to diagnose osteoporosis using photoacoustic signals. Aouache et al. [13] segmented the cervical vertebra shape boundary and extracted the features to classify osteoporosis.
Despite the attempts of these studies to predict osteoporosis and the risk of hip bone fracture, the most important aspect of osteoporosis prediction remains a challenge: these studies used statistics, medical records, and X-ray image datasets instead of laboratory samples, and they did not apply feature selection methods to identify the most relevant features and minerals that play a role in osteoporosis. Therefore, in this study, we applied a new feature selection method to determine the most important features and remove irrelevant data.
Feature selection (FS) is an appropriate step for selecting important features (descriptors) from the data. This step plays a significant role in reducing computation time and cost as well as in improving predictive accuracy [14]. FS methods decrease the number of features while preserving the relevant information in the dataset. Many techniques are used for this purpose and have been applied in several applications, such as in [15]-[18].
In general, the feature selection methods are classified into two techniques: wrapper and filter methods. The wrapper methods depend on the classification algorithm to evaluate the selected features subset; however, the accuracy of these methods is influenced by the attributes of the used classifier; therefore, if the parameters of the classifier are not accurately chosen, this may lead to a degradation of the accuracy of the selected features [19], [20]. Unlike the wrapper methods, the filter methods do not depend on the classification algorithm to evaluate the selected features and they are also less expensive than the wrapper methods.
Several filter measures have been applied to feature selection, for example, information, distance, and correlation measures, which are less expensive. Meanwhile, rough set and fuzzy set measures are more reliable than other filter measures.
Rough and fuzzy sets are used to deal with uncertainty, which is considered one of the main problems in computational processing [21]. The hybrid fuzzy-rough set (FR) can work directly with numerical or continuous data [22] and is utilized to solve feature selection problems [23], [24]. The hybrid was also applied to reduce the attributes of categorical data, as in [22], [25].
This motivated us to use the FR set to determine the relevant features from the original features; however, the traditional FR methods suffer from some limitations, such as taking a long time to select suitable features since they are applied sequentially to all features. This requires more CPU time, especially when handling high-dimensional and real-world datasets. To overcome these limitations, the FR can be combined with meta-heuristic (MH) algorithms, such as the swarm intelligence techniques, including particle swarm optimization (PSO) [26], genetic algorithm (GA) [27], social-spider optimization (SSO) [28], and modified cuckoo search [29].
Given the advantages of the FR as a feature selection method and of MH techniques, in this article we provide an alternative FS method to improve the prediction of osteoporosis. This method combines the FR and an MH technique, named the Sine-Cosine Algorithm (SCA), to find the most relevant features of osteoporosis data, where the fuzzy-rough information gain serves as the fitness function.
We used the SCA algorithm due to its advantages, such as fast convergence, its ability to balance between the exploration and exploitation phases, its effective escape from local optima, its ease of use, its good search characteristics, and its small number of predefined parameters. It has also been applied successfully in many applications.
Owing to these properties, SCA was successfully used in previous studies for feature selection. For example, the authors of [30] applied SCA to select the most appropriate features from UCI datasets, and the results showed an improvement over PSO and GA. SCA was also combined with DE in [17] and applied to select the important features; this method showed good results against the compared algorithms. However, these studies did not try to combine SCA with FR. The purpose of our study is to add value in this context, since the FR avoids the limitations of the traditional rough set, which improves the quality of the selected features. In addition, to the best of our knowledge, this is the first attempt to apply FR and SCA to real osteoporosis data.
The main contributions of our paper can be summarized as follows: 1) We propose an alternative feature selection method based on a binary sine-cosine algorithm that uses the fuzzy-rough set as an objective function to check the quality of each selected feature subset. 2) We propose an osteoporosis prediction approach that determines the optimal subset of features related to the serum concentration of calcium. 3) We evaluate the proposed method using UCI benchmark datasets and a real dataset related to osteoporosis. The rest of this article is arranged as follows: Section II presents preliminaries, where a brief overview of fuzzy-rough sets and the sine-cosine algorithm is introduced. The proposed approach is described in Section III. Section IV presents the experiments and analysis, illustrating the properties of the proposed approach. The conclusion, with some recommendations for future work, is given in Section V.

A. FUZZY-ROUGH SETS
This section presents some basic concepts about information measures for the fuzzy-rough set model, which can be found in [31]. The main feature of the fuzzy-rough set model is its ability to handle real-valued features, which the traditional rough set (RS) cannot deal with [21]. Suppose that the fuzzy approximation space is given by $\langle U, A \rangle$, in which $U$ and $A$ are two finite sets: $U = \{x_1, \ldots, x_n\}$ is the universe of primitive instances and $A$ is the family of features [21]. $R$ denotes a fuzzy equivalence relation defined on $U$, represented by a relation matrix $M(R)$ (Equation (1)) [32] whose entries $b_{ij}$ are fuzzy similarity measures; the parameter of the similarity measure is set to 0.25. The fuzzy partition of $U$ generated by $R$, denoted $U/R$, is given in Equation 3 in terms of $[x_i]_R$, the fuzzy equivalence class built by $R$ and $x_i$. The entropy of the fuzzy equivalence relation is computed as in [23] from the cardinality $|[x_i]_R|$. The entropy is defined likewise for two subsets $P$ and $Q$ of the fuzzy feature set $A$ (i.e., $P, Q \subseteq A$), where $n_p$ and $n_q$ are the numbers of classes created by $P$ and $Q$, respectively, and $[x_p]_P$ and $[x_q]_Q$ are the fuzzy equivalence partitions containing $x_p$ and $x_q$ created by $P$ and $Q$. The joint entropy $H(PQ)$ of $P$ and $Q$ is computed as in [23]; the conditional entropy $H(P|Q)$ of $P$ given $Q$ and the mutual information $I(PQ)$ of $P$ and $Q$ are derived from these quantities. $P$ and $Q$ share no information if $I(PQ)$ equals zero, whereas a high $I(PQ)$ is obtained when $P$ and $Q$ are highly non-linearly correlated.
In this study, $P$ is the set of condition features and $Q$ is the decision feature (the group label of the osteoporosis dataset). The gain measure is calculated as in [32] for a subset $B \subseteq P$ and a candidate feature $a \in P - B$.
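The quantities above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the similarity form $b_{ij} = \max(0, 1 - |x_i - x_j| / 0.25)$ on features scaled to [0, 1], the entropy $H(R) = -\frac{1}{n}\sum_i \log_2(|[x_i]_R| / n)$, the min-based intersection of relations, and the gain $I(B \cup \{a\}; Q) - I(B; Q)$ are assumptions taken from common fuzzy-rough formulations, and all function names are hypothetical.

```python
import math

def relation_matrix(values, width=0.25):
    """Fuzzy similarity matrix of one feature scaled to [0, 1].
    Assumed form: b_ij = max(0, 1 - |x_i - x_j| / width)."""
    return [[max(0.0, 1.0 - abs(xi - xj) / width) for xj in values]
            for xi in values]

def crisp_relation(labels):
    """Relation matrix of a categorical decision feature (1 if same label)."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def intersect(m1, m2):
    """Relation induced by two feature sets together (element-wise minimum)."""
    return [[min(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def entropy(m):
    """Assumed form: H(R) = -(1/n) * sum_i log2(|[x_i]_R| / n),
    where |[x_i]_R| is the sum of row i of the relation matrix."""
    n = len(m)
    return -sum(math.log2(sum(row) / n) for row in m) / n

def mutual_information(mp, mq):
    """I(P; Q) = H(P) + H(Q) - H(PQ)."""
    return entropy(mp) + entropy(mq) - entropy(intersect(mp, mq))

def gain(m_a, m_b, m_d):
    """Assumed gain of adding feature a to subset B for decision d:
    I(B u {a}; d) - I(B; d). Pass m_b=None for an empty subset B."""
    if m_b is None:
        return mutual_information(m_a, m_d)
    return (mutual_information(intersect(m_a, m_b), m_d)
            - mutual_information(m_b, m_d))
```

For example, `gain(relation_matrix(feature_values), None, crisp_relation(labels))` scores a single normalized feature against the class labels; a larger value indicates a feature that shares more information with the decision.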

B. SINE-COSINE ALGORITHM
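In this section, the basic concepts of the sine-cosine algorithm (SCA) are illustrated [33]. The solution is updated using either the sine or the cosine function, as in the following equations [33]:

$S_i^{t+1} = S_i^t + \lambda_1 \sin(\lambda_2)\,|\lambda_3 S_b^t - S_i^t|$

$S_i^{t+1} = S_i^t + \lambda_1 \cos(\lambda_2)\,|\lambda_3 S_b^t - S_i^t|$

These two equations are combined to update each solution with either the sine or the cosine function:

$S_i^{t+1} = \begin{cases} S_i^t + \lambda_1 \sin(\lambda_2)\,|\lambda_3 S_b^t - S_i^t|, & \lambda_4 < 0.5 \\ S_i^t + \lambda_1 \cos(\lambda_2)\,|\lambda_3 S_b^t - S_i^t|, & \lambda_4 \geq 0.5 \end{cases}$ (14)

where $S_i$ is the current solution, $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ are random variables, $S_b$ denotes the destination position, and $|\cdot|$ denotes the absolute value [33]. As in [33], the parameter $\lambda_1$ indicates the region of the next position. $\lambda_2$ determines how far the movement should be toward or away from the target. The parameter $\lambda_3$ sets a weight for the target to deemphasize ($\lambda_3 < 1$) or emphasize ($\lambda_3 > 1$) the effect of the destination in the distance definition.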
Finally, the parameter $\lambda_4$ is a random number in [0, 1] that switches with equal probability between the sine and cosine in Equation (14). The range of the sine and cosine in Equation (14) is controlled by Equation 15 to switch between exploration and exploitation:

$\lambda_1 = a - t\,\frac{a}{T}$ (15)

where $t$ represents the current iteration, $T$ represents the maximum number of iterations, and $a$ denotes a constant. Following [33], the values of $\lambda_2$, $\lambda_3$, and $\lambda_4$ are updated at each iteration as $\lambda_2 = 2\pi \cdot rand$, $\lambda_3 = 2 \cdot rand$, and $\lambda_4 = rand$, where $rand \in [0, 1]$ is a random number.
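One iteration of the update described in this section can be sketched in Python as follows. This is a minimal illustration of the standard SCA rule from [33], not the code used in the experiments; the function name `sca_step`, the per-dimension sampling of the random parameters, and the default `a = 2.0` are assumptions.

```python
import math
import random

def sca_step(population, best, t, T, a=2.0, rng=random):
    """Apply one sine-cosine update to every solution.

    population: list of real-valued solutions (lists of floats)
    best:       destination position S_b
    t, T:       current iteration and maximum number of iterations
    a:          constant controlling the initial step range (assumed 2.0)
    """
    lam1 = a - a * t / T  # shrinks linearly from a to 0 (exploration -> exploitation)
    new_population = []
    for solution in population:
        updated = []
        for s_j, b_j in zip(solution, best):
            lam2 = 2.0 * math.pi * rng.random()  # movement angle
            lam3 = 2.0 * rng.random()            # weight of the destination
            lam4 = rng.random()                  # sine/cosine switch
            distance = abs(lam3 * b_j - s_j)
            if lam4 < 0.5:
                updated.append(s_j + lam1 * math.sin(lam2) * distance)
            else:
                updated.append(s_j + lam1 * math.cos(lam2) * distance)
        new_population.append(updated)
    return new_population
```

Because $\lambda_1$ reaches zero at the last iteration, the solutions stop moving at $t = T$, which reflects the gradual shift from exploration to exploitation.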

III. THE PROPOSED PREDICTING OSTEOPOROSIS APPROACH
The general architecture of the osteoporosis prediction approach is given in Figure 1. The proposed approach, called SCAFRG, combines the properties of SCA with the information gain in fuzzy-rough approximation, which is used as the fitness function. The SCAFRG consists of three phases: 1) pre-processing; 2) feature selection; 3) prediction.
In the first phase, the proposed SCAFRG approach receives the osteoporosis dataset; then, a suitable pre-processing method is used to balance the dataset in case the number of samples in one class is greater than in the other classes. The balanced dataset is used as an input matrix (A) to the feature selection phase. Thereafter, the relation matrix M(R) for each feature in A is calculated, as well as the information gain. The next step is to generate a set of solutions that represents the population of SCA; each solution is converted into a binary solution. The Boolean form determines which features are selected and which are ignored. To determine the quality of each solution, the fitness function is computed over the selected features. Then, the best solution is selected and the other solutions are updated using the operators of SCA, as discussed in Section II-B. The updating process is iterated until the stopping conditions are reached. Thereafter, the best solution is passed to the prediction phase. In this phase, the dataset is split into training and test sets after reducing the number of features. The training set is used to train the classifier, and the test set is used to assess the trained model by computing the prediction accuracy of the output.
The full description of the proposed method is given in the following subsection with more details.

A. IMBALANCED DATASETS AND PRE-PROCESSING PHASE
The problem of imbalanced datasets is one of the main problems faced in real applications; it occurs whenever the number of samples in one group of a dataset is larger than the number of samples in the other groups. Imbalanced datasets reduce the efficiency of the classifiers used to predict the group labels in the training and testing stages because the minority samples (those in the smallest group) are frequently misclassified. There are several approaches used to avoid this problem, such as kernel-based methods, cost-sensitive methods, and sampling methods. Sampling methods are popular approaches for this problem. They balance the samples inside each group by modifying the prior distribution of the minority and majority groups in the training stage to obtain an appropriately balanced dataset. The synthetic minority over-sampling technique (SMOTE) is a sampling method that constructs synthetic samples in the minority group [34]. Its effective application was presented in the literature, such as [35], [36]. SMOTE achieves this by calculating the similarities between the minority samples. For example, let $x_i$ be a sample in the minority group $G_{min}$; SMOTE selects the k-nearest neighbors (kNN) of $x_i$, and a new synthetic sample $x_{new}$ is generated by using the following equation:

$x_{new} = x_i + \delta\,(\hat{x}_{ij} - x_i)$

where $\hat{x}_{ij} \in G_{min}$ is one sample from the kNN of $x_i$ and $\delta \in [0, 1]$ is a random number [34]. The output of this phase is the balanced dataset in which all the groups have the same number of samples.
VOLUME 8, 2020
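The SMOTE interpolation step described above can be sketched in Python. This is a minimal illustration assuming Euclidean distance for the neighbor search and a uniform choice among the k nearest neighbors; the function name `smote_sample` and the default `k = 5` are hypothetical and not taken from the study.

```python
import random

def smote_sample(minority, i, k=5, rng=random):
    """Generate one synthetic sample from minority[i].

    minority: list of minority-group samples (lists of numbers)
    i:        index of the seed sample x_i
    k:        number of nearest neighbors considered (assumed default)
    """
    x_i = minority[i]
    # Rank the other minority samples by squared Euclidean distance to x_i.
    others = [x for j, x in enumerate(minority) if j != i]
    others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(x, x_i)))
    neighbor = rng.choice(others[:k])  # one sample from the kNN of x_i
    delta = rng.random()               # random interpolation factor
    # x_new = x_i + delta * (neighbor - x_i), applied per feature.
    return [a + delta * (b - a) for a, b in zip(x_i, neighbor)]
```

Calling `smote_sample` repeatedly on randomly chosen seed samples until the minority group matches the majority group in size yields a balanced dataset; each synthetic sample lies on the line segment between a minority sample and one of its neighbors.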

B. FEATURE SELECTION PHASE
In this phase, the SCAFRG algorithm starts by computing the relation matrix $M(R)$ for each feature of a dataset ($A$) using Equation (1); then, for each relation matrix, the information gain $Gain(a, U, d)$ is computed (for each feature $a \in A$). The next step is to generate a population of size $N$ in which each solution represents a reduct set. The solutions are converted from real numbers to binary numbers using a conversion formula in which $\sigma$ is a random value and $S_i$ denotes the $i$th solution; a feature that corresponds to 1 in $S_i$ is selected, and a feature that corresponds to 0 is not selected. For each solution, the objective function is computed from $U_{S_i}$ and $L$, which represent the subset of selected attributes and the total number of attributes selected by the current solution $S_i$, respectively.
To reduce the cost of evaluating the fitness function, the SCAFRG algorithm is run in a parallel environment. Each fitness value $Fin_i$ is compared with the global best fitness $Fin_b$. If $Fin_i$ is better than $Fin_b$, then $Fin_b = Fin_i$ and $S_b = S_i$ ($S_b$ represents the reduct set that corresponds to the best feature subset). After that, the solutions are updated based on the value of the coefficient $\lambda_4$: if $\lambda_4 \geq 0.5$, the cosine equation is used; otherwise, the sine equation is used, as in Equation (14). These steps are repeated until the stopping conditions are reached.
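The binarization and best-solution bookkeeping of this phase can be sketched as follows. The conversion rule shown here (a sigmoid transfer function compared against a random value) is an assumption commonly used in binary meta-heuristics and may differ from the formula used by SCAFRG; the fitness is passed in as a user-supplied function, larger fitness is assumed to be better, and the names `binarize` and `best_solution` are hypothetical.

```python
import math
import random

def binarize(solution, rng=random):
    """Convert a real-valued solution to a 0/1 feature mask.
    Assumed rule: select feature j when sigmoid(s_j) exceeds a random sigma."""
    return [1 if 1.0 / (1.0 + math.exp(-s)) > rng.random() else 0
            for s in solution]

def best_solution(population, fitness, rng=random):
    """Binarize each solution, evaluate it, and keep the best mask.
    fitness: callable taking a 0/1 mask (assumption: larger is better)."""
    best_mask, best_fit = None, float("-inf")
    for solution in population:
        mask = binarize(solution, rng)
        fit = fitness(mask)
        if fit > best_fit:  # Fin_i better than Fin_b: update S_b
            best_mask, best_fit = mask, fit
    return best_mask, best_fit
```

In a full loop, `best_solution` would be called once per iteration, followed by the sine or cosine update of the real-valued population toward the best solution, until the maximum number of iterations is reached.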

C. CLASSIFICATION PHASE
The classification phase starts with dividing the dataset into two sets (training and testing) by using the k-fold cross-validation (CV) method. Based on k-fold CV, the samples in the dataset are divided into k groups (nearly with the same size), then the experiment is run k times. One group is selected to test the model at each run and the remaining groups are selected as a training set. The output is the average accuracy of k runs in which the predicted output is compared with the actual values.
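The k-fold procedure above can be sketched in Python. This is a generic illustration rather than the evaluation code of the study; the function names, the shuffling seed, and the `fit_predict` callback (which stands in for any classifier) are assumptions.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k runs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k groups of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

def cross_val_accuracy(samples, labels, fit_predict, k=10):
    """Average accuracy over k runs.
    fit_predict(train_x, train_y, test_x) must return predicted labels."""
    accuracies = []
    for train, test in k_fold_indices(len(samples), k):
        predicted = fit_predict([samples[j] for j in train],
                                [labels[j] for j in train],
                                [samples[j] for j in test])
        correct = sum(p == labels[j] for p, j in zip(predicted, test))
        accuracies.append(correct / len(test))
    return sum(accuracies) / k
```

Each sample appears in exactly one test group, so every sample is used once for testing and k - 1 times for training.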
Based on the classification accuracy, the selected features may be relevant and either sufficient or insufficient. Different types of classifiers are used in the literature to calculate the accuracy of experiments' outputs, including support vector machines (SVM) [37], k-nearest neighbor [38], and so on. In this study, we applied an SVM along with four other classifiers, namely, naive Bayes (NB), repeated incremental pruning to produce error reduction (RIPPER or Jrip), logistic regression (LOG), and logistic model tree (LMT), to evaluate the accuracy of the selected features.

D. COMPUTATIONAL COMPLEXITY
In this section, we detail the computational complexity (CC) of the SCAFRG approach. The complexity is determined by the fuzzy condition (FC) features, the fuzzy equivalence partition (FEP) matrix of each condition feature, and the decision feature, which must be constructed before computing the gain. The CC of building an $(N_{FEP} \times N_{obj})$ FEP matrix is $O(N_{FEP} N_{obj})$, where $N_{FEP}$ and $N_{obj}$ represent the number of fuzzy equivalence partitions (FEPs) and the number of samples (objects) in the dataset, respectively.
Two FEP matrices of size $(p_1 \times N_{obj})$ and $(p_2 \times N_{obj})$ have to be constructed to calculate the gain of an FC feature with respect to the fuzzy decision (FD) feature, where $p_1$ and $p_2$ represent the number of FEPs of the FC feature and the FD feature, respectively. Hence, the overall time complexity to calculate the gain of an FC feature is $O(p_1 p_2 N_{obj}) = O(N_{obj})$, as $p_1, p_2 < N_{obj}$.
The number of features selected by the proposed approach ($N_{sel}$) also influences the CC, which becomes $O(N_{obj} N_{sel})$, since this is a first-order incremental search approach.

IV. EXPERIMENTAL RESULTS AND DISCUSSION
To evaluate the proposed approach, we compared it to three algorithms: PSO, harmony search (HS), and SSO. These algorithms commonly use the rough set as a fitness function in the literature. The parameter settings of each algorithm are given in Table 1; they were taken from our previous studies, in which they were tested and showed good results, and we therefore applied them in the current study. The experiments were performed using MATLAB and run on Windows 10 with an Intel Core 2 Duo 64-bit CPU. The global parameters were set as follows: the population size was set to 25, the maximum number of iterations was set to 100, and the stopping condition was reaching the maximum number of iterations.
In this section, we describe some experiments used to test the proposed algorithm on several datasets, including osteoporosis prediction and other public data.

A. PERFORMANCE MEASURES
In order to evaluate the performance of the classification phase according to the selected features, a set of classification measures was used including precision, recall, accuracy, F-measure, specificity, and negative predictive values. The definitions of these measures are listed below.
The precision rate (Pre) is calculated as:

$Pre = \frac{TP}{TP + FP}$

The recall rate (Rec) is calculated as:

$Rec = \frac{TP}{TP + FN}$

The accuracy (Acc) is calculated as:

$Acc = \frac{TP + TN}{TP + TN + FP + FN}$

The F-measure (FM) is calculated as:

$FM = \frac{2 \times Pre \times Rec}{Pre + Rec}$

The specificity (Spec) is calculated as:

$Spec = \frac{TN}{TN + FP}$

The negative predictive value (NPV) is defined as:

$NPV = \frac{TN}{TN + FN}$

Here, $TP$ denotes the true positive samples, $FP$ the false positive samples, $TN$ the true negative samples, and $FN$ the false negative samples.
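These measures can be computed from the four confusion-matrix counts as in the following Python sketch. It uses the standard definitions of the measures, with the F-measure taken as the harmonic mean of precision and recall; the function name is hypothetical, and no guard is included for empty denominators.

```python
def classification_measures(tp, fp, tn, fn):
    """Return the six measures from confusion-matrix counts.
    Assumes every denominator is non-zero."""
    pre = tp / (tp + fp)                    # precision
    rec = tp / (tp + fn)                    # recall (sensitivity)
    acc = (tp + tn) / (tp + tn + fp + fn)   # accuracy
    fm = 2 * pre * rec / (pre + rec)        # F-measure
    spec = tn / (tn + fp)                   # specificity
    npv = tn / (tn + fn)                    # negative predictive value
    return {"Pre": pre, "Rec": rec, "Acc": acc,
            "FM": fm, "Spec": spec, "NPV": npv}
```

For instance, `classification_measures(tp=8, fp=2, tn=7, fn=3)` gives a precision of 0.8 and an accuracy of 0.75.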

B. FIRST EXPERIMENT: UCI DATA
In this experiment, four datasets (Wine, Ionosphere, Sonar, and Pima Indians) were used to evaluate the proposed approach. These datasets were selected from the UCI Machine Learning Repository [39], and they all have different properties (see Table 2).
The selected features obtained from the proposed approach and the other methods are given in Table 2. Each dataset was reduced to these selected features and split into training and testing sets using 10-fold CV. Then, five classifiers were used (i.e., SVM, NB, LOG, LMT, and RIPPER, which we refer to as Jrip in this study) to evaluate the accuracy of these features. These classifiers were chosen since they are well-known algorithms and have shown good results in the literature. Table 3 shows the prediction accuracy of each algorithm on each dataset using the selected features. From this table, it can be noted that the proposed approach and the HS, PSO, and SSO methods have higher classification accuracy than using all the features as input for the classifier over all datasets; moreover, the proposed approach has the best accuracy compared to the HS, PSO, and SSO methods in most cases. Figure 2 shows the accuracy of each classifier on each dataset according to the selected features. This confirms that the prediction accuracy of SCAFRG outperforms the PSO, SSARS, and HS approaches. These results indicate that the proposed SCAFRG can help to improve the classification accuracy by selecting the optimal subset of relevant features. The time efficiency of all methods is shown in Figure 3.

C. SECOND EXPERIMENT: OSTEOPOROSIS PREDICTION
The goal of this experiment was to improve the accuracy of predicting osteoporosis by removing the irrelevant features, which also leads to a more effective use of time. The proposed method was compared to the SSO algorithm, which ranked second in accuracy in the previous results.

1) DATASET DESCRIPTION
Fifty mature female Sprague-Dawley rats (250-260 g body weight and 10-12 weeks old) were used in this study. The rats were obtained by the third author from the Animal Colony Laboratory, Helwan, Egypt. The rats were kept under hygienic conditions at a room temperature of 22 ± 2 °C, a humidity of 50%, and a 12 hr light/dark cycle. Rats were granted free access to food and distilled water. The biological experiments were carried out according to the guide for the care and use of laboratory animal resources, Commission on Life Sciences, National Research Council [40].
The rats lived on a basal diet containing 14% casein (protein > 80%), corn oil 4%, fiber 5%, salt mixture 3.5%, vitamin mixture 1%, choline chloride 0.25%, and corn starch [41]. The vitamin composition of the diets was in line with [42]. Before their use in the experiment, rats were kept for one week in order to acclimatize to laboratory conditions. After this period, they were split into two groups: the first group (20 rats) was orally administered normal saline (4.5 ml/kg body weight, twice per week) as a basal healthy group. The second group (30 rats) received oral glucocorticoid (prednisone acetate; 4.5 ml/kg body weight, twice per week) to induce osteoporotic models as stated by [43].
Biochemical Studies. After the experiment had ended (i.e., after six weeks), the animals were fasted overnight, then anesthetized and sacrificed. A blood sample was collected from the aorta, left standing for 10 minutes to clot, and centrifuged at 12000 rpm for 15 minutes to separate the serum, which was kept frozen at −20 °C until biochemical analyses were performed. The serum concentration of calcium and the other serum markers were estimated as in the literature. Table 4 shows the feature names and their abbreviations for the osteoporosis dataset.

2) EXPERIMENT
In this experiment, the proposed approach was compared with well-known methods and all features (note that, in the following, the word ''ALL'' is used to indicate that all features were used as inputs to the classifiers).
The selected features and the reduction rate of the dataset are recorded in Table 5. Table 5 shows that the proposed method achieved a better selection rate than the HS and SSO algorithms: the SCAFRG has the smallest selection rate (41%) and is ranked first, followed by the HS, whereas the SSO achieved 65% and is ranked last. In addition, the "Selected Features" column shows the most relevant features found by the algorithms. The features selected by SCAFRG can be considered the most important features for the prediction of osteoporosis and can provide specialists with good information about the relevant features responsible for osteoporosis (these features are Serum Ca, FB Ca, BMD, TSH, UA, BUN/Cr, and ALP).
To evaluate the SCAFRG, the selected features were passed to the classifiers and the classification accuracy was computed (see Table 6). According to the results, the SCAFRG method achieved good results, especially when compared to the HS, SSO, and ALL. In addition, the SCAFRG based on the SVM, LMT, and kNN classifiers performed better than the SCAFRG based on all other classifiers across the performance measures. The lowest accuracy was obtained by the SCAFRG based on the LOG classifier; also, the SCAFRG based on NB performed better than the SCAFRG based on the Jrip classifier.
These results provide strong evidence that the proposed SCAFRG approach can be used to predict osteoporosis; feature selection can also help inform specialists when making decisions for their osteoporosis patients. In addition, to evaluate the effectiveness of the SMOTE technique versus other methods, a further experiment was performed to compare the results of SMOTE to random over-sampling (ROS), random under-sampling (RUS), and condensed nearest neighbor (CNN) using all features. This comparison applied five classifiers and three performance measures: accuracy, precision, and recall. The results of this experiment are recorded in Table 7. This table shows that the SMOTE method achieved the best results in all measures and is ranked first based on the average of the results of all classifiers, followed by ROS and RUS, whereas CNN is ranked last. Therefore, SMOTE was the most effective of the compared methods for handling imbalanced dataset classes.

3) THE RELATION BETWEEN OSTEOPOROSIS AND THE SELECTED FEATURES
Given the health implications of osteoporotic fractures, the immediate objective of osteoporosis therapy should be to limit fractures, which is achieved by decreasing or eliminating bone loss, maintaining bone strength, and limiting or eradicating factors that may cause fractures [44].
Among biochemical markers of bone and mineral disorders, the mean values of serum Ca (mmol/L) for the second group of rats showed a significant decrease (p < 0.05) when compared to the serum Ca values of the healthy group. Calcium certainly has an important relationship with osteoporosis.
Calcium is a crucial nutrient that is critical for various functions in the human body and necessary for good health. Calcium is the most abundant mineral in the body, with 99% of calcium found in teeth and bone, whereas only 1% is found in serum. Bone formation and maintenance is a process that happens throughout life. Early attention to building solid bones in childhood and adulthood will produce more balanced bone mass through the aging years [45]. The calcium of the skeleton can serve as a reserve supply that allows the body to meet its needs in the case of calcium inadequacy.
Calcium inadequacy is easily induced because of the obligatory losses of calcium via the bowel, kidneys, and skin [46]. It is widely acknowledged that calcium is useful as a phenotype marker for bone construction. In [47], the authors confirmed that dietary supplementation with calcium and vitamin D increases bone health, limits the risk of fractures, and improves the performance of pharmacological management. The North American Menopause Society [48] estimated acceptable intakes of calcium for pre-menopausal and post-menopausal women according to evidence relating to osteoporosis prevention. At least 1200 mg/day of calcium is vital for most women; to provide acceptable calcium consumption, a daily intake of 400-600 IU of vitamin D is preferred, either through sun exposure or through diet or supplementation.
In this study, using glucocorticoids (GC) to induce osteoporosis caused a significant decrease in bone mineral density (BMD, g/cm²) when compared to the BMD of the negative control group.
As stated by the World Health Organization (WHO), BMD measurement can be used to assess the fracture hazard in hip, lumbar, or spine and to establish the diagnosis and severity of osteoporosis. Individuals with the lowest BMD are at the highest risk of fracture; this is estimated by dual-energy X-ray absorptiometry (DXA), which is the gold standard used to diagnose osteoporosis. BMD testing can be used to assess developments over time (monitoring) in treated and untreated individuals [44].
A general consensus, confirmed by the U.S. Preventive Services Task Force (USPSTF, 2002) [49], is that all women aged 65 and older should have a BMD test, and that women at risk of bone disease who are under age 65 should also be screened. The National Osteoporosis Foundation (NOF) also favors BMD testing for men who present with fractures or are receiving treatment for prostate cancer, as well as for all people who have primary hyperparathyroidism or are on long-term glucocorticoid treatment [50]. BMD testing was shown to be necessary [51] and should also be performed on any person who has other possible risk factors for osteoporosis, specifically anyone who exhibits clinical symptoms of osteoporosis, anyone with hyperthyroidism or hyperparathyroidism, anyone who has had a low-trauma fracture, and anyone with medications that cause bone loss, such as glucocorticoids, or diseases that cause poor intestinal absorption.
In the current study, kidney function, uric acid (mg/dl), and the blood urea nitrogen/creatinine (BUN/Cr) ratio of osteoporotic female rats changed significantly (p < 0.05) compared to the healthy group (negative control). Patients whose kidney function is impaired have bone and mineral disturbances that cause extraskeletal calcifications and complex changes in bone turnover, which predispose them to an increased risk of fracture along with increased morbidity and mortality [52]. Furthermore, patients with chronic renal disease were reported by [53] to be at risk not only of developing rickets and osteomalacia, but also of renal osteodystrophy, a complex bone disease. This condition is defined by stimulation of bone metabolism caused by a rise in parathyroid hormone and by impaired bone mineralization caused by decreased kidney production of 1,25-dihydroxy vitamin D [54]. Patients who have end-stage renal disease were shown by [55] to be at increased risk of osteopenia and hip fracture. Dialysis and transplantation may not limit the progression of bone disease, although they may lengthen the life expectancy of these patients. Managing the patient through dialysis could cause additional bone abnormalities, which become superimposed on the underlying osteodystrophy, hence increasing the risk of fractures. Furthermore, the authors of [56] showed that hip fracture in dialysis patients is related to increased mortality. Cross-sectional and longitudinal relations between measures of renal function and bone mineral density (BMD), bone loss, and osteoporotic fracture in older individuals were investigated in 2007 by [57].
There was a significant linear relation between creatinine or glomerular filtration rate and hip BMD. It was found by [58] via logistic regression analysis that creatinine levels are affected slightly by muscle mass, which is related to age, gender, and weight. Even though the estimated glomerular filtration rate (eGFR) can be used as an improved marker of renal function, it is also calculated from serum creatinine; eGFR was positively related to femur BMD. The authors of [59] confirmed that BUN can be elevated in patients who are receiving corticosteroids, those with raised catabolism, or those with gastrointestinal tract bleeding. Also, the Food and Drug Administration reports that an increased blood urea nitrogen/creatinine ratio is found among individuals with osteoporosis, particularly females aged 60 and over who take the medication Fosamax and have abnormal bone density.
Abnormal thyroid status in childhood has been shown to disturb bone maturation and linear growth, while in adulthood, it causes altered bone remodeling and an increased risk of fracture. Also, population studies show that both thyroid hormone deficiency and excess are related to an increased risk of fracture [60]. Thyroid hormones (T3 and T4) increase the energy production of all body cells, including bone cells. They improve bone growth by stimulating osteoblasts. A lack of thyroid hormones can decrease growth in children, while excessive amounts can result in too much bone breakdown and impair the skeleton's development [61]. The pituitary hormone that controls the thyroid gland, thyrotropin or TSH, may have direct effects on bone too [62].
The authors of [63] reported that the hormones most important for growth during childhood are the insulin-like growth factors (IGFs), which are produced by the liver and bone tissue. IGFs stimulate osteoblasts and enhance the synthesis of the proteins necessary to build new bone. Furthermore, the pancreatic hormone insulin enhances bone growth by raising the synthesis levels of bone proteins [64].
Experimental studies in mice lacking either of the thyroid hormone receptors (TR), i.e., TR-alpha or TR-beta, suggest that bone loss is mediated by TR-alpha [65]. Hence, thyroid hormone can influence bone calcium metabolism by a direct action on osteoclasts or by an effect on osteoblasts that in turn mediates osteoclastic bone resorption [66]. In our experiments, triiodothyronine T3 (ng/dl) and thyroxine T4 (µg/dl) in female rats suffering from osteoporosis rose significantly (p < 0.05) compared to the negative control group.
Diseases that impair the function of the liver (mainly alcoholic cirrhosis, biliary cirrhosis, cirrhosis due to hepatitis B and C, and chronic active hepatitis) could cause disruptions in vitamin D metabolism and could also result in bone loss by other mechanisms. Primary biliary cirrhosis is related to particularly severe osteoporosis. Fractures are more frequent in patients with alcoholic cirrhosis than in those with any other type of liver disease [67].
Biochemical monitoring of bone metabolism depends upon the measurement of enzymes and proteins released during bone formation and of degradation products produced during bone resorption [68]. Total alkaline phosphatase (ALP) is counted as one of the bone formation markers. When ALP concentration is markedly elevated, it is essential to determine whether the cause is hepatic disease, osteomalacia, secondary osteoporosis, or bone tumor [69].
Secondary osteoporosis is often suspected if the alkaline phosphatase level is abnormal. Markedly high levels of ALP are often encountered in elderly individuals, although in many cases, liver function is within the normal range. Because total ALP comprises numerous isoenzymes from different tissues, a definite diagnosis cannot be made from high levels of ALP alone [70]; however, it was shown by [71] that after surgery, increases in the concentration of total ALP and subsequent decreases could reflect the normal healing process. They also confirmed that the values of ALP and bone markers are influenced by bone fractures. One of the causes of elevation in bone-specific ALP (BSAP) and ALP could be the healing of unrecognized bone fractures in elderly women with osteoporosis. It was confirmed by [70] that high bone turnover is the primary cause of elevated ALP in postmenopausal women. They concluded that total ALP is still a good marker for bone turnover or therapy evaluation and can be used instead of bone-specific ALP. As is evident in the results, ALP (U/L) concentration was generally higher in female rats suffering from osteoporosis than in female rats from the healthy group (NC), with a significant difference at p < 0.05.
Based on our results, we can conclude that the proposed SCAFRG method provides superior results compared to the other methods. It can effectively select the best features and attributes in the benchmark and real datasets and can effectively predict osteoporosis and the risk of hip bone fractures, even though most features in the datasets are closely related to each other, which makes the identification of relevant features a challenging task.
The proposed method also had the highest accuracy in both experiments. The advantages of the SCAFRG method can be attributed to the use of SCA, which offers fast convergence, simplicity of implementation, and the ability to effectively escape from local optima; this helps maintain the population and promotes the ability of FR to find the best features in the dataset. Therefore, the proposed SCAFRG method, which benefits from the characteristics of both SCA and FR, can effectively select the most important features and delete the irrelevant ones in the dataset. These promising results encourage us to use it in different fields and tasks.
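As a rough illustration of the SCA behavior discussed above, the following minimal sketch implements the standard continuous sine-cosine position update on a toy objective. It is not the SCAFRG implementation: the function name `sca_minimize`, its parameters, and their default values are assumptions made for illustration, and the fuzzy-rough information gain fitness and the mapping from agent positions to feature subsets are omitted.

```python
import math
import random


def sca_minimize(objective, dim, bounds, n_agents=20, n_iter=200, a=2.0, seed=0):
    """Illustrative sine-cosine algorithm (SCA) for continuous minimization.

    All names and defaults here are hypothetical; this is a generic SCA
    sketch, not the SCAFRG method evaluated in the experiments.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the search bounds.
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_val = objective(best)

    for t in range(n_iter):
        # r1 shrinks linearly from a toward 0, shifting the search
        # from exploration to exploitation.
        r1 = a - t * (a / n_iter)
        for agent in agents:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)  # direction of the move
                r3 = rng.uniform(0.0, 2.0)            # random weight on the best position
                r4 = rng.random()                     # switches between sine and cosine
                step = abs(r3 * best[j] - agent[j])
                if r4 < 0.5:
                    agent[j] += r1 * math.sin(r2) * step
                else:
                    agent[j] += r1 * math.cos(r2) * step
                # Clip the coordinate back into the search bounds.
                agent[j] = min(hi, max(lo, agent[j]))
            val = objective(agent)
            if val < best_val:
                best, best_val = agent[:], val
    return best, best_val
```

For example, `sca_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-5.0, 5.0))` searches for the minimum of a simple sphere function. In a feature selection setting, the objective would presumably be replaced by a fitness computed on the feature subset encoded by each agent.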

V. CONCLUSION AND FUTURE WORK
In recent years, the etiology of osteoporosis has received more attention in the medical field because of its effect on the treatment outcome. Treatment can be accelerated and the quality of life can be enhanced by determining the relationship between the nutrients that are associated with any given health problem. To contribute to this goal, in this article, we proposed a new computational method that combines the SCA algorithm with fuzzy-rough set (FR) theory to select the most relevant features and improve the prediction accuracy of osteoporosis. The proposed approach consists of three stages. In the first stage, the SMOTE sampling method is used to obtain a balanced dataset. In the second stage, the most relevant features are selected using the SCAFRG algorithm; in this algorithm, the information gain based on fuzzy-rough theory is used as the fitness function to distinguish between the features. In the third stage, several classifiers are used to predict osteoporosis. To investigate the performance of the proposed method, a set of experiments was performed using two kinds of datasets: the first kind is taken from the UCI benchmark datasets, whereas the second is a real osteoporosis dataset. The experimental results showed that the proposed approach improved the classification accuracy on the UCI datasets and has a good ability to predict osteoporosis. Moreover, the proposed SCAFRG combined with SVM performed better than with any other classifier in terms of all measures and time complexity. Given the promising results of the proposed approach (SCAFRG) in predicting osteoporosis, we will apply this approach in the future to other applications, such as the classification of galaxy images, and will improve it by using chaotic maps in its first stage.
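The balancing idea behind the first stage summarized above can be sketched as follows. This is a simplified SMOTE-style oversampler in plain Python, assuming numeric feature vectors and at least two minority samples; the function name `smote_like_oversample` and its parameters are hypothetical, and it is not the implementation used in the experiments.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating toward neighbors.

    Hypothetical helper for illustration only: each synthetic sample lies on
    the line segment between a minority sample and one of its k nearest
    minority neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base sample (excluding itself).
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # relative position along the segment
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic
```

Calling `smote_like_oversample(minority_rows, n_new=len(majority_rows) - len(minority_rows))` would, under these assumptions, bring the two classes to equal size before feature selection and classification; `minority_rows` and `majority_rows` are placeholder names.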