Application of Data Science Approaches to Investigate Autoimmune Thyroid Disease in Precision Medicine

In recent times, the application of artificial intelligence in capturing and restructuring Big data has improved the accuracy of disease diagnosis and treatment, an approach known as precision medicine. Big data is now established in various domains of medicine; for example, artificial intelligence has found its way into immunology, where it is termed immunoinformatics. There is evidence that precision medicine tools have been used to detect, profile, and suggest treatment regimens for thyroid dysfunction using Big data such as imaging and genetic sequences. In addition, data on polymorphisms, autoimmune thyroid disease, and genetic data related to environmental factors have accumulated over time, resulting in rapid development of the clinical study of autoimmune thyroid disease. This review emphasizes how genetic data play a vital role in diagnosing and treating autoimmune thyroid diseases such as Graves’ disease, subtle subclinical thyroid dysfunctions, Hashimoto’s thyroiditis, and hypothyroid autoimmune thyroiditis. Furthermore, the association between environmental and endocrine risk factors in the etiology of the disease in genetically susceptible individuals is discussed. The hurdles endocrinologists face in the field of cancer and thyroid nodules include unreliable biomarkers and a lack of distinct therapeutic alternatives due to genetic differences. Precision medicine data, analyzed with artificial intelligence, may improve their diagnostic and therapeutic capabilities.


Introduction
Personalized medicine reached a milestone in January 2015, when the President of the United States of America announced precision medicine and presented it for review and implementation by all healthcare professionals [1]. Since then, more precise molecular characterization of patients has been developed. Contributing factors include an increasing number of 'omics' (proteomics, genomics, transcriptomics, lipidomics, metabolomics and epigenomics), the integration of genomic data, the rapid exchange of knowledge among researchers, bioinformatics (the retrieval and analysis of data stored in large databases), and the growing world of Big data and artificial intelligence [1,2]. Together, these factors guide clinicians in diagnosis, follow-up and therapeutic decisions in precision medicine [2].
Data science applies machine learning algorithms to audio, video, images, text, and numbers to develop artificial intelligence (AI) systems. These systems are used for data processing and preparation, analysis, optimization, and the construction of integrated models; combining such algorithms produces insights that analysts can translate into additions to existing knowledge [3].
One of the principal challenges in clinical endocrine practice is thyroid disease management. In recent years, medical science has progressed continuously, and several factors have expanded our knowledge of this field at a geometric rather than an arithmetic rate. These factors include accurate clinical assessment and an understanding of inter- and intracellular reactions and of the environment's influence on them [2]. Most fields of science have undergone a big data revolution. The use of data science in personalized medicine is important for addressing variability in autoimmune disorders, especially in patients with multiple autoimmune diseases [4,5]. Studies have also shown how data such as electronic health records (EHRs), initially designed to facilitate patient registration, have been used as a tool for predicting thyroid diseases. For example, some reports link EHR data to existing genotypes to identify new gene loci, such as forkhead box E1 (FOXE1), which is associated with autoimmune thyroid diseases [6][7][8].
Genomic data are central to precision medicine, and most thyroid diseases, such as autoimmune thyroiditis, are known to have high heritability [8,9]. Studies have reported a higher concordance rate of Graves' disease in monozygotic twins than in dizygotic twins (in the range of 50-70%, compared with 3-25%, respectively). Also, Hemminki and co-workers reported the familial standardized incidence ratios for Graves' disease to be 4.49 for individuals with an affected parent, 5.04 for individuals with a single affected sibling, 310 for individuals with two or more affected siblings, and 16.45 in twins [1,8,10]. For Hashimoto's thyroiditis (HT), the sibling risk ratio was found to be 28, and this risk was confirmed in data obtained from Germany [8,11,12]. All this evidence links genetic susceptibility to autoimmune thyroid diseases.
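A standardized incidence ratio of the kind quoted above is the number of cases observed in a group divided by the number expected from reference incidence rates. The following is a minimal sketch of that arithmetic only; the function name and the counts are hypothetical and are not taken from the cited studies.

```python
def standardized_incidence_ratio(observed: int, expected: float) -> float:
    """Return the ratio of observed to expected case counts (SIR)."""
    if expected <= 0:
        raise ValueError("expected case count must be positive")
    return observed / expected


# Hypothetical illustration: 45 cases observed among relatives of patients,
# where 10 would be expected from the reference population rates.
example_sir = standardized_incidence_ratio(45, 10.0)
print(f"SIR = {example_sir:.2f}")
```

A ratio above 1 indicates more cases than expected in the group, which is how the familial ratios above are read.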
A genome-wide association study (GWAS) of hypothyroidism was carried out with a sample of 1317 hypothyroidism cases and 5053 controls, identified algorithmically from five electronic medical record databases (EMRDs). One association was found near forkhead box E1 (FOXE1), also known as thyroid transcription factor 2 (TTF-2) [7]. Gene studies have also linked autoimmune hypothyroidism with PTPN22 (protein tyrosine phosphatase, non-receptor type 22), CTLA4 (cytotoxic T lymphocyte antigen 4) and HLA II (human leukocyte antigen class II region) [7,8]. Graves' disease, on the other hand, has been studied in several genome-wide association studies, which have discovered many loci [1,7]. These associations are important in the diagnosis and treatment of autoimmune thyroid diseases.
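At its simplest, a case-control association test for a single variant compares allele counts between cases and controls. The sketch below computes an allelic odds ratio and a Pearson chi-square statistic from a 2x2 table. It is an illustration under stated assumptions, not the analysis used in the cited GWAS: the function name and the allele counts are invented, and real studies add quality control, covariate adjustment and multiple-testing correction.

```python
def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (odds ratio, chi-square statistic) for a 2x2 allele count table."""
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table, without continuity correction
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi_square


# Hypothetical allele counts: risk / other allele in cases, then in controls
odds_ratio, chi_square = allelic_association(300, 700, 200, 800)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}")
```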

Autoimmune thyroid diseases and data science

Autoimmune thyroid diseases (AITDs)
Autoimmune thyroid diseases (AITDs) are the most common autoimmune diseases in humans and are classified according to the grade of lymphocytic infiltration [13]. They are more prevalent in females than in males (i.e. they are 5-10 times less frequent in men). The major types of AITDs are Graves' disease, which is associated with hyperthyroidism, and Hashimoto's thyroiditis, which is associated with hypothyroidism [13].

Graves' disease (GD)
Graves' disease is the most common cause of hyperthyroidism. It can affect people at any age but is most prevalent in adults, with incidence peaking between 30 and 50 years [14]. It is also characterized by goiter and ophthalmopathy [15].

Hashimoto's thyroiditis (HT)
HT is now considered the most common AITD [16], the most common endocrine disorder [17] and the most common cause of hypothyroidism [18,19]. It can be divided into primary and secondary forms: the primary form is the most common thyroiditis, while the secondary form is a more recently described thyroiditis [20].

Causes of AITDs
AITDs result from both genetic and environmental factors. Various susceptibility genes have been identified and characterized, including the HLA-DR gene locus and non-MHC genes such as CTLA-4, CD40, PTPN22, CD25, FOXP3, thyroglobulin and the TSH receptor gene [21]. The major environmental triggers identified are iodine, selenium, medications, smoking, stress, infection, sex steroids, pregnancy, fetal microchimerism and radiation exposure [22,23].
Genetic factors account for up to 80% of the risk of developing Graves' disease, while environmental factors account for up to 20% [24][25][26]. In genetically predisposed people, these environmental factors disrupt the mechanisms of immune tolerance, leading to the onset of the disease [24,26].
Genetic and environmental factors likewise contribute to the development of Hashimoto's thyroiditis.

Pathogenesis of AITDs
Many factors play a role in the pathogenesis of AITDs, mostly involving complex interactions among genetic and environmental factors, the immune system and cytokines [27]. The pathogenesis of AITDs results from cell-mediated autoimmunity, endocrine autoimmunity, or both [26]. Thyroid peroxidase antibodies are a potent marker of AITDs [27]. Their levels are associated with the expression of MHC on thyrocytes and with the degree of lymphocytic infiltration, and they may sensitize and trigger the synthesis of autoantibodies [28]. They act both through the immune system and by directly targeting the thyroid follicular cells [27]. Their presence has been identified within inflammatory and thyroid follicular cells [29]. Cytokines enhance inflammatory responses by stimulating both B and T lymphocytes, resulting in antibody production and damage to the thyroid tissue by apoptosis, particularly in HT [30]. In addition, T cell subtypes have recently been found to play a role in the pathogenesis of AITDs [31][32][33].
In Graves' disease, pathogenesis is a complex process involving TRAbs, which are antibodies against the thyroid-stimulating hormone (TSH) receptor [34]. TRAbs mimic the function of TSH and cause the disease by binding to the TSH receptor, thereby stimulating or inhibiting the production of thyroid hormones (T3 and T4) by thyroid cells [35]. Binding of TRAbs to the TSH receptor leads to continuous and uncontrolled thyroid stimulation, with excess synthesis of thyroid hormone and thyroid hypertrophy [35].
In Hashimoto's thyroiditis, the pathogenic mechanism involves cellular immunity (defects in suppressor T cells and regulatory T cells, follicular helper T cells, cytotoxicity and apoptosis) and humoral immunity (TPO/TG antibodies and immunoglobulin subclasses, sodium iodide symporter (NIS) and pendrin antibodies, and thyroid-stimulating hormone receptor (TSHR) antibodies), as well as cytokines, DNA fragments and microRNA [36]. All of these have been observed to play an important role in the pathogenesis of HT.

Management of AITDs
Recent landmarks in the management of HT and GD are discussed here, as these are the major forms of AITDs.

Hashimoto's thyroiditis
Since its discovery, understanding of this condition has advanced considerably. It has been suggested that, because serum levels of TSH and free thyroxine (T4) change continuously, a grading system might classify hypothyroidism better than a division into clinical and subclinical forms [37]. On this view, it becomes difficult to determine an ideal starting point for thyroid hormone supplementation therapy. A randomized trial (TRUST), initiated by the European Commission in 2012, aids the understanding of the effects of levothyroxine (LT4) in the treatment of subclinical hypothyroidism [37].
Recurrence of symptoms was observed in 5-10% of patients with hypothyroidism despite LT4 treatment and normal serum TSH levels [38]. The European Thyroid Association (ETA) has provided a guideline on combination therapy with LT4 and LT3 as an alternative to LT4 monotherapy [38].

Graves' disease
Since GD was first described, it has been treated with antithyroid drugs, radioactive iodine and surgery. Pre-existing guidelines were used in the management of GD, but a detailed guideline has recently been provided separately for subclinical hyperthyroidism, although it is not supported by randomized clinical trials [39]. Radioiodine is used in the treatment of Graves' disease [40]. It is linked to thyroid autoimmunity through thyroid cell death: self-antigens are released from the thyroid gland following exposure to the therapy until complete ablation is achieved [40]. Treatment of GD with antithyroid drugs produces both favorable and unfavorable responses in patients [40]. Despite recent studies, each management plan has its limitations, and no definitive plan for the management of GD has been established. To provide a permanent treatment, researchers are working on a new drug that prevents the disease without destroying or removing the thyroid gland while also avoiding recurrence. The results of recent in vivo experiments are quite promising [41].
In both diseases, vitamin D has been reported to play a significant role in the modulation of the immune system, enhancing the innate immune response while exerting an inhibitory action on the adaptive immune system [42].

General investigation of AITDs
Investigation is based on clinical features and laboratory tests. Circulating antibodies are a core determinant of AITDs and are measured against TPO and TG. A negative test excludes AITDs, whereas a positive test suggests AITDs, with the type of disease depending on which antibody is present. The measurement is done using thyroid receptor assays or bioassays [37].

Data science approaches to investigate autoimmune diseases
At a time when computer processing power keeps increasing exponentially and networks keep expanding, the volume of available data has become overwhelming, already exceeding the processing capacity of manual methods and conventional database approaches. It is therefore imperative to combine data processing with computing to take full advantage of the available data [43]. Data science as a field supports data-driven decision making and depends largely on "Big data" storage, engineering and analysis [43]. Applying data science in a field therefore implies the intention to gather, process, analyze and use data in order to understand an illness, its cause (diagnosis), its progression (prognosis), its possible endpoint (prediction) and the intervention that could give the best outcome (treatment/recommendation) [44].
Autoimmune diseases are dangerous or disruptive conditions in which the body's immune system attacks its own tissues through the presentation and recognition of specific antigens and the response of the target organs; they are facilitated by susceptibility genes present in the host and by environmental factors [45].

Data science approaches
The discipline of data science is evolving, along with serious challenges, as it attempts to harness recent innovations in computing infrastructure, data processing methods and data analysis tools. Cluster computing and cloud computing are fundamental components of data science: they enable the powerful algorithms needed to access, visualize, interpret, organize, analyze and manage cross-scale big data rapidly and with reasonable efficiency, which in turn supports the use of artificial intelligence. The availability of big data and advances in artificial intelligence have led to the development of various machine learning, deep learning and deep neural network algorithms that can process big data despite its high volume and complexity.

Machine learning
One big question in the field of computing is how to design computers that improve automatically through experience, without explicit instructions and with limited human intervention. This question gave rise to machine learning, one of the most rapidly growing technical fields today, which lies at the intersection of computer science and statistics and at the heart of artificial intelligence and data science [46]. Machine learning, a rapidly developing branch of computational algorithms, aims to emulate human reasoning and intelligence by allowing the designed system to learn from its environment. The forces driving machine learning include the low cost of computation, online access to and availability of data, and the discovery of new theories and learning algorithms [46]. Different machine-learning algorithms have been developed to solve various machine learning problems and to handle a large variety of data types [47,48]. Conceptually, a machine-learning algorithm can be viewed as searching through a large space of candidate programs, guided by training experience, to find a program that optimizes the performance metric. The great variation among machine-learning algorithms depends in part on how the algorithm represents its candidate programs (e.g., mathematical functions, decision trees, and general programming languages) [47]. It also depends on how the algorithm searches through this space of programs (e.g., optimization algorithms with well-understood convergence guarantees and evolutionary search methods that evaluate successive generations of randomly mutated programs) [47].
Supervised learning is the most widely employed method of training machine learning algorithms [47].
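The view of learning as a search over candidate programs, guided by labeled training data, can be illustrated with a deliberately small sketch. Here each candidate program is a rule of the form "predict 1 if the value exceeds a threshold", and the search simply keeps the threshold with the best training accuracy. The data values and function names are invented for illustration; practical algorithms search far larger spaces with more efficient methods.

```python
# Toy labeled training set of (feature value, label) pairs; values are invented.
training_data = [(0.5, 0), (1.2, 0), (2.1, 0), (3.4, 1), (4.0, 1), (5.3, 1)]


def accuracy(threshold, data):
    """Fraction of examples for which 'value > threshold' matches the label."""
    correct = sum(1 for value, label in data if int(value > threshold) == label)
    return correct / len(data)


def fit_threshold(data):
    """Search the candidate rules (one per observed value) for the most accurate."""
    candidates = sorted(value for value, _ in data)
    return max(candidates, key=lambda t: accuracy(t, data))


best = fit_threshold(training_data)
print(best, accuracy(best, training_data))
```

The performance metric here is training accuracy; supervised learning in practice also evaluates the chosen program on data held out from training.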

Deep learning
Deep learning involves computational models made up of multiple processing layers that learn representations of data with multiple levels of abstraction. Deep learning methods have rapidly and progressively improved the technologies available for speech recognition, visual object recognition, and many other domains. Deep learning has also been useful in fields such as drug discovery and genomics. Conventional machine-learning techniques were limited in their ability to process natural data in their raw form. Deep learning, however, obtains multiple levels of representation by composing simple but non-linear modules, each of which transforms the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned [49].
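The effect of composing simple non-linear modules can be seen in a very small example. The sketch below stacks two layers with hand-chosen weights (an assumption made for illustration, since real networks learn their weights) to compute the exclusive-or of two binary inputs, a function that a single linear layer cannot represent.

```python
def relu(x):
    """Simple non-linear module: pass positive values, clamp the rest to zero."""
    return max(0.0, x)


def hidden_layer(x1, x2):
    """First level: two non-linear features of the raw input."""
    return relu(x1 + x2), relu(x1 + x2 - 1.0)


def output_layer(h1, h2):
    """Second level: a linear combination of the hidden features."""
    return h1 - 2.0 * h2


def network(x1, x2):
    """Composition of the two levels."""
    return output_layer(*hidden_layer(x1, x2))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
```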

Deep neural network
The multiple levels of non-linearity in the networks of artificial neurons that make up deep multi-layer neural networks enable such algorithms to compactly represent non-linear and highly varying functions. Neural network-based systems can learn and adapt because they consist of artificial neurons wired into networks arranged in layers, have a loss or optimisation function that drives the learning process, and use a training algorithm that repeatedly updates the parameters [50].
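The three ingredients named above (neurons, a loss function, and a training algorithm that updates parameters) can be shown with a single artificial neuron trained by gradient descent. This is a simplified sketch rather than a deep network: the toy data (the logical OR of two inputs), the learning rate and the number of epochs are all assumptions chosen for illustration.

```python
import math

# Toy training set: the logical OR of two binary inputs
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]


def predict(w, b, x):
    """One artificial neuron: weighted sum followed by a sigmoid."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b):
    """Mean cross-entropy loss over the training set."""
    total = 0.0
    for x, y in DATA:
        p = predict(w, b, x)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(DATA)


def train(epochs=2000, learning_rate=1.0):
    """Gradient descent: repeatedly nudge the parameters to reduce the loss."""
    w, b = [0.0, 0.0], 0.0
    n = len(DATA)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in DATA:
            error = predict(w, b, x) - y  # derivative of the loss w.r.t. z
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        w = [w[0] - learning_rate * grad_w[0] / n,
             w[1] - learning_rate * grad_w[1] / n]
        b -= learning_rate * grad_b / n
    return w, b


w, b = train()
print("loss before:", loss([0.0, 0.0], 0.0), "after:", loss(w, b))
```

Deep networks apply the same loop to many layers of neurons, with the gradients computed by backpropagation.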

Application of data science in the treatment of autoimmune thyroid diseases
Data science encompasses the preparation of data for analysis, which includes aggregating, cleaning, and manipulating the data to uncover patterns and draw out insights. Exploiting historical clinical datasets to improve future treatment choices has proved beneficial for both patients and physicians [43,51]. Machine learning (a branch of artificial intelligence) makes it possible to find patterns within patient data, and exploiting these patterns helps to predict outcomes and treat patients, improving clinical disease management [52].
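The cleaning and aggregating steps can be sketched on a handful of records. Everything in the example below is hypothetical: the field names, the group labels and the values are invented to show the mechanics and do not come from any clinical dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are invented for illustration.
RECORDS = [
    {"patient": "p1", "group": "GD", "tsh": "0.05"},
    {"patient": "p2", "group": "HT", "tsh": "8.2"},
    {"patient": "p3", "group": "HT", "tsh": ""},  # missing measurement
    {"patient": "p4", "group": "GD", "tsh": "0.15"},
    {"patient": "p5", "group": "HT", "tsh": "6.8"},
]


def clean(rows):
    """Drop rows with a missing value and convert the rest to numbers."""
    return [{**row, "tsh": float(row["tsh"])} for row in rows if row["tsh"].strip()]


def mean_by_group(rows):
    """Aggregate: average the measurement within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["group"]].append(row["tsh"])
    return {group: mean(values) for group, values in groups.items()}


cleaned = clean(RECORDS)
print(len(cleaned), mean_by_group(cleaned))
```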
Machine learning also encompasses feature selection algorithms such as Kruskal-Wallis analysis, Fisher's discriminant ratio, and Relief-F. In some studies, these algorithms have been used to analyze databases containing clinical features of identified thyroid disease patients, such as the U.S. Surveillance, Epidemiology, and End Results (SEER) database [51].
Data mining has also become essential in the health care sector, with reported applications in drug delivery, disease prediction and abnormality detection. Electronic health records provide access to vast clinical data, and data mining techniques have helped transform these data into valuable knowledge for making health care decisions [53]. Data mining algorithms have also been applied to health record data sets to analyze factors contributing to autoimmune diseases, such as those associated with thyroid disease [54].
Although the major autoimmune thyroid diseases, Graves' disease and Hashimoto's thyroiditis [55], differ clinically, genetic data show that their pathogenesis shares immunogenetic mechanisms. Shared susceptibility genes include the human leukocyte antigen DR variant containing arginine at position β74 (HLA-DRβ1-Arg74). Exploring the genetic-epigenetic interactions of autoimmune thyroid pathogenesis is essential to uncover new therapeutic targets [55], which underlines the importance of genetic datasets in developing such targets.
Precision medicine has also been implemented in therapeutic approaches to autoimmune thyroid diseases such as Graves' disease [1]. Recent therapies target CD40, a key co-stimulatory molecule usually expressed on antigen-presenting cells, and an anti-CD40 monoclonal antibody has been developed for this purpose [56]. Studies on genetic data suggest that genetic polymorphisms in the CD40 gene drive its expression and the response to anti-CD40 monoclonal antibodies such as Iscalimab (also known as CFZ533), a fully human IgG1 [56,57]. Furthermore, studies have established that thyroglobulin antibody (TgAb) and thyroid peroxidase antibody (TPOAb) are the most characteristic autoantibodies of Hashimoto's thyroiditis [58].
The aim of analyzing datasets (such as genomic datasets and electronic health records) in precision medicine of autoimmune thyroid disease is to determine the treatment options, manner of implementation and choice of therapy. Lastly, this section demonstrates that existing medical datasets have been a reliable strength in clinical prediction, helping medical practitioners make informed and optimized treatment decisions. Figure 1 illustrates the steps in the application of data science to treat autoimmune thyroid disease.
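Of the feature selection methods named above, Fisher's discriminant ratio is the simplest to illustrate. In one common two-class form it is the squared difference between the class means divided by the sum of the class variances, so features that separate the classes well score higher. The sketch below uses that form with invented feature values; it is not the procedure of the cited study.

```python
from statistics import mean, variance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher discriminant ratio for a single feature."""
    return (mean(class_a) - mean(class_b)) ** 2 / (variance(class_a) + variance(class_b))


# Hypothetical feature values for two patient groups (invented for illustration)
features = {
    "well_separated": ([1, 2, 3], [7, 8, 9]),
    "overlapping": ([1, 5, 9], [2, 6, 10]),
}

# Rank features from most to least discriminative
ranking = sorted(features, key=lambda name: fisher_ratio(*features[name]), reverse=True)
for name in ranking:
    print(name, fisher_ratio(*features[name]))
```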

Biological agents in treatment of Graves' disease
Biological agents are usually specific for a given target, and a few have a well-established standard target (e.g. rituximab for B-lymphocytes) [59]. Using specific agents against specific targets is the strategy that may help achieve a cure for this autoimmune disease [60]. Some biological agents involved in novel treatment of Graves' disease include:
a. Rituximab (RTX): rituximab is an anti-B cell agent (a chimeric monoclonal antibody) directed against the transmembrane protein CD20 on B cells (but not plasma cells) [61]. Intraorbital administration of rituximab has been shown to be effective, as opposed to high-dose systemic glucocorticoids, in the treatment of thyroid-related orbitopathy in Graves' disease [62,63].
b. Adalimumab: T cells expressing IGF-1 receptors are thought to play a central role in mediating the autoimmune process in severe Graves' disease [64]. Adalimumab is one of the anti-T-cell agents and seems to have efficacy similar to that of infliximab. It is a human monoclonal IgG1 antibody that binds to both soluble and membrane-bound TNF (tumor necrosis factor); it also fixes complement and induces lysis of cells expressing membrane-bound TNF [64,65].
c. Intravenous immunoglobulin: thyroid-stimulating, but not blocking, autoantibodies are highly prevalent in severe and active thyroid-associated orbitopathy [66]. Therapeutic measures aimed at the autoantibodies may be effective, although it must first be determined whether the presence of such autoantibodies is truly causal or a threat [67].

Application of data science in the diagnosis of Graves' disease
Graves' disease, which primarily affects the thyroid gland, is the most common cause of autoimmune hyperthyroidism. The main auto-antigen in Graves' disease is the thyroid-stimulating hormone receptor (TSHR), expressed primarily in the thyroid and secondarily in adipocytes and fibroblasts, among other sites. It also appears to be closely related to the insulin-like growth factor 1 (IGF-1) receptor [68]. The disorder has systemic clinical manifestations that affect vital organs such as the heart, liver and eyes. Failure to diagnose the disease in time can predispose to thyroid storm, which carries high morbidity and mortality. It is therefore imperative to diagnose and manage the disease early in order to prevent severe cardiac complications such as atrial fibrillation, atrial flutter, and high-output cardiac failure [69].
Data mining and machine learning have been reported to play an important role in diagnosing diseases, as they provide a wide range of accurate classification techniques for disease prediction. Patient data collected from healthcare organizations are useful for assessing the risk factors of diseases such as autoimmune thyroid disease. Classification is one of the most important applications in the data mining field and can be used to make decisions in many real-world problems [51,54]. A recent study used 34 unique clinical variables, such as patients' age at the time of diagnosis and information regarding lymph nodes, to build novel classifiers that distinguish patients who are likely to live for over ten years after diagnosis from those who did not survive at least five years. The study reported 94.5% accuracy in distinguishing patients in terms of prognosis using machine learning [51].
The diagnosis of Graves' disease begins with a thorough history and physical examination. The history includes any family history of Graves' disease, while the physical examination includes assessing goiter size by ultrasound [69,70]. Dr. Cech opened the discussion of precision medicine in the domain of thyroid disease; according to him, the use of radioisotopes to treat hyperthyroidism and thyroid cancer is one of the first uses of precision medicine in thyroid disease [71]. Researchers in endocrine practice have investigated Graves' disease retrospectively by collecting data such as disease severity, smoking rate and severity of orbitopathy [70]. Studies have also reported that TSHR antibodies and activated T cells play a major role in the pathogenesis of Graves' orbitopathy by activating TSHR and IGF-1 receptors on adipocytes and retro-ocular fibroblasts, thereby initiating a retro-orbital inflammatory environment [68].
With the advent of precision medicine, its future application in thyroid dysfunction calls for new approaches to quantifying, detecting, and analyzing biomedical information. Since Robert Graves described Graves' disease, several environmental and epigenetic factors have been found to influence its onset. Some susceptibility elements, such as particular genotypes of HLA, CTLA-4, CD40 or thyroglobulin, have also been identified. Furthermore, recent data have shed more light on how an epigenetic-genetic interaction involving a noncoding single nucleotide polymorphism (SNP) within the TSH receptor (TSHR) gene alters the thymic expression of TSHR, which in turn triggers Graves' disease [72][73][74].
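As a concrete instance of a classification algorithm, the sketch below implements k-nearest neighbors, in which a new case receives the majority label of its closest training cases. It is chosen here only because it is short, and it is not the classifier used in the cited study. The two numeric features, the labels and all values are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical training cases: (two numeric features, label); values are invented.
TRAINING_CASES = [
    ((0.1, 2.5), "positive"), ((0.2, 2.8), "positive"), ((0.3, 2.2), "positive"),
    ((7.5, 0.6), "negative"), ((9.0, 0.5), "negative"), ((6.8, 0.7), "negative"),
]


def knn_predict(query, cases, k=3):
    """Return the majority label among the k training cases nearest to query."""
    nearest = sorted(cases, key=lambda case: math.dist(query, case[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict((0.15, 2.6), TRAINING_CASES))
print(knn_predict((8.0, 0.55), TRAINING_CASES))
```

In practice, features measured on different scales are usually normalized before distances are computed, and k is chosen by validation on held-out data.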

Application of data science in the diagnosis of Hashimoto's thyroiditis (HT)
Hashimoto's thyroiditis (HT), also known as chronic lymphocytic thyroiditis or chronic autoimmune thyroiditis, is one of the common autoimmune thyroid diseases; it can increase vulnerability to tumors and raise the chances of developing chronic heart disease [75]. The biochemical markers for Hashimoto's thyroiditis are thyroid peroxidase and thyroglobulin autoantibodies in the serum, with greater predominance in females than in males. The most significant biochemical feature of the disease is the presence in the patients' serum of thyroid autoantibodies (TAbs) against two vital thyroid antigens, thyroid peroxidase (TPO) and thyroglobulin (TG) [76]. The diagnosis of Hashimoto's thyroiditis is often controversial, and sometimes a proper diagnosis is reached only at a late stage of the disease. The use of data science to predict the presence of this dysfunction is therefore key to modern precision medicine. One approach is the epidemiological study of the disease pattern in areas where iodine intake is normal or excessive, taking into account age and the occurrence of autoimmune thyroiditis in monozygotic compared with dizygotic twins [77].
Hashimoto's thyroiditis is diagnosed on finding a diffuse, smooth, firm goiter in a young woman, with strongly positive titers of TG Ab or TPO Ab and a euthyroid or hypothyroid metabolic state. This disease, caused by immunological damage, can be severe and can lead to further complications. Reviews of autoimmune hypothyroidism in monozygotic twins show a concordance rate below 1, which is attributable to environmental factors and makes these factors etiologically significant [78]. In precision medicine, genomics can be used to diagnose autoimmune thyroid disease, especially Hashimoto's thyroiditis. Genotyping analysis can reveal the genes that are susceptible to environmental factors such as endocrine disruptors, taking into account the influence of age, weight, sex, timing, and race on endocrine levels [76].

Pathogenesis of autoimmune Hashimoto thyroiditis
The presence of thyroid autoantibodies (TAbs) in the patients' sera is the principal biochemical characteristic of HT. The TAbs are directed against two major antigens, thyroid peroxidase (TPO) and thyroglobulin (Tg). TPO is crucial for thyroid hormone synthesis and is located on the apical membrane of thyrocytes, while Tg is a large glycoprotein within the follicular cells of the thyroid gland that serves as storage for thyroid hormones [76][77][78].
The principal factor driving the pathogenesis of HT is the antibodies against TPO (TPOAbs) and Tg (TgAbs), which belong to the immunoglobulin G (IgG) class. Unlike TgAbs, TPOAbs damage thyroid cells through antibody-dependent cell cytotoxicity, although both show great affinity for their respective antigens. However, studies have reported that both have a limited role in the pathogenesis of HT, whereas T-cell cytotoxicity and activation of apoptotic pathways influence disease onset [77,78]. TAbs serve as a biomarker for thyroid autoimmunity: TPOAbs are present in over 90% of HT patients, while TgAbs are present in 80% [77]. Also, T helper type 2 (Th2) cells have been reported to cause excessive stimulation of B cells and the production of plasma cells that produce antibodies against thyroid antigens, leading to autoimmune thyroiditis [78]. Table 1 shows some factors that can influence HT [77,79].

Importance of data science in thyroid diseases
Studies have reported a vast number of prediction algorithms that help in classifying, monitoring and suggesting treatment regimens for thyroid diseases; the importance of data science therefore lies in enabling an early approach to the diagnosis, prognosis and treatment of thyroid diseases. Below are studies that achieved a high percentage of accuracy with new data approaches to investigating and treating thyroid diseases.
Since proper interpretation of thyroid function data is an important issue in the classification of thyroid disease [80], the thyroid disease dataset from the UCI machine learning repository has been used in comparative thyroid disease diagnosis. This was done using probabilistic, multilayer and learning vector quantization neural networks [81]. Likewise, Polat et al. used a dataset from the UCI machine learning repository to diagnose thyroid diseases by hybridizing AIRS (artificial immune recognition system), first proposed by A. Watkins, with a newly developed fuzzy weighted pre-processing. The classification obtained in this study is about 85% accurate [80].
DOI: http://dx.doi.org/10.5772/intechopen.101220
Moreover, Ruggeri et al. used recorded medical histories, selected autoantibody profiles and physical examinations to delineate clinical patterns in patients with Hashimoto's thyroiditis from pediatric/adolescent to adult age. They found a high prevalence of non-thyroidal autoimmune diseases (NTADs) in HT patients, which is also influenced by the patient's age [82]. NTADs should therefore be watched for in patients with confirmed Hashimoto's thyroiditis. Hence, exploring clinical datasets with data science has helped in the prognosis of autoimmune thyroid disease.
One recently proposed high-accuracy system is the Expert System for Thyroid Disease Diagnosis (ESTDD), an expert system that diagnoses thyroid diseases via neuro-fuzzy rules with about 95% accuracy [54,83].
In addition, classification-based data mining has played an important role in supporting diagnosis, decision making and proper treatment of thyroid diseases at an early stage. Some data mining algorithms have shown very high accuracy, speed and performance, and low cost for treatments [54]. Examples of algorithms that help to find better treatments for thyroid patients are kNN (k-nearest neighbor), support vector machine, ID3ara and Naïve Bayes [54]. Lastly, a novel intelligent hybrid decision support system was used in the diagnosis of thyroid disorders; its classification was sensitive, specific and accurate (94.7%, 99.7% and 98.5%, respectively). It was also reported that this approach can be applied to other deadly diseases [84].
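Sensitivity, specificity and accuracy, the three measures quoted above, are all derived from the four cells of a confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical and are not taken from the cited study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of diseased cases correctly detected
    specificity = tn / (tn + fp)  # share of healthy cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts: 90 true positives, 10 false negatives,
# 180 true negatives, 20 false positives.
sensitivity, specificity, accuracy = classification_metrics(90, 10, 180, 20)
print(f"{sensitivity:.1%} {specificity:.1%} {accuracy:.1%}")
```

Reporting all three matters because accuracy alone can look high when one class is much more common than the other.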

Challenges in diagnosing and treating autoimmune thyroid disease
Given the relative ease of diagnosing and treating thyroid disease, expectations are high for a specific and personalized approach to the diagnosis and treatment of such disease. However, some aspects of the methods of diagnosis and treatment need improvement to enhance the health of thyroid disease patients. Table 2 discusses a few of the challenges that have been identified in the management of thyroid-related diseases.

Conclusion
Data science has been shown to be a useful tool in preparing, aggregating, cleaning, and manipulating clinical data to uncover disease patterns and draw insights into how the disease can be treated. Also, genomic datasets in databases have been utilized in precision medicine to diagnose and treat patients. These facts give a green light to the use of data science by medical practitioners and researchers in the near future.

Recommendations
It is recommended that data science be incorporated into clinical practice to improve precise targeted immune therapy for autoimmune thyroid diseases. Also, it is recommended that more research be carried out using genomic data to further bolster the precision from these data in the diagnosis and treatment of individual patients.
© 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.