Computational Modeling Studies of the LRRK2 Protein in the Mechanism of Parkinson’s Disease

Parkinson’s disease is a disorder of the nervous system, affecting movement. The most common gene mutation is said to be in the Leucine-rich repeat kinase 2 (LRRK2) gene, which provides instructions to make the LRRK2 protein. The functions that the protein carries out results in abnormalities, leading to Parkinson’s disease. If a molecule binds to the active site of LRRK2 protein, it could lead to potential treatment. To create the potential drug molecule for treatment of Parkinson’s disease, the IC50 value is an essential experimental parameter to know, because it determines the effectiveness and strength of binding to the protein. Therefore, this research determined whether or not there was a correlation between the IC50 value and the total energy, dipole moment, or energy gap of a molecules that are said to bind to LRRK2, with experimentally measured IC50 values. The energy gap was established by subtracting the HOMO from the LUMO. The total energy, dipole moment, and HOMO/ LUMO values were found by modeling and running all molecules using the molecular modeling program, Gaussian 09. After all the values were found, combinations were made, where either IC50, 1/IC50, or log IC50 was the x-axis, and the total energy, energy gap, or dipole moments was the y-axis. A strong correlation was found between 1/IC50 and total energy, with the R 2 value being 0.94. Therefore, these results support the conclusion that a strong correlation was found. This can now be used as a reference to find the IC50 value when doing further experimentation on designing new molecules in which the IC50 values have not be experimentally measured. This analysis is the first step in assisting researchers to reach potential drug molecules for treating Parkinson’s Disease.


Introduction
Globally, cancer has been one of the biggest leading causes of health issues as of right now. More specifically, Parkinson's disease is one of the most common cases in the United States, receiving more than 200,000 cases per year. It is said that approximately 10 million patients worldwide are living and suffering from this disease [1]. Approximately 10% of the cases received have been because of gene defects. Consequently, this makes Parkinson's disease the 14th top cause of death in the United States, and there is currently no treatment for this disease. Nearly every patient diagnosed starts off small, but progression soon takes over the entire body [1,2]. Although there is no cure for Parkinson's, there are some medications which help with the symptoms. However, these medications aren't enough. Thus, the intention of this project is to develop potential drug treatments to modulate the continued development of the Parkinson's disease.
Parkinson's disease is a severe neurodegenerative disorder of the central nervous system, affecting movement. This chronic and progressive disease is characterized by numerous malfunctions, such as tremors, bradykinesia, rigidity, and postural instability. The neuropathology of Parkinson's disease was distinguished and examined in postmortem brains, and it can be described by the following: The neurotransmitter, dopamine, is produced and secreted in the striatum by a loss of pigmented neurons in the substantia nigra, that are additionally accompanied by Lewy bodies [3]. A continuous loss of dopaminergic neurons results in reducing the capacity of the remaining neurons in the striatum to produce a sufficient amount of dopamine at a time. This deficiency alters the neural circuit in the basal ganglia, which is a region of the brain responsible for controlling movement [2]. The most common genetic cause is from a mutation occurring in the Leucine-Rich Repeat Kinase 2 (LRRK2) gene. A function of this gene is to give instructions in order to make a protein called dardarin. Because one section of this protein contains a large number of amino acids, it is found to have a leucine-rich region. Proteins with these regions appear to be a part of required interactions with other proteins, for instance when carrying signals or assembling a cell's structure. Additional studies exhibit that the functions the dardarin protein is represented as kinase activity, which assist in attaching/transferring phosphate groups in proteins. This transfer is known as phosphorylation. When the phosphate group is added to the protein, the brain is told to change its cell manufacturing process. Despite that, when a protein gets attached a phosphate group by a LRRK2 gene with an abnormalitie, the brain starts to over-manufacture the protein. When an excess of protein is created, brain cells start to die [1,2].
A large number of people and organizations are working to join the treatment of Parkinson's, experimenting ideas from doing drug therapy to using technology. Many focus on the LRRK2 gene specifically because it is responsible for the defect when referring to Parkinson's disease. A recent study, concerning the Dawson Duo, demonstrates that removing the phosphate group that attaches to the ribosomal s15 could actually prevent degeneration and brain cell death [2]. They further show that regulating a low dose of anisomycin will help tremendously by blocking protein production [3]. The Michael J Fox (MJFF) Foundation is taking an innovative approach to create drug development around this gene and the resulting protein [4]. This foundation sponsors researchers to first help understand the structure and function of the LRRK2 gene in the development of Parkinson's disease. After studying the mutation and its surroundings, they plan to set a stage to determine how to design a drug. Ted and Valina Dawson did research at the National Parkinson Foundation Center of Excellence at John Hopkins University [5]. They recently published a paper discussing how the most common gene defect leads to cell death and degeneration.
As shown in numerous studies, having a molecule bind to LRRK2 gene may result in a big step towards creating a drug to help in the treatment of Parkinson's disease [5]. Because of this, the hypothesis of this experiment states "If certain molecules can bind to the LRRK2 active site, then the mutated structure of the LRRK2 protein, associated with Parkinson's disease and its enzyme function, is seriously modified by the structural modifications of the mutated protein molecule LRRK2 and a correlation can be found between its binding affinity and energies" [5][6][7][8]. In computational molecular modeling, the first step to achieve the stated goal, is to create a molecule that could potentially bind to the target protein. One goal of this research is to model many molecules that are said to bind to the LRRK2 protein, using the quantum computational molecular modeling program. After collecting all data and recording all values that were calculated, such as the total energy, lowest unoccupied molecular orbital(LUMO)/highest occupied molecular orbital values (HOMO), dipole, energy gap (which was given by subtracting the HOMO from the LUMO), etc, the plan is to plot each value versus the IC50 (measure of the effectiveness of a substance in inhibiting a specific biological or biochemical function or the concentration of a potential drug molecule with 50% inhibition) value to find a correlation [7,8]. This correlation will help in the future when creating a new molecule, because the IC50 value can be predicted from the line of best fit. Figure 1: All of the molecules that are used and said to bind to LRRK2.

Methods
The project began with researching and analyzing the native structure of the LRRK2 enzymatic protein. Along with this, the seven molecules, that are said to bind the LRRK2, were identified and studied (Figure 1), and the IC50 value of each molecule was recorded. Because there were ranges given for some molecules, three groupings were made: the lowest IC50 value, the highest IC50 value, and the average of the highest and lowest IC50 value.
Using the computational molecular modeling program Gaussian 09 [9], each molecule was modeled by first opening the program, and creating a new molecule group. After doing this, ring fragments inserted and elements and bond types/ sizes were made and changed corresponding the molecule being modeled. Each molecule was then optimized to the lowest energy structure using the Gaussian 09 computational molecular modeling program. LUMO/HOMO Energy Gaps, dipole moments and total energy were calculated.

Figure 2:
IC 50 values were gotten from a previous experimental research already done, and the total energy was found using the program, Gaussian 09.Because a range of IC 50 values were found, the lowest, highest, and average of the two were graphed. Depicted on the graphs are how the IC 50 values are proportional to the total energy, and the linear equation and R 2 value is given.   Figure  3. However a point was removed as an outlier (which is defined more than two standard deviations away from the trendline). The R 2 value turned out to be greater.
From the data calculated, several plots were generated to see if there would be some type of correlation. The data was placed into an Excel spread sheet, and plotted. For every plot, there were different combinations of calculated parameters, with the "x" axis always being various functions of IC50 (i.e. IC50, 1/ IC50, log IC50). As a result, the following were graphed: IC50  Figure 2 shows the IC50 values vs. the total energy. Figure 3 shows the 1/IC50 values for the lowest, highest, and average vs. the Total Energy with the R 2 is shown. Figure 4 shows the two graphs are the same as the ones in Figure 3. However, with one point removed as an outlier (which is defined more than two standard deviations away from the trendline). The R 2 value turned out to be higher with the removal.
Conclusions were made based on the "R 2 " value and the trendline. If there was a data point more than two standard deviations away from the trendline, the point was removed. Also what is meant by low, average and high, is that experimental data often had a range of values. What were plotted was first the lowest values vs. the particular property; then the average of the high and low values vs the particular property; and then the high values vs. the particular property. In addition, although all of the above cited plots were made, only the plots relevant to the parameter which showed correlation is shown in this manuscript.
In order to determine whether or not there was a correlation between the IC50 and the other values, such as the LUMO/HOMO gap, total energy, dipole moment, etc., the molecules that are said to already bind to the LRRK2 protein were modeled using the Gaussian 09 molecular modeling program. After running each molecule, the data for each molecule was recorded, and numerous graphs were made to see if a correlation could be determined. Knowing the strength of binding to the LRRK2 protein molecule has important consequences. Binding affinity is the capability of ligands to form corresponding bonds with a receptor, and the interaction forces of attraction affect the strength of a binding affinity. This includes calculating the total energy, LUMO/HOMO energy gap, dipole moment and correlating it to the IC50 value.
As seen in Figure 3 & Figure 4, there was very good correlations found between the 1/IC50 (average) and the total energy, and there was an especially strong correlation found between the 1/ IC50 (high) and the total energy. The R 2 in the "1/IC50 (average) vs. the total energy" graph was about 0.60. The R 2 in the "1/IC50 (high) vs. the total energy" graph was about 0.59. An "outlier" is defined as "being more than two standard deviations away from the trendline." Therefore, the point was identified as an outlier, and was removed from the data set for those two graphs. Taking out that one point increased the R 2 as displayed in Figure 4 with a new R 2 for the "1/IC50 (average) vs. the total energy" of 0.94, and the new R 2 for the "1/IC50 (high) vs. the total energy" was determined to be 0.96. This strong correlation of these graphs show that the total energy is proportional to the inverse of IC50, with the nature of the system dictating this mathematical relationship.
As a conclusion, the other figures in this research illustrate either no correlation or a very weak correlation, meaning the R 2 value is only about 0.30 or less. A majority of the graphs that compared the IC50 value to the energy gap had no correlation, bringing about a 0.0R 2 value. The graphs that involved the dipole moment had a bit of a correlation, though it appeared to be weak.

Discussion
The mutation in LRRK2 causes abnormalities when doing its function, resulting in Parkinson's disease [10,11]. This unusual expression can be due to numerous reasons, such as week binding to an active site or simply mutations in the genetic code. We, therefore, investigated in discovering whether or not there is a correlation between the IC50 value and total energy/energy gap/dipole moment.
Regarding the strong correlations obtained as an outcome, there was a correlation in the graph "1/IC50 (average and high) vs. Total Energy," as shown in Figure 3. As 1/IC50 increased, the total energy decreased, with the R 2 value being about 0.60. Because an outlier was identified in the two graphs, that specific point was taken out, and the resulting values were plotted and shown in Figure 4. Taking the outlier out of the data set increased the R 2 value up to about 0.95. Such a high value resembles that the data set was very close to the trendline, giving out more accurate results than previously. This correlation indicates a mathematical relationship that portrays the total energy proportional to 1/IC50.
Each figure that compared IC50 to the energy gap and dipole moment had no strong correlation, and the R 2 value was about 0% and 10%, respectively, indicating and supporting the outcome of no correlation. As the IC50 value increased, the energy gap and dipole moment varied greatly. Because the energy gap was the HOMO value subtracted from the LUMO value, we speculate that this energy gap doesn't relate directly because of the LUMO and HOMO values, which affects the energy gap. However, the authors hypothesize that there may be some type of correlation if the energy gap was the difference between the bonding and antibonding values. Essentially, in this case, no speculation can truly be done because the nature of the system dictates the mathematical relationship, leaving no say in matter.
In computational molecular modeling, finding a reference when creating new molecules is necessary and help. Consequently, one of the biggest applications of this project is the fact that it can be used as a reference for future procedures. As a continuation of this research, new molecules can be designed to bind to the active site of LRRK2. However, the IC50 value needs to be accessed because it informs you on how strong your binding affinity is. In order to retrieve the IC50 value of a molecule that was created, long experimental procedures would have to be taken accordingly. Despite that, the new molecule that got assembled can get modeled on the Gaussian 09 program. After running the molecule, you have the ability to figure out the total energy. That total energy established grants you the y-value. Then, you can find that specific number on the y-axis of the graph, and go horizontally, till the trendline is reached. When the trendline is reached, the x-axis of that point can be found. Because the definition of IC50 states the concentration of a potential drug molecule with 50% inhibition, the lower the value, the stronger it is. For the graph with the outcome of a strong correlation, the x-axis is "1/IC50." This means that the highest value of 1/IC50 produces the lower value of IC50. Therefore, the graph can be used to estimate an approximate IC50 value, increasing efficiency and saving time. If the IC50 value is good, then the experimental procedures can be taken in the future.
The results of this experiment supported the hypothesis, which states that a correlation can be found between the molecules IC50 value and energies. The hypothesis was supported because the graph of 1/IC50 (average and high) vs. the total energy not only shows a strong correlation, but verifies the strong correlation because the R 2 value is 0.95. With further investigation and future experimentation, this research can assist other researchers develop new potential drug treatments for Parkinson's disease.

Future Studies
In this study, it was found that there was a great correlation between the "1/IC50 (average and high) vs. the total energy," with the R 2 value (goodness of fit of a model) being fairly high: about 60% with all points, and about 0.94 with all points except the ones with 2 standard deviations away from the trendline. This experimentation provided results that advances a step further the potential treatment for the LRRK2 mutation in Parkinson's disease. The graph that has the best correlation can be used in the future when creating a potential new molecule to bind to the LRRK2 protein. After modifying new molecules, these molecules can be calculated using the Gaussian 09 molecular modeling software. The total energy value of the molecule can assist one in finding the best IC50 value. Using these graphs, further experimentation can be performed to see what modifications are best to produce the best IC50 value.
Furthermore, to continue this research given these promising results, many new molecules could be created to bind to the LRRK2 active site. The inhibitory constant (Ki) values of promising molecules can be tested as well, seeing whether it affects it positively or negatively. Noting how certain elements affect the results can additionally be tested to see what works best. Using the information from this, artificial intelligence could be used to possibly test the efficiency and effectiveness of each molecule.