Design New Compound of Meisoindigo Derivative as Anti Breast Cancer Based on QSAR Approach

Received: 6th December 2019 Revised: 17th July 2020 Accepted: 6th August 2020 Online: 30 September 2020

New meisoindigo derivatives with better predicted anti-breast cancer activity have been designed using a QSAR approach based on semiempirical methods. The research procedure was divided into three steps: molecular modeling, QSAR equation analysis, and design of new compounds. The PM3 method was chosen as the better method because its models are more representative of physicochemical aspects. The best model was selected by multilinear regression statistical analysis. The newly designed compounds are expected to bind to the Cyclin-Dependent Kinase 4 (CDK4) enzyme, which takes part in the cell cycle, and thereby prevent cell division. Based on the design, (E)-2-(2-oxo-1-(2-(trifluoromethyl)benzyl)indolin-3-ylidene)-N-(quinolin-7-yl)acetamide was chosen as the new compound, with better predicted biological activity (log 1/IC50 = 6.992) than the previous compound (log 1/IC50 = 5.823).


Introduction
The discovery of new high-efficacy compounds requires a long series of experimental stages, including design, synthesis, identification, purification, and activity testing. One way to shorten this process is computer modeling. Modeling makes it possible to search for a relationship between the structure, both electronic and geometric, of one molecule or a group of molecules and the activity they are suspected of having [1].
In the field of health, computer-aided modeling is a valuable research tool. For example, treating diseases that attack the brain is complicated because the delivery of drug molecules to the brain is blocked by the blood-brain barrier (BBB). One method to overcome this problem has been developed using a computational molecular dynamics approach [2].
The pharmaceutical discipline that benefits most from these developments is medicinal chemistry, especially the study of the Quantitative Structure-Activity Relationship (QSAR). This is in line with the growing expectation that new drug discovery should become more effective and efficient [3]. Computer experiments are performed using algorithms written in programming languages and based on models developed by theoreticians. This approach allows complex molecular properties to be calculated with results that correlate significantly with laboratory experiments [4].
Drugs usually consist of complex molecules. Research and development of new drugs in the laboratory takes a long time and is costly. In addition, the results may be unsatisfactory, so that a whole series of laboratory work is wasted. Computational chemistry can therefore be crucial in medicinal chemistry, mostly for drug design and for the theoretical study of the chemical properties and biological activity of a molecule [5].
This research attempts to design the structure of new meisoindigo derivative compounds as anti-breast cancer agents through the QSAR approach. Cancer is an abnormal growth of cells in the body's tissues that gradually becomes malignant. These cells grow faster than normal cells and cannot undergo apoptosis. Based on data from GLOBOCAN, the International Agency for Research on Cancer (IARC), there were around 14 million new cases of cancer and 8.2 million deaths from cancer in 2012. The cancers causing the most deaths were lung cancer (19.7%), breast cancer (12.9%), liver cancer (9.5%), and stomach cancer (8.9%). The number of cancer patients is expected to increase every year and is estimated to reach 23.6 million new cases per year by 2030 [6].
Breast cancer is the type of cancer that causes the most deaths in women, especially those aged 40 years and over. Most of these cancers occur in the upper part of the left breast near the arm. Based on data from the Indonesian Ministry of Health's Data and Information Center (2015), the prevalence of breast cancer in West Nusa Tenggara Province was 0.2% [6].
Meisoindigo derivatives are among the potential anticancer compounds. In vitro tests on breast cancer cells show that meisoindigo derivative compounds have good activity against these cells.
Meisoindigo derivative compounds bind to the Cyclin-Dependent Kinase 4 (CDK4) enzyme, which plays a role in the cell cycle, thereby preventing cell division. This study was carried out on a computer with a Pentium Core i5 6600 processor, 2 GB of RAM, and a 1 TB hard drive. The software used was HyperChem 8.0 for Windows to build the 3D compound structures and SPSS 16.0 to analyze the QSAR equation.

Tool and Material
The research material was data on 20 meisoindigo derivative compounds with inhibitory activity against CDK4, obtained from the literature. The structure of the parent compound of the meisoindigo derivatives can be seen in Figure 1. The compounds and their biological activities, listed in Table 1, were divided into two parts: a training set and a test set.

QSAR Equation Analysis. QSAR equation analysis was performed by statistical analysis using SPSS with a combination of the backward and enter methods. That is, the regression was run both with all variables entered and with a few selected independent variables (descriptors). This produces several equation models relating the physicochemical properties, which are the independent variables, to the anticancer activity (log 1/IC50), which is the dependent variable. The models are then tested for validity [10]. The chosen QSAR model must meet all statistical criteria used and must contain descriptors representing hydrophobic, electronic, and steric parameters. The validity of each equation model was tested by checking that r (correlation coefficient) and r2 (coefficient of determination) are close to 1, that the adjusted r2 is the largest, that SE (standard error) is the smallest, that Fcount/Ftable > 1, and that PRESS (predicted residual sum of squares) is the smallest [11].
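The statistical criteria listed above can be illustrated with a minimal sketch in Python. It is not the SPSS procedure used in the study: the function name `mlr_statistics` is hypothetical, the fit is ordinary least squares with NumPy, and PRESS is computed in its leave-one-out form, which is an assumption because the text does not state which form was used. Ftable is not computed here; it would be taken from an F distribution table for the chosen significance level.

```python
import numpy as np

def mlr_statistics(X, y):
    """Fit y = b0 + X.b by ordinary least squares and return common
    QSAR validation statistics (illustrative sketch, not the SPSS output)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape                      # n compounds, k descriptors
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    resid = y - A @ coef
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    dof = n - k - 1                               # residual degrees of freedom

    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    se = float(np.sqrt(ss_res / dof))             # standard error of estimate
    f_count = ((ss_tot - ss_res) / k) / (ss_res / dof)

    # Leave-one-out PRESS from the hat matrix H = A (A^T A)^-1 A^T
    # (assumed form; requires A to have full column rank).
    hat_diag = np.diag(A @ np.linalg.pinv(A))
    press = float(((resid / (1.0 - hat_diag)) ** 2).sum())

    return {"coef": coef, "r": float(np.sqrt(r2)), "r2": r2,
            "adj_r2": adj_r2, "se": se, "f_count": f_count,
            "ss_res": ss_res, "press": press}

if __name__ == "__main__":
    # Synthetic data for illustration only; these are not the study's descriptors.
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(12, 2))
    y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.1 * rng.normal(size=12)
    for name, value in mlr_statistics(X_demo, y_demo).items():
        print(name, value)
```

Candidate models can then be compared on the returned values: the preferred model has r and r2 close to 1, the largest adjusted r2, and the smallest SE and PRESS.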
New Compound Design. New compounds were designed by modifying the position and type of the substituents. The substituent positions were focused on the active site of the compound, which was chosen because it contains the atoms responsible for the anti-breast cancer activity of meisoindigo derivative compounds. The next step was to run calculations on all new compounds using the best semiempirical method, the same one used to derive the QSAR equation. The theoretical anti-breast cancer activity (log 1/IC50) of the designed compounds was then calculated using the selected QSAR equation. Compounds with a high log 1/IC50 value have high anti-breast cancer activity and can be proposed for synthesis [12].
This study uses semiempirical methods. Two semiempirical methods were compared (AM1 and PM3), and the better one was used as the basis for designing new compounds. The best method was selected based on how effectively it uses the descriptors and on its statistical parameters.

Results and Discussion
Organic compounds are commonly modeled with semiempirical methods, especially the AM1 and PM3 methods. Meisoindigo is an organic compound, so the same methods were used in this research. One of the semiempirical methods will give results closer to the actual molecule than the other. It is therefore necessary to compare the molecular modeling results of the two methods.
The parent compounds used in this study were meisoindigo derivatives (Figure 1), which were optimized using previously validated methods (semiempirical PM3 and AM1). These compounds are then drawn as stick models labeled with the net atomic charge of each atom, as shown in the corresponding figure. The biological activity obtained from the literature [6] is the half-maximal inhibitory concentration, or IC50. In this study, the IC50 data were converted to logarithmic form to facilitate data analysis, so that the values are not spread too widely. The logarithmic form used is log 1/IC50, which means that the higher the log 1/IC50 value of a compound, the better its activity compared with a compound with a lower log 1/IC50.
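The conversion from IC50 to log 1/IC50 can be sketched as follows. The function names are hypothetical, and the unit of IC50 is assumed to be mol/L because the text does not state it.

```python
import math

def log_inverse_ic50(ic50_molar):
    """Return log10(1 / IC50). IC50 is assumed to be in mol/L (an assumption)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return math.log10(1.0 / ic50_molar)

def ic50_from_log(log_value):
    """Inverse conversion: recover IC50 (mol/L) from a log 1/IC50 value."""
    return 10.0 ** (-log_value)

if __name__ == "__main__":
    # A smaller IC50 (more potent compound) gives a larger log 1/IC50.
    for ic50 in (1e-5, 1e-6, 1e-7):
        print(ic50, log_inverse_ic50(ic50))
```

Because the transformation is monotonic and decreasing in IC50, ranking compounds by log 1/IC50 from high to low is the same as ranking them by IC50 from low to high.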
The descriptors used in this study represent electronic, hydrophobic, and steric parameters. The electronic parameters consist of the net atomic charge (q), isolated atomic energy (Eat.is), dipole moment (p), hydration energy (EH), and binding energy (Eb). The hydrophobic parameter is the n-octanol/water partition coefficient (Log P), while the steric parameters consist of the gradient (Gr) and the approximate surface area (SA).
The net atomic charge descriptor (q) was chosen because atomic charge is critical in determining the chemical reactions and physicochemical properties of a compound, and it is useful for measuring intermolecular interactions. The net charge of an atom can be positive or negative, depending on the groups bound to the atom. Positive atomic charges are caused by the presence of electron-withdrawing groups, which reduce the electron density. Negative charges are attributed to electron-donating groups (methyl, alkyl, and halide groups), which increase the electron density.

The QSAR equation can be obtained from various multivariate statistical methods that produce satisfactory results. The most widely used is regression analysis. This method correlates several independent variables x (physicochemical parameters in the Hansch method, or indicator variable values in the Free-Wilson method) with a dependent variable y (the biological activity parameter of the compound).

The AM1 method was validated by predicting anti-breast cancer activity using the models chosen according to the statistical parameters in Table 2 (models 4, 6, 7, and 9), and then plotting the experimental log 1/IC50 against the predicted log 1/IC50. Table 4 compares the proposed models. Figure 3 shows the relationship between experimental and predicted log 1/IC50 for models 4, 6, 7, and 9.

The best model obtained for the AM1 method uses the isolated atomic energy, the dipole moment, and the atomic charges of the C9, O10, and C13 atoms. The model does not include the oil/water partition coefficient and involves only three atomic charges.
The partition coefficient, abbreviated P, is defined as the ratio of the concentrations of a solute in two immiscible solvents (a two-phase liquid system), specifically for the non-ionized solute. If one of the solvents is water and the other is a non-polar solvent, then the log P value is a measure of lipophilicity or hydrophobicity. For this reason, the log P parameter should theoretically be included in the best model. On this basis, the PM3 semiempirical method was judged to give the best model. In addition, this method is suitable for many organic molecules and has been shown to optimize carbon structures well [13].
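As a compact restatement of this definition, assuming the n-octanol/water system named among the descriptors, the partition coefficient and its logarithm can be written as follows. The bracket notation for concentration is a convention introduced here, not taken from the source.

```latex
P = \frac{[\text{solute}]_{\text{n-octanol}}}{[\text{solute}]_{\text{water}}},
\qquad
\log P = \log_{10} P
```

Under this convention, log P > 0 indicates that the non-ionized solute prefers the n-octanol phase (more lipophilic), and log P < 0 indicates that it prefers water.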
The best model for the AM1 method is model 7, with the equation:

Log 1/IC50 = 3.523 + (-0.00003062 × Eat.is) + (-0.237 × p) + (3.495 × qC9) + (3.729 × qO10) + (0.862 × qC13)

The PM3 method was validated by predicting anti-breast cancer activity using the models chosen according to the statistical parameters in Table 3 (models 4 and 5). The comparison between experimental and predicted log 1/IC50 for the PM3 method can be seen in Table 5.

For these reasons, the PM3 semiempirical method was chosen as the best method. Model 5 of the PM3 method was used as the guiding model for designing new meisoindigo derivative compounds and predicting their anti-breast cancer activity. In designing new compounds, the main compound structure was selected from the meisoindigo derivative with the best anti-breast cancer activity. The design begins by modifying the type and position of the substituents.
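How a fitted QSAR equation is applied to a compound can be sketched with the AM1 model 7 equation. The coefficients are taken from that equation as read here, with 3.495 and 3.729 for the qC9 and qO10 terms. The dictionary keys, the function name, and the descriptor values in the usage example are hypothetical and are not data from the study. The PM3 model 5 equation, which the study actually uses for design, is not given in this section and would be applied in the same way.

```python
# Coefficients of the AM1 model 7 equation as read from the text.
AM1_MODEL_7 = {
    "intercept": 3.523,
    "E_at_is": -0.00003062,  # isolated atomic energy
    "p": -0.237,             # dipole moment
    "qC9": 3.495,            # net atomic charge on C9
    "qO10": 3.729,           # net atomic charge on O10
    "qC13": 0.862,           # net atomic charge on C13
}

def predict_log_inv_ic50(descriptors, model=AM1_MODEL_7):
    """Evaluate a linear QSAR equation: intercept + sum(coefficient * descriptor)."""
    total = model["intercept"]
    for name, coefficient in model.items():
        if name == "intercept":
            continue
        total += coefficient * descriptors[name]  # KeyError if a descriptor is missing
    return total

if __name__ == "__main__":
    # Hypothetical descriptor values, for illustration only.
    example = {"E_at_is": -90000.0, "p": 3.0,
               "qC9": 0.10, "qO10": -0.30, "qC13": 0.05}
    print(predict_log_inv_ic50(example))
```

Keeping the coefficients in a dictionary makes it straightforward to swap in another model with a different descriptor set, provided every descriptor named in the model is supplied for each compound.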
The descriptors of the new compounds obtained from these calculations are then entered into the best QSAR equation model (model 5, PM3 method). The new meisoindigo derivative compounds designed theoretically have relatively good inhibition, with predicted log 1/IC50 values from 5.120 to 6.992. The calculation results can be seen in Table 6. A higher log 1/IC50 value, or equivalently a smaller IC50 value, means better activity.
Among the newly designed compounds, those with log 1/IC50 values higher than that of the previously synthesized compound can be selected. On this basis, the best compound, with higher activity than in the previous experiments, is (E)-2-(2-oxo-1-(2-(trifluoromethyl)benzyl)indolin-3-ylidene)-N-(quinolin-7-yl)acetamide, shown in Figure 5. The proposed compound has a log 1/IC50 value of 6.992, which is higher than the previous log 1/IC50 value.