Multi-Target In-Silico modeling strategies to discover novel angiotensin converting enzyme and neprilysin dual inhibitors

Cardiovascular diseases, including heart failure, stroke, and hypertension, affect 608 million people worldwide and cause 32% of deaths. Combination therapy is required in 60% of patients, involving concurrent Renin–Angiotensin–Aldosterone-System (RAAS) and Neprilysin inhibition. This study introduces a novel multi-target in-silico modeling technique (mt-QSAR) to evaluate the inhibitory potential against Neprilysin and Angiotensin-converting enzymes. Using both linear (GA-LDA) and non-linear (RF) algorithms, mt-QSAR classification models were developed using 983 chemicals to predict inhibitory effects on Neprilysin and Angiotensin-converting enzymes. The Box-Jenkins method, feature selection method, and machine learning algorithms were employed to obtain the most predictive model with ~ 90% overall accuracy. Additionally, the study employed virtual screening of designed scaffolds (Chalcone and its analogues, 1,3-Thiazole, 1,3,4-Thiadiazole) applying developed mt-QSAR models and molecular docking. The identified virtual hits underwent successive filtration steps, incorporating assessments of drug-likeness, ADMET profiles, and synthetic accessibility tools. Finally, Molecular dynamic simulations were then used to identify and rank the most favourable compounds. The data acquired from this study may provide crucial direction for the identification of new multi-targeted cardiovascular inhibitors.


Dataset collection, curation, and descriptor calculations
The molecular structural and biological data for 983 compounds which includes ACE (474) and NEP (509) with inhibitory activity against ACE and NEP enzymes of Rattus norvegicus (Supporting file S1) were extracted from the ChEMBL database (https:// www.ebi.ac.uk/ chembl).The selected compounds dataset was subjected to biological and chemical curation 49 by removing the molecules that lack information such as SMILES, units of activity, and duplicates.Structures were standardised, neutralised, and cleaned and salts were removed to get a dataset of 715 compounds which makes ACE (357) and NEP (358) (Supporting file S2) followed by 3D optimization using MMFF94 force-field by OpenBabel 50,51 .we have calculated a variety of descriptors, namely, Py-Descriptor (Constitutional, geometric, circular fingerprint, quantum chemical and topological, no. of descriptors > 16,250), Alvadesc-Descriptors (Constitutional, topological, connectivity indices, 2D matrix-based, ETA indices, atom type E-state, functional group count, 2D atom pair, atom-centered fragments, molecular properties and drug like indices, no. of descriptors ≥ 3117), PaDEL-Descriptor 2.20 (for extended topochemical atom indices, no. of descriptors = 242) through freely available web server OCHEM (https:// ochem.eu/ home/ show.do) (Supporting file S3).
Alvadesc descriptors calculated based on chemical structure do not show any discrimination when a specific molecule is assayed under more than one experimental condition.Box-Jenkins operators provide a solution to the above problem, as they calculate successive average values of a defined property at different time intervals.In Box-Jenkins operators are used to calculate modified descriptors which are capable of discriminating the influence on the chemical structure when a specific molecule is assayed under more than one experimental condition (Supporting file S4).In Box-Jenkins-based mt-QSAR modeling, the arithmetic average of any molecular descriptors for a specific experimental condition is calculated as followswhere the avg(D i )c j is thus the arithmetic mean of the descriptors (D i ) for a specific experimental condition (c j ).After generating the avg(D i )c j values, the final modified descriptors (∆(D i ) c j ) are subsequently generated using the following formulawhere ∆(D i )c j is a deviation descriptor that measures to what extent a chemical structurally deviates from a set of compounds assigned as active and tested against the same experimental condition.

Development of mt-QSAR model
The QSAR-Co program develops mt-QSAR models employing GA-LDA and RF algorithms simultaneously predict two endpoints, angiotensin-converting-enzyme, and neprilysin enzyme inhibition, under a variety of experimental and theoretical conditions using a single QSAR model Eq. 52.Three experimental conditions (Tn-ACE/NEP, St-IC 50 /KI, At-B/F) were considered for developing respective mt-QSAR models (denoted as 'C').The experimental condition Cj, a combination of conditions, is represented as an ecosystem with Tn, At, and St. Data points are tagged for target inhibition (TN) under Cj, with TNi(Cj) representing active or inactive classes.In the class assignments, compounds with IC 50 /Ki values ≤ 600 nM were classified as active, while the other data samples were considered inactive.The cutoff value selected in the sub-micromolar range confirms the meticulous search for potent hits 53,54 .The selected cut-off upholds the number of molecules annotated as active as high as possible and magnifies the chemical diversity, necessary to rationally design new molecules.This facilitates a way to have a balance between the number of molecules assigned as active and those labeled as inactive.The dataset in this work was randomly split into a test set (30%) and a training set (70%) using the random approach, to create an mt-QSAR model for dual endpoint ACE and NEP inhibition prediction.Two machine learning approaches, Random Forest (RF) 55 and Linear Discriminant Analysis (LDA) 56,57 , were used to construct the final mt-models in the QSAR-Co programme.The mt-QSAR models were constructed using the QSAR-Co software's default parameter settings.The sequential steps involved in the development of mt-QSAR models are depicted in Fig. 1.

Validation of mt-QSAR model
Based on the qualitative metrics for validation calculated for the training set, the best LDA and RF models were assessed and selected.The selected models were then externally validated using the test set.For internal and external validation, respectively, qualitative validation measures including accuracy, precision, sensitivity, specificity, MCC, and F-score were computed for the training and test sets 58 .In order to ascertain the developed LDA and RF models' ability to discriminate, the receiver operating characteristics (ROC) curve and area under the curve (AUC) were examined 58 .Using the Y-randomization test, which involves randomly generating 50 LDA models after the dependent variable (response class) of the training set is scrambled 50 times, the robustness of the LDA model was additionally evaluated 59 .To rule out the possibility that the original LDA model was created by chance, the Wilks λ parameter of the model was also compared to the Wilks λ values of 50 random models.Ultimately, the standardization technique was used to identify the applicability domain for both LDA and RF models 60 .
The chemical similarity analysis with the compounds database in CAS reveals that designed chemical structures are novel and have not been studied previously for RAAS and NEP inhibition.All the designed compounds were sketched and were 3D optimized using MMFF94 force-field by OpenBabel, and descriptors (PyDescriptor, Alvadesc, Estate, and Padel) were calculated.The optimized designed compounds' applicability domain was predicted using the most robust and statistically acceptable QSAR models.

Screening of designed derivatives using mt-QSAR models
In addition to model development, the QSAR-Co tool also allows the screening of large datasets.Using this screening facility, the mt-QSAR-LDA and RF models were used to screen external validation sets of designed chalcone, 1,3-thiazole, 1,3,4-thiadiazole, and quinazoline derivatives.This helps in the classification or determination of two classes of active dual inhibitors viz.i) ACE and NEP enzymes.

Constraint-based molecular docking study of selected designed derivatives
In this analysis, we have applied the molecular docking studies to investigate the binding pattern of selected screened designed molecules with the cACE (PDB ID: 1O86) and NEP (PDB ID: 5JMY) obtained from the protein data bank (https:// www.rcsb.org). (1)

A platform for molecular docking
The computational docking assessment of selected compounds with ACE as a target was executed via MOE 2019 software (Chemical Computing Group, Montreal, Canada), software.

Protein selection
The X-ray crystal structures of the target proteins selected for molecular docking are given in (Fig. 3).i) cACE (PDB ID: 1O86, crystal structure of human ACE in complex with Lisinopril, Resolution: 2.00 Å) and ii) NEP (PDB ID: 5JMY, NEP complexed with LBQ657, Resolution: 2.00 Å).

Molecular docking
Protein preparation.The selected models were prepared by deleting co-crystallized water molecules, unwanted chains, and nonstandard residues.The free target protein was then subjected to the QuickPrep procedure of MOE including corrections for missing atoms, alternate geometries, or other crystallographic artifacts, removing water molecules farther than 4.5 Å from any receptor or ligand atom, and 3D protonation.In the case of cACE and NEP docking, pharmacophore constraints were generated using the pharmacophore query editor containing metal chelation constraint and one positional constrain amide/amine Nitrogen with 1 Å constraint sphere (Fig. 4).
Ligand preparations.All chemical structures are sketched in MarvinSketch 61 and converted into SDF format using OpenBabel considering 2D geometry optimization 62 .The ligands dataset was further subjected to 3D optimization using MMFF94 force field using the Konstanz Information Miner (KNIME) workflow (https:// www.knime.org/) 50 .www.nature.com/scientificreports/Docking procedure.The selected metal chelation constraint and one positional constraint amide/amine nitrogen constraints were implicated in the docking using the pharmacophore placement method at the site centered on co-crystallized ligand atoms and the top 1000 poses ranked by the London dG scoring function.From these poses, the best 30 poses were ranked and then minimized using MMFF94 × forcefield within a rigid receptor.The resulting poses were then refined and scored using the Generalized-Born Volume Integral/Weighted Surface area (GBVI/WSA) dG scoring function which estimates the binding free energy for an obtained pose of the  ligand.The final results were analyzed, and visualized based on docking scores and pose using Discovery Studio 2020 Client 63 , and PyMol software 64 considering bound ligand as standard.Visualization of protein-ligand interaction reflects, the number of interactions and active residues responsible for the significant binding at the active site target enzyme.

Molecular dynamics study and MM-GBSA calculations
Desmond version 2020.1 with OPLS3e force field from Schrodinger was used to study the dynamic behavior of selected molecule complexes in the presence of explicit water molecules 65 .The System Builder module was used for system preparation using the SPC module for solvation and volume occupancy in an orthorhombic box with periodic boundary conditions.The solvated system was neutralized by the addition of appropriate anion (Cl -) and cation (Na + ) with a salt concentration of 0.15 mol.The Nose-Hoover chain coupling approach was employed to build up the NPT ensemble with a temperature of 300 K, leisure time of 1.0 ps, and pressure of 1 bar, which was once as soon as maintained in all simulations using a 2 fs time step.The barostat approach with the Martyna-Tuckerman-Klein chain coupling scheme was originally utilized for pressure control with a leisure time of 2 ps.The particle mesh Ewald technique was used to calculate long-range electrostatic interactions with a radius of 9 for Coulomb interactions.The non-bonded forces were estimated using the RESPA integrator.The root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), and protein-ligand interactions were assessed to check the stability of the complex in MD simulations 66 .
Moreover, Prime Molecular Mechanics with Generalised Born Surface Area (MM-GBSA) Schrodinger, NY, 2019 (Release, 2017) was used to ascertain the relative binding affinity of the ligands towards selected target proteins.The solvent model and force field for the MM-GBSA 67 computations were OPLS3 65 and VSGB 68 .The binding free energy in MM/GBSA was calculated using the equation that follows 69,70 .www.nature.com/scientificreports/

Drug-likeness, PAINS assay, and In silico toxicology study of selected design compounds
In this study, we conducted a comprehensive evaluation of selected design compounds through a multi-step process.First, compounds were filtered based on drug-likeness using Lipinski, Ghose, Veber, Egan, and Muegge rules via the SwissADME webserver (http:// www.swiss adme.ch).Subsequently, a PAINS assay was performed to identify potential interference compounds.In silico toxicology, predictions were then carried out, including LD 50 estimation using Protox-II webserver (https:// tox-new.chari te.de) and calculation of mutagenicity, carcinogenicity, hepatotoxicity, cardiotoxicity, etc. using VEGA-QSAR software (https:// www.vegah ub.eu/ portf olio-item/ vega-qsar) 71 .The integrated results informed the identification of promising compounds with favorable drug-like properties and reduced risk of adverse effects, guiding further drug discovery efforts.

Ethics approval
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Development of multi-target QSAR models to screen novel-designed chalcone, 1,3-thiazole, and 1,3,4-thiadiazole derivatives as dual inhibitors of ACE and NEP enzymes
The developed models have significant discriminating power, as demonstrated by the optimum values obtained for statistical parameters including accuracy, precision, sensitivity, specificity, F-measure, and Mathew's Correlation Coefficient (MCC).All of the models demonstrated similar results for each compound in the test set.Thus, it can be concluded that the developed mt-QSAR models (LDA and RF) are properly capable of determining two endpoints for dual inhibition of ACE and NEP of newly designed compounds.

Linear discriminant analysis (LDA) based mt-QSAR model development
The QSAR-Co tool's multi-target modeling leverages the Box-Jenkins method.This method modifies molecular descriptors for each dataset compound, incorporating diverse conditions to enhance the predictive power of the model.The mt-QSAR model was constructed using a dataset consisting of a sub-training set (n = 501) and a test set (n = 214).The best-fit mt-QSAR-LDA model was chosen from all models with the least Wilks (λ) train and highest MCC train , as shown in in Table 1 and accompanied by statistical parameters.
Table 1 shows standardized coefficient values, indicating selected descriptor's contribution to inhibitory activity.'JGI8_tn' has maximum (positive) contribution (coefficient value = 137.2343)and 'VE2sign_Dz(i)_at' has the maximum (negative) contribution (coefficient value = − 63.265) towards enzymes inhibition.A succinct explanation of the significance, source, and contribution of each descriptor used for the final LDA model, as shown in Table 2.
The mt-QSAR-LDA model meets the requirements for robustness, quality of fit, and statistical significance.The Wilks λ statistic, with a value of 0.479, indicates the model's adequate discriminatory power.Table 3 presents the classification results for both the sub-training and test sets, providing an overview of the overall performance of the mt-QSAR-LDA model.
Table 3 demonstrates the model's strong discrimination ability.Accuracy reached 91.22% and 88.79% for the sub-training and test sets, respectively.Moreover, it accurately classified 94.37% of active samples and 83.56% of inactive ones in the sub-training set, while achieving similar performance in the test set with 93.17% accuracy for active samples and 75.47% for inactive ones.These results support the high degree of efficiency of the  72 .
Figure 5 shows the receiver operating characteristic curve (ROC) plot for the training and test set.The area under the ROC curve (AUROC) values of 0.9007 and 0.8138 were obtained, indicating good statistical significance of the mt-QSAR model.The higher AUROC for the training set is expected, as the model is built on this data.However, a value of 0.8138 on the test set suggests good generalizability to unseen data.
The Y-randomization test 59 indicates that the mt-QSAR model is not created by chance, as shown in Fig. 6, with Wilk's lambda values (50 model average λ random = 0.9201) significantly higher than the original value (λ train = 0.4791).The QSAR-Co software's standardization approach 60 determined the applicability domain,   www.nature.com/scientificreports/revealing 6 out of 501 training data points and 5 out of 214 test data points as possible outliers and outside the applicability domain.

Non-linear mt-QSAR (Random Forest) model development for dual inhibition of ACE and NEP enzymes
The Random Forest (RF) technique was used to construct a non-linear classification-based mt-QSAR model using training and test sets, developed using QSAR-Co software and Weka version 3.9.3library 73 .The RF model of QSAR-Co, with its default parameters (tenfold cross-validation procedure), demonstrated superior overall statistical prediction quality compared to the LDA model, as detailed in Table 3.The RF model outperforms the LDA model in predicting inactive and active compounds due to its superior specificity, precision, and accuracy values.Accordingly, employing both models would be always beneficial to perform consensus predictions for queries or newly designed compounds.Further, Fig. 7 shows the plots of the corresponding ROC curves for the RF model, and the AUC values for both the training set (= 0.8453) and the test set (= 0.8225) show that the model has significant discriminatory power.Non-linear models with all computed descriptors often produce better predictive models than linear models with a subset of descriptors, but their interpretability is inferior.RF is a method for group categorization that averages predicted outcomes from several different decision trees to produce its predictions.The great precision and superiority of RF have drawn a lot of interest recently [74][75][76][77] .RF has multiple advantages, one of which is that it is less vulnerable to constructing overfitted models.Consequently, RF can be favoured over several other nonlinear machine learning techniques to generate highly accurate mt-QSAR models 78,79 .

Screening of designed derivatives using validated developed mt-QSAR models
Using the facility available to screen large designed datasets in QSAR-Co software, the mt-QSAR models were used to screen externally designed derivatives set of four heterocyclic scaffolds viz.chalcones and its derivatives (235 compounds), 1, 3-Thiazole (24 compounds) and 1,3,4-Thiadiazole (107 compounds).Correspondingly, both LDA and RF models are faster in screening large-size databases with an accuracy of equivalent to 90% and an MCC value greater than 0.5.The selection of ligands is based on two primary criteria.Firstly, the ligand must fall within the applicability domain of our LDA models.Secondly, they are required to show a positive score for activity against both ACE and NEP under any given set of experimental conditions, as indicated by the results from our developed LDA and RF models for each of the designed heterocyclic derivatives.Details of this screened dataset and calculated descriptors, as well as the results of the predictions, are provided in Supporting file S6.Out of the 235 designed chalcone derivatives, 85 compounds meet all the criteria.Likewise, out of the 24 designed thiazoles, 8 meet the criteria; out of the 107 designed thiadiazoles, 12 compounds exhibit positive results.Thus, all designed compounds are screened using developed models, and only those molecules that follow the applicability domain and show positive prediction are processed further for molecular docking study (Fig. 8).Altogether, these diverse statistics demonstrate the high internal quality as well as the predictive power of the derived mt-QSAR models.

Molecular docking studies of selected screened designed derivatives
In an effort to identify new starting point leads for novel multi-target inhibitors of ACE (C-domain selective) and NEP enzymes, molecular docking simulations were performed for the screened active hit suggested by mt-QSAR models from a combinatorial library of designed derivatives.There are a total of 85 chalcone derivatives, 8 thiazoles, and 12 thiadiazoles derivatives selected for molecular docking which passes the AD criteria of developed mt-QSAR models.

Validation of docking protocol
The docking procedure was validated by re-docking the natural ligands (lisinopril and LBQ657 compounds) from crystal structures (PDB ID: 1O86 and 5JMY) into its binding pocket before virtual screening of chosen compounds.The large size and shape of the binding site pose challenges for molecular docking on targets.In cACE and NEP molecular docking calculations, constraints are applied to obtain reliable orientations of ligands, www.nature.com/scientificreports/including hydrogen bonds with His353 and/or His513 (cACE) and His711 (NEP) along with metal-chelator interactions 43,80 .The study found that native ligands in protein structure are maximally superimposed with co-crystallized ligands, confirming the docking protocol's agreement with previous work and confirming the interaction of native ligands(Fig.9) 17,81 .This suggests that the docking methodology is suitable for the virtual screening of dataset compounds.

Molecular docking results
The binding affinity results of the standard drug molecules, Omapatrilate against the cACE and NEP enzyme are summarized in Table 4. Conventional ACE inhibitors bind to the catalytic region of the active sites of cACE and NEP via chelation with the central Zn 2+ , while the groups P2′, P1′, P1, and P2, mimicking substrate peptides, are placed inside these subsites.Most of the selected ligands were successfully docked with a plausible pose into the active sites using one of the applied constraint conditions.The resulting docking pose that has metal-acceptor interaction and bonding interactions with His353 and/or His513 (cACE) or bonding with His711 (nACE) is selected for assessment 80 .The results of molecular docking of selected screened designed derivatives with higher or equivalent docking scores compared to standard drugs with their interacting residues are given in Tables S1 to S3 (Supporting file S8).Chalcone derivatives exhibited varying docking scores for both cACE and NEP (See Supporting file S7).Notably, C115 demonstrated a docking score of − 5.7009 for cACE and − 7.1057 for NEP, while C148 had a docking score of − 5.6612 for cACE and − 7.1215 for NEP.Compound C105, a member of the chalcone derivatives, displayed a favorable docking score of − 5.6880 for cACE.These results suggest variations in the binding affinities of chalcone derivatives to the target enzymes.
Likewise, the 1,3-thiazole derivatives showed a range of docking scores.T1 exhibited a high affinity for the binding of cACE and NEP, with docking scores of − 7.7185 and − 6.5466, respectively.The docking scores for T3 interactions with the enzymes were − 6.9584 for cACE and − 7.1859 for NEP.In comparison, T10 and T20 received a docking score of − 4.9357 and − 4.7410 for NEP and no pose for cACE.
Among the 1,3,4-Thiadiazole derivatives, TD7 displayed a docking score of − 5.4932 for cACE and − 5.2422 for NEP.TD104 also exhibited a strong binding affinity with a docking score of − 8.1032 for cACE and − 7.5842 for NEP.TD98 achieved an impressive docking score of − 9.4353 for cACE and − 8.7272 for NEP.TD101 had a minimal docking score of − 0.0854 for cACE and − 7.4954 for NEP.TD64 had a docking score of − 6.1358 for cACE and − 6.1683 for NEP, while D103 had no pose for cACE.TD107 also had no pose for either cACE or NEP.www.nature.com/scientificreports/Visualization of docking poses demonstrated the importance of chelation with the central Zn 2+ , formation of hydrogen (conventional), and hydrophobic (π-π stacking, π-alkyl, and alkyl) interaction required with key residues at S 1 and S 2 ' for the inhibition both cACE and NEP enzymes.From the docking results, the compounds with functional moiety forming the above interactions appeared to be an ideal scaffold to be dual inhibitors.The final docking poses and binding interactions of selected top ligands which have favorable binding energies and poses are illustrated in Figs. 10, 11, 12, 13.

Molecular docking analysis
Omapatrilate is a comprehensively experimented dual ACE/NEP inhibitor showing a binding energy of-5.2088kcal/mol against cACE because of the metallic interactions with Zn 2+ and hydrogen bonding interactions with Ala356 and Ser355 through an oxygen atom.Further, the sulfur atom and benzene ring enhanced stability by forming π-sulfur interactions with His513 (subsite S1′) and His353.It also shows alkyl, π-alkyl, and π-π stacked interactions with residue His387, Val518, Val380, Ala354, and His353 respectively.The interaction between the omapatrilat molecule and NEP is particularly stable (binding energy-5.2360kcal/mol), as seen in Fig. 10.The P1′ carbonyl group of omapatrilat forms a hydrogen bond interaction with Phe106.Although the seven-membered fused ring only partially reaches into the S2′ pocket, it nevertheless interacts hydrophobically with Val580, His583, and Ala543 residue.Both oxygen atoms of the Omapatrilat P2′ peptide bond form metallic interactions with Zn 2+ and conventional hydrogen bonds with His711 and Arg717 residues.
In the docking simulation between C115 and cACE, it forms three hydrogen bonds with Gln281 (S2' subsite), Tyr523, and Glu384 residues Fig. 11.It also forms attractive charge interactions (π-cation, π-anion, and others) with five amino acid residues including His353, Asp377, Glu162, Glu376, and Lys511.In addition, four residues (Ala354, Val380, Val379, and His383) were included in hydrophobic interactions (alkyl, π-alkyl, and π-π T-shaped interactions) to give stability to docked pose of the ligand (binding energy,-5.7009kcal/mol).The molecular interaction pattern of compound C115 showed docking interactions with the NEP enzyme accompanied by a docking score of − 7.1057 kcal/mol.It forms two hydrogen bonds with Glu584 and His583 similar to that of omapatrilate.Further, the stability of the protein-ligand complex, on one hand, is supported by the formation of charged interactions (π-cation, π-anion, and others) with Arg110, His711, Arg717, and Asp650.It also forms hydrophobic interaction with Phe106 (π-π T shaped), and Val580 (π-alkyl) through phenyl ring.
These hydrogen-bonding interactions and hydrophobic interactions between selected designed derivatives (C115, T3, and TD104) and catalytic amino acid [His353 and/or His513 (cACE) and His711 (NEP)] residues of cACE and NEP enzymes make a favorable orientation to interact with zinc ion through metal-chelator interactions.Our docking results indicated that C115, T3, and TD104 could inhibit both targets by inhibiting the active site rather than other secondary sites.Moreover, compound TD104 exhibited greater binding potential for the cACE and NEP compared to C115 and T3.
Further, the selected top ligands that form favorable interactions with targets were screened using ADMET studies.Additionally, molecular dynamics simulations of all selected ligand docking poses were run to verify the key residues (catalytic residue) from docking poses.

In silico ADME and toxicity studies of selected screened compounds
Prediction of Pharmacokinetic profile (ADME) parameters before experimental studies is among the most vital aspects of the drug design and discovery of drug molecules.These parameters along with toxicity predictions of the compounds are considered important attentive parameters during the transformation of a molecule into a potent drug.

Drug-likeness and ADME studies of the selected screened compounds
The drug-likeness capability of compounds can be prophesied using Lipinski, Ghose, Veber, Egan, and Muegge rules which are predicated on specific physicochemical parameters like logP (for oral in range of 1.35-1.8,sublingual > 5) tPSA (should be < 140 Å), no. of donors (< 10), acceptors (> 5), etc. 82 .The predicted drug-likeness, PAINS, and synthetic accessibility properties of the selected compounds are shown in Table 5.According to the results, the selected compounds in Table 5 showed no violations of the Lipinski and Ghose rules.However, compounds TD75 and TD104 are acceptable with only one violation according to Veber, Egan, and Muegge rules.All of the different compounds in the PAINS investigation were not exhibiting any alerts, except for compounds C93 and C229.
Table 6 presents the predicted ADME properties of these compounds, including gastrointestinal (GI) absorption, blood-brain barrier permeation (BBB), inhibition of the CYP 450 system, and permeability glycoprotein (P-gp) substrate.Furthermore, aqueous solubility and BBB values of the ligands preferably lie in the range of − 6.5-0.5 and − 3.0-1.2respectively 83,84 Also, p-glycoprotein (P-gp) non-substrate causes drug resistance 85 .
According to the results, all the compounds showed high GI absorption, except for compounds TD75 and TD106.BBB permeation potential was predicted for compounds C93 and C105.Compounds C115, C165, and C229 showed potential to be a substrate of P-gp.The potential to inhibit cytochrome P450 (CYP) isoforms was observed for compounds TD104 and TD106 (for 4 isoforms); T3 and T4 (for 3 isoforms); C97 and TD75 (for 2 isoforms); and C93 (for 1 isoform).Chalcone derivatives C105, C115, C146, C165, C191, and C229 were predicted to show no inhibitory activity against any of the CYP isoforms.The computed bioavailability score for all the compounds placed them within the 55-56% probability class, except for compound TD75 (0.11).
After evaluating drug-likeness, and ADME characteristics, and studying PAINS alerts, the compounds C93, C229, and TD75 have been excluded from consideration for further investigation.A total of 10 compounds are now undergoing additional toxicological studies, utilizing VEGA QSAR and ProTox-II software.

In silico toxicological results
Predicting the toxicological properties and pharmacokinetic parameters of a compound plays a crucial role in the drug discovery process, as they collectively contribute to 60% of the failures in converting a lead compound into an effective drug 86 .

Toxicological properties prediction using VEGA-QSAR models
To evaluate toxicological data, the QSAR modeling method was performed using VEGA-QSAR (https:// www.vegah ub.eu/ portf olio-item/ vega-qsar).The software-incorporated algorithm provides the evaluation of reliability prediction as Applicability domain index (ADI) value (Tables 7 and 8).It gives positive results with ADI > 0.5, as indicators of reliability effect; low (0.5 < ADI < 0.6), medium (0.6 < ADI < 0.8), and high (0.8 < ADI < 1).www.nature.com/scientificreports/All of the selected chalcone derivatives designed in this study were found to exhibit no developmental toxicity (PG model assessment) 87 , and inactive for estrogen and androgen-mediated effect (IRFMN/CERAPP model) 88,89 .Furthermore, they were confirmed to be non-reactive for Thyroid hormone receptor α/β (NRMEA model).All compounds have the potential for skin sensitivity, though the reliability of this prediction is low.Nevertheless, among them, only compounds C146, C165, and C115 were predicted to be non-mutagenic as indicated either by CAESAR or SarPy/IRFMN model assessment 90 and non-carcinogenic according to the CAESAR or ISS models assessment 91 (Table 7).
Similar to chalcone derivatives, skin sensitization predictions using the IRFMN/JRC model indicated sensitization potential in both thiazole and thiadiazole compounds, except for compound T3.Both thiazole derivatives (T3 and T4) were predicted to be non-mutagenic by the AMES toxicity (SarPy/IRFMN model), non-carcinogenic (ISS model), non-indicative of developmental/reproductive toxicity (PG model), and inactive for both androgen receptor-mediated effects and thyroid receptor effects (IRFMN/CERAPP and NRMEA models).The two selected thiadiazole derivatives and standard drug Omapatrilate pass all screening parameters when evaluated through the applied QSAR model and their assessment scores.However, skin sensitization predictions have indicated sensitization potential in these compounds.Additionally, hepatotoxicity predictions have raised concerns about their potential toxicity (Table 8).

GHS toxicity classification and prediction of LD50 using ProTox-II
The GHS toxicity categorization places thiazole in Class III and the designed derivatives of chalcone and thiadiazole, in Class V.This indicates that compounds may be harmful if swallowed (2000 < LD50 ≤ 5000) 92 .The LD 50 between 2000-5000 mg/kg indicates a safety range and values showed less potent toxic effects (Table 9).

Molecular dynamics and MM-GBSA results
MD simulation studies were carried out to understand the stability of protein-ligand interaction.As discussed earlier, the selected designed compounds with favorable screening properties of AD domain values of mt-QSAR models, docking score, and ADMET were selected for MD simulation studies.Omapatrilate was considered the standard multi-target inhibitor of ACE and NEP enzymes.Backbone RMSD analysis was evaluated (RMSD difference ≤ 2.0 Å) suggesting the stability of omapatrilate into the binding pocket of cACE (PDB ID: 1O80) and NEP (PDB ID: 5JMY) proteins.When the RMSD data were compared, each simulation including 20 ns revealed stable conformation.It was observed that Omapatrilate formed H-bond with various amino acid residues of cACE such as His353, Ala356, Glu384, Tyr523 and His410 whereas hydrophobic bond interaction with Trp357, Val380, His383, Phe457, Lys511, Phe512, His513, Val518, Tyr520 and Tyr523.The strong affinity of Omapatrilate with cACE was observed due to the formation of ionic interactions with amino acid residues His383, His387, His353, and Glu411.Additionally, it also forms salt bridge interaction with Glu143, His353, Ala354, Trp357, Glu403, and Pro519 with protein.
Similarly, MD simulation of the Omapatrilate-NEP complex reveals the formation of strong ionic interactions with Arg110, His583, Glu584, His587, Glu646 and hydrogen bonding interactions with Asn542, Ala542, Ala543, Glu584, Trp693, His711, Arg717 amino acid residues.The formation of a few hydrophobic interactions with Phe106, Ile558, Val580, and Val710 favored the stability of the complex.Apart from RMSD, the RMSF value of a protein is widely used to access ligand-induced changes in the protein's internal chains.Figures 14 and 15 show the RMSF plot of the Omapatrilate in complex with cACE and NEP enzymes, respectively.www.nature.com/scientificreports/Although hydrophobic and hydrogen bonds are weaker compared to ionic bonds, they are too exploited most for the design of new drug candidates 93,94 .Ligand interactions of compound C115 at different time intervals were analyzed and checked for stability which showed that the proteins got stabilized and the ligand was forming interaction with the protein (RMSD difference ≤ 2.5 Å).In the cACE domain, C115 was forming H-bond interactions with Gln281, Ala354, Asp377, Lys511, His513, and Tyr520 amino acid residues while hydrophobic interactions with His353, Ala354, Val380, and His513 amino acid residues.Moreover, the salt bridge interaction with Ala356 and Tyr523 gives additional stability to the complex.The RMSD and RMSF plots of protein-ligand and the ligand-protein contacts for compound C115-cACE and compound C115-NEP enzyme complexes are shown in Figs.16 and 17 respectively.On the other hand, compound C115-NEP complex stabilized by hydrogen bond formation with key amino acid residues Asn 542, and Ala543.Amino acid residues Phe106, Ile558, Val580, His583 and Trp 693 in the S1' subsites contribute to stability by forming important hydrophobic interactions similar to that of the standard omapatrilate.
On the other hand, compound T3 showed stable interactions throughout the simulation period (20.0 ns) which indicates the stability of the ligand in the binding site pocket of the protein (RMSD difference ≤ 2.0 Å).T3 was linked to amino acids like His383, Glu384, His387, and Glu411 by a metal ion coordinated within 3.4 Å of the protein's and ligand's atoms.Additionally, the selectivity towards the cACE domain is due to the formation of hydrogen bonding with Glu403, Asp415, Asp453, and Lys454 and hydrophobic interaction with Val380, His383, His513, and Tyr523 amino acid residues.The simulation interactions of the T3-NEP enzyme complex show that, the ligand occupies tightly in the S1 and S2 pocket of the enzyme through ionic interactions with Glu584, His587, and Glu646 amino acid residue.It also forms, an important hydrophobic interaction with catalytic residue His711 amino acid.Unlike, Ompatrilate the stability is contributed by the formation of multiple type contacts with His583, Val541, Ala543, and Tyr545 amino acid residue (Figs.18 and 19).
MD simulations trajectories revealed that compound TD104 was well stabilized (RMSD difference ≤ 2.0 Å) and made favorable metal ionic contacts and hydrogen bonding with an important catalytic site responsible for inhibition of cACE when compared to compound T3 over the entire simulation trajectory.Along with this Gln281, Ala354, Cys370, Asp377, and His513 amino acid residues play key roles in docked pose stability via H-bonding interaction.Also, hydrophobic contacts with Val380, His383, Lys511, Tyr523, and Phe512 are well maintained during the simulation trajectory.Correspondingly, compound TD104 shows the formation of hydrophobic interaction with Phe106, Trp693, and water bridge interaction with His711, and Asn542 in the catalytic domain suggesting its NEP enzyme inhibition probability.Furthermore, stability is provided by ionic interaction made by His583, His587, Glu584, Asp590, and Glu646 amino acid residues (Figs.20 and 21).Design compounds, C115, T3, and TD104 were found to form significant key interactions similar to omapatrilate towards the substrate binding pocket of cACE and NEP enzymes, and could probably be the novel lead molecules for multi-target inhibition against the target enzymes.
So, the binding potential scores derived from the results of constraint-based docking provide a strong foundation for the MD simulation trajectories.Inferring that the compounds C115, T3, and TD104 have a strong affinity for the cACE and NEP enzymes.Following Lipinski, Ghose, Veber, Egan, and Muegge's rule, all three compounds exhibited acceptable drug-likeness parameters.Furthermore, selected designed compounds (C115, T3, and TD104) do not have AMES mutagenicity and are Non-carcinogenic., no developmental/reproductive toxicity (PG model), and are found to be inactive for hormone receptors (estrogen, androgen, and thyroid α/β).All these parameters suggest compounds are safe to use.Hence, they could be considered for further synthesis and subsequent screening as promising scaffolds for the development of dual inhibitors targeting cACE and NEP enzymes, potentially contributing to the management of hypertension.
The study conducted MM-GBSA calculations to estimate ligand binding energies or affinity (dG Bind), with the results presented in Table 10.
Based on the MM-GBSA calculations, the ΔG bind values for molecules such as C115, T3, and TD104 were found to be − 16.   www.nature.com/scientificreports/Based on the two findings, TD104 and C115 exhibited the highest binding affinities for cACE, with C115 demonstrating strong binding with NEP.Hence, the designed compounds C115 and TD104 show promising potential as novel therapeutics and could serve as valuable leads for the synthesis of cardiovascular agents.

Conclusion
The study demonstrates the effectiveness of in-silico-based mt-QSAR modeling for recognizing structural prerequisites and identifying potential ligands against multiple biological targets under varied experimental conditions.The combinatorial library of three different scaffolds viz.Chalcone and its analogue, 1,3-Thiazole, and 1,3,4-Thiadiazole derivatives were designed and screened utilizing developed classification-based (LDA and RF) QSAR models to get probably potent ligands.Among these, Chalcone (85 compounds), 1,3-Thiazole (8 compounds), and 1,3,4-Thiadiazole (37 compounds) derivatives were identified as potentially effective and potent.Subsequently, molecular docking studies were conducted to understand their molecular interaction capabilities.Molecular simulations confirmed their interaction capabilities, with 13 compounds showing favorable binding scores.Furthermore, ADMET screening was performed on the selected screened molecule, revealing seven identified compounds (C165, C115, T3, T4, TD104, and TD106) that had favorable pharmacokinetic and toxicological profiles, paving the way for subsequent synthesis.The study also suggests that Chalcone and 1,3,4-Thiadiazole scaffolds hold promise for developing multi-targeted cardiovascular agents.Considering further, the research points to the future potential of Chalcone and 1,3,4-Thiadiazole derivatives in developing next-generation multi-target inhibitors.Modifying these scaffolds could lead to the creation of novel molecules with enhanced dual-system blocking activity.The insights gained from this study provide valuable guidance for prospective researchers and medicinal chemists, emphasizing the importance of target-specific binding assays and structure-activity relationship studies to unravel the mechanisms of action and key interactions for multitarget inhibition.

Figure 1 .
Figure 1.Sequential steps involved in the development of mt-QSAR models for dual endpoint detection (ACE and NEP enzyme inhibition).

Figure 4 .
Figure 4.An illustration of the metal chelation and positional constraints created and implemented for docking chemical compounds against cACE (PDB ID: 1o86) and NEP (PDB ID: 5JMY).

Figure 5 .
Figure 5. ROC (using tenfold cross-validation) plots for the best LDA model for dual inhibition of ACE and NEP enzymes.

Figure 6 .
Figure 6.Y-randomization test results for the developed LDA model for dual inhibition of ACE and NEP enzymes.

Figure 7 .
Figure 7. ROC (using tenfold cross-validation) plots for the best RF model.

Figure 8 .
Figure 8. Screening of designed compounds using developed mt-QSAR models (LDA and RF).Compounds found active through both models are only selected for molecular docking study.

Figure 9 .
Figure 9. Validation of docking protocol by re-docking the native ligands (Lisinopril and LBQ657) at active binding site and interacting amino acid residues (Magenta color original poses, green color redocked pose of lisinopril and cyan color redocked pose of LBQ657).

Figure 10 .
Figure 10.Schematic representation of 2D (a2 and b2) and 3D (a1 and b1) docking poses of standard drug omapatrilate against cACE and NEP target, binding to the catalytic region of the active sites via a chelation interaction with the zinc atom.

Figure 11 .Figure 12 .
Figure 11.Schematic representation of 2D (a2 and b2) and 3D (a1 and b1) docking poses of Chalcone derivative (C115) against cACE and NEP target, binding to the catalytic region of the active sites via a chelation interaction with the zinc atom.

Figure 13 .
Figure 13.Schematic representation of 2D (a2 and b2) and 3D (a1 and b1) docking poses of 1,3,4-thiadiazole derivative (TD104) against cACE and NEP target, binding to the catalytic region of the active sites via a chelation interaction with the zinc atom.

Figure 14 .
Figure 14.MD simulation analysis of Omapatrilate-cACE complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the cACE (d) RMSD of the protein backbone.

Figure 15 .
Figure 15.MD simulation analysis of Omapatrilate-NEP enzyme complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the NEP enzyme (d) RMSD of the protein backbone.

Figure 16 .
Figure 16.MD simulation analysis of compound C115-cACE complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the cACE (d) RMSD of the protein backbone.

Figure 17 .
Figure 17.MD simulation analysis of compound C115-NEP enzyme complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the NEP enzyme (d) RMSD of the protein backbone.

Figure 18 .
Figure 18.MD simulation analysis of compound T3-cACE complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the cACE (d) RMSD of the protein backbone.

Figure 19 .
Figure 19.MD simulation analysis of compound T3-NEP enzyme complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the NEP enzyme (d) RMSD of the protein backbone.

Figure 20 .
Figure 20.MD simulation analysis of compound TD104-cACE complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the cACE (d) RMSD of the protein backbone.
46, − 12.19, and − 27.76, respectively, for the cACE protein.These binding energies of T3 and C115 closely resemble those of Omapatrilate, indicating a strong binding affinity to the cACE enzyme.Additionally, the ΔG values against NEP were − 8.267, − 5.465, and − 5.574 for the compounds C115, T3, and TD104, respectively, which are in proximity to the ΔG values of Omapatrilate.

Figure 21 .
Figure 21.MD simulation analysis of compound TD104-NEP enzyme complex (a) Simulation interactions diagram (b) Protein-ligand contacts histogram (c) RMSF of the amino acids comprising the NEP enzyme (d) RMSD of the protein backbone.

Table 1 .
The best-fit mt-QSAR-LDA model (Standard coefficients and fitness scores) for dual inhibition of ACE and NEP enzymes.

Table 2 .
Symbols and definitions for the descriptors selected in the mt-QSAR (LDA) model for dual inhibition of ACE and NEP enzymes.
faccS4B_tnFrequency of occurrence of sulfur atoms exactly at four bonds from acceptor groups Circular fingerprint C_O_1B_tn Presence of carbon atoms at a distance of 1 bond i.e. attached to oxygen atoms Topological fNC7B_at Frequency of occurrence of carbon atoms exactly at seven bonds from nitrogen atoms Circular fingerprint fringCO4B_tn Frequency of occurrence of oxygen atoms exactly at four bonds from ring carbon atoms Circular fingerprint Vol.:(0123456789) Scientific Reports | (2024) 14:15991 | https://doi.org/10.1038/s41598-024-66230-7www.nature.com/scientificreports/mt-QSAR-LDA model to distinguish between active and inactive inhibitors.MCC values (0.7858 for sub-training, 0.6954 for test) further confirm the model's statistical robustness

Table 3 .
Overall statistical performance of the final mt-QSAR (LDA and RF) models for dual inhibition of ACE and NEP enzymes.

Table 5 .
Predicted drug-likeness, PAINS study, and synthetic accessibility measures of the screened compounds.

Table 6 .
Predicted pharmacokinetics (ADME) parameters of the screened compounds.

Table 7 .
Toxicological data of selected designed chalcone derivatives using VEGA-QSAR.

Table 10 .
MM-GBSA values for the selected designed compounds.