Anchovy's protein as a potential precursor of Angiotensin-I Converting Enzyme (ACE) inhibitory peptide and Dipeptidyl Peptidase-IV (DPP-IV) inhibitory peptide by an in silico approach

Protein from fish is known as a precursor of biologically active peptides that exert various health benefits, such as antihypertensive, antitumor and immunomodulatory properties. Hence, this study aimed to extract protein from anchovy, analyse it by LC-MS/MS and perform an in silico evaluation of the major proteins in anchovies as potential precursors of biologically active peptides, in addition to determining whether such peptides can be released by selected proteolytic enzymes. Anchovy was subjected to protein extraction followed by determination of the total soluble protein concentration using the Bradford assay. The sample was subjected to in-solution trypsin digestion and then analysed by liquid chromatography-mass spectrometry/mass spectrometry (LC-MS/MS). A bioinformatic approach using PEAKS Studio was used to identify the proteins. A total of four anchovy proteins, namely myosin light chain 1, V(D)J recombination-activating protein, 60 kDa chaperonin and heat shock protein 90AA1, were identified. The identified proteins were then subjected to in silico analysis using the BIOPEP database. The biological potential of the theoretically released angiotensin-I converting enzyme (ACE) inhibitory peptides and dipeptidyl peptidase-IV (DPP-IV) inhibitory peptides was predicted by determining the frequency of occurrence of fragments with a given activity. 60 kDa chaperonin and heat shock protein 90AA1 gave the highest predicted numbers of ACE inhibitory peptides (284 and 264 fragments) and DPP-IV inhibitory peptides (395 and 409 fragments). The most promising enzyme for the generation of bioactive peptides from anchovy protein was predicted to be pepsin (pH > 2), which theoretically released a high number of DPP-IV inhibitory peptides and ACE inhibitory peptides during in silico proteolysis. Overall, this work highlights that anchovy protein could be a promising precursor of bioactive peptides with ACE and DPP-IV inhibitory activities for developing functional food or nutraceutical products.


Introduction
Anchovies are pelagic fish that are widely found throughout the world's oceans. They are small fish belonging to the Engraulidae family, which includes 140 species in 16 genera (Thienchai and Chaiyanan, 2012). Encrasicholina and Stolephorus are two genera that are commonly found in Malaysian coastal waters (DOF, 2019). The main commercial species of anchovy in Malaysia are Encrasicholina heteroloba, Encrasicholina punctifer, Stolephorus commersonii and Stolephorus andhraensis (Froese and Pauly, 2018). This saltwater fish can be identified by its silvery blue-green back and can grow up to 20 cm (Shiriskar et al., 2010). In the oceans and main seas, such as the Atlantic, Indian and Pacific Oceans, anchovy populations can be found in most temperate and productive coastal areas (Thienchai and Chaiyanan, 2012; Checkley et al., 2017).
Malaysia's fisheries sector contributed approximately 1.7 million tonnes of fishery production in 2017, whereas pelagic marine fish contributed 556,342 tonnes of the Malaysian fishery capture in 2019. Anchovies account for 6% of the RM 2.4 billion in income generated by the Malaysian fisheries sub-sector (FAO, 2019). Fish is one of the main sources of dietary protein. In Southeast Asia, anchovies are added to a wide range of dishes, whether in fresh or dried form. In addition, anchovies are the main ingredient in fish sauce, a food product that is commonly eaten in the Southeast Asia region (Tanasupawat and Visessanguan, 2014).
The identification of bioactive peptides by conventional methods is tedious and costly, and leads to a lower yield of isolated bioactive peptides and a loss of potential bioactivities. Thus, computational simulation tools or in silico methods that can predict the release of a bioactive peptide can be used as an alternative to the traditional method (Pooja et al., 2017). Currently available online tools are Peptide Cutter (in silico peptide digestion tool), PeptideLocator and PeptideRanker (bioactive peptide prediction tools), mMass and BIOPEP (an online database that combines in silico digestion and bioactive peptide prediction) (Anekthanakul et al., 2018). In this paper, the BIOPEP database was selected as the online tool to predict the biologically active peptides in anchovies. BIOPEP is a database program that predicts the profiles of the potential biological activity of protein fragments as well as the bonds in the protein chain that are susceptible to hydrolysis by an enzyme (Minkiewicz et al., 2008).
A study by Darewicz et al. (2014) identified peptides with angiotensin I-converting enzyme (ACE) inhibitory activity in products of salmon protein hydrolysis/digestion using the BIOPEP database and a mass spectrometry (MS) approach. Panjaitan et al. (2018) conducted a study in which the major bioactive peptides predicted from dried giant grouper roe using proteomic analysis and an in silico approach were angiotensin-I converting enzyme (ACE-I) and dipeptidyl peptidase-IV (DPP-IV) inhibitory peptides. To the best of our knowledge, little information is available regarding the sequences of bioactive peptides released from anchovy proteins and the selection of optimal proteinases for producing different bioactive peptides. Hence, the study aimed to identify the major proteins contained in the selected anchovy using a proteomics approach based on LC-MS/MS, perform an in silico evaluation of the major proteins in anchovies as potential precursors of biologically active peptides (BIOPEP) and determine whether such peptides can be released by selected proteolytic enzymes. Such data can be useful in further research for deciphering the profile of protein sequences and could be helpful in the production of specific bioactive peptides with required biological functions.

Materials
Anchovies from the species Encrasicholina devisi were selected for this study. The samples were purchased at Pantai Tok Bali, Terengganu and transported to the laboratory. The anchovies were stored at -80°C until use. All chemicals used in this research were of analytical grade and electrophoresis grade. For further analysis, the anchovies were thawed and minced into smaller pieces beforehand.

Protein extraction
Protein from anchovy was extracted using a precipitation approach, namely the TCA/acetone precipitation extraction method described by Isaacson et al. (2006), with a slight modification. Approximately 2 g of anchovy was suspended in a buffer consisting of 50 mM Tris-HCl (pH 8.5), 100 mM KCl, 5 mM EDTA and 2% (w/v) mercaptoethanol. The solution was then homogenized for 1 min. This was followed by centrifuging the sample at 11,000 rpm for 15 min at 4°C. The supernatant was collected and precipitated with TCA/acetone overnight at -20°C, followed by centrifugation for 15 min at 4°C. The acetone was discarded and the pellet was re-suspended in TCA/acetone for 1 h at -20°C and centrifuged to collect the final pellet. The pellet was then air-dried at room temperature for 5 min and solubilized with solubilization buffer for further analysis.

Determination of total soluble protein concentration by Bradford assay
The Bradford assay was selected to determine the concentration of protein in the solution of extracted anchovy protein, as described by Bradford (1976) and modified by Shukla (2015). The blank, the protein standards and the solution of unknown concentration to be assayed were each mixed with the Bradford reagent. The procedure was performed at room temperature and the absorbance was measured at 595 nm. Then, a standard curve of absorbance vs. protein concentration of each standard was plotted. The concentration of the unknown protein sample was determined by comparing its net A595 value against the standard curve.
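The standard-curve step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard concentrations, the net A595 readings, the unknown reading and the dilution factor are hypothetical values chosen for the example, not data from this study, and a simple linear fit is assumed to hold over the range of the standards.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(net_a595, slope, intercept, dilution_factor=1.0):
    """Read a concentration off the standard curve and undo any sample dilution."""
    return (net_a595 - intercept) / slope * dilution_factor


# Hypothetical protein standards (mg/mL) and their blank-corrected A595 readings.
standards_mg_per_ml = [0.0, 0.25, 0.5, 0.75, 1.0]
standards_net_a595 = [0.00, 0.15, 0.30, 0.45, 0.60]

slope, intercept = fit_line(standards_mg_per_ml, standards_net_a595)

# Hypothetical unknown: net A595 of 0.36 measured on a 20-fold diluted extract.
unknown_mg_per_ml = concentration_from_absorbance(0.36, slope, intercept, dilution_factor=20)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, unknown={unknown_mg_per_ml:.2f} mg/mL")
```

Readings that fall outside the range of the standards should be re-measured at a different dilution rather than extrapolated from the fitted line.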

In-solution trypsin digestion
In-solution trypsin digestion was done according to Kwan and Ismail (2018), where the protein samples were evaporated and re-suspended in 6 M urea, 100 mM Tris buffer at 10 mg/mL. Dithiothreitol (DTT, 200 mM) was added to each fraction and incubated at room temperature for 1 h. After that, 200 mM iodoacetamide was added and kept at room temperature for 1 h, followed by 20 µL of 200 mM DTT again. Next, 775 µL of water was added to dilute the samples. Digestion was carried out by introducing 20 µg of bovine trypsin to each sample and incubating overnight at 37°C. The digestion was stopped the next day by adjusting the pH of the buffer to <6 using concentrated acetic acid. The digested samples were then concentrated to less than 20 µL each.

LC-MS/MS analysis
Liquid chromatography-mass spectrometry/mass spectrometry (LC-MS/MS) analysis was performed according to Kwan et al. (2016). For this analysis, an LTQ-Orbitrap Velos Pro mass spectrometer coupled with an Easy-nLC II nano liquid chromatography system was used. An aliquot of 100 µL of 0.1% formic acid in deionized water was added to each of the peptide samples, which were then filtered using a 0.45 μm regenerated cellulose (RC) membrane syringe filter (Sartorius AG, Goettingen, Germany). An Easy Column C18 (100 mm, 0.75 mm i.d., 3 μm; Thermo Scientific, USA) was used as the analytical column, while an Easy Column C18 (20 mm, 0.1 mm, 5 μm; Thermo Scientific, USA) was used as the pre-column at a flow rate of 3 μL/min for 15 μL. The analytical column was equilibrated at a flow rate of 0.3 μL/min for 4 μL. Then, 3 μL of the prepared vial samples was injected at a flow rate of 0.3 μL/min. Running buffers of 0.1% formic acid in deionized water and 0.1% formic acid in acetonitrile were used. The sample eluent was sprayed into the mass spectrometer at a capillary temperature of 220°C and a source voltage of 2.1 kV. A full-scan mass analysis from m/z 300-2000 was used to detect proteins and peptides at a resolving power of 60,000, with a data-dependent MS/MS analysis (ITMS) triggered by the eight most abundant ions from a parent mass list of predicted peptides, with rejection of unassigned charge states. The fragmentation technique used was collision-induced dissociation (CID) with a collision energy of 35. Each sample was analysed in replicates.

LC-MS/MS data analysis using PEAKS Studio
The LC-MS/MS raw data were then analysed using PEAKS Studio software (Bioinfor Inc., CA, USA) to produce predicted peptide sequences according to Kwan et al. (2016). The de novo sequencing and database matching analysis was done with PEAKS Studio Version 7.5 (Bioinformatics Solution, Waterloo, Canada). The database used for matching was the UniProt database (UniProt Consortium, 2019). Carbamidomethylation and methionine oxidation were set as fixed modifications and the maximum number of missed cleavages was set at 2. A tolerance of 0.1 Da was set for the parent mass and precursor mass. For protein acceptance, a false discovery rate (FDR) <0.1% and a significance score (-10lgP) for protein >20 were used. The minimum number of unique peptides and the maximum number of variable post-translational modifications (PTMs) were set at 1 and 4, respectively. The fixed PTMs chosen in the parameters were phosphorylation, methylation, oxidation, hydroxylation, biotinylation, acetylation, sulfation, amidation, myristoylation, carboxylation, ubiquitination and farnesylation, while the average local confidence (ALC) was set at >15%.
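The protein acceptance criteria above can be illustrated with a short Python sketch that filters an exported list of protein hits. Everything here is assumed for illustration: the `ProteinHit` record, its field names and the four example hits are hypothetical and do not reflect the PEAKS Studio export format or the results of this study. The FDR threshold is applied inside PEAKS Studio and is not reproduced here; only the score and unique-peptide cut-offs are shown.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    """Hypothetical record for one protein identification."""
    accession: str
    score: float            # -10lgP significance score
    unique_peptides: int
    coverage_percent: float


def is_accepted(hit, min_score=20.0, min_unique_peptides=1):
    """Apply the score (> 20) and unique-peptide (>= 1) cut-offs described in the text."""
    return hit.score > min_score and hit.unique_peptides >= min_unique_peptides


# Hypothetical hits; accessions and numbers are placeholders.
hits = [
    ProteinHit("HYPOTHETICAL_1", 85.3, 3, 21.0),
    ProteinHit("HYPOTHETICAL_2", 18.9, 2, 4.0),   # rejected: score too low
    ProteinHit("HYPOTHETICAL_3", 42.0, 0, 3.0),   # rejected: no unique peptide
    ProteinHit("HYPOTHETICAL_4", 27.5, 1, 2.0),
]

# Keep accepted hits and rank them by confidence, highest score first.
accepted = sorted((h for h in hits if is_accepted(h)), key=lambda h: h.score, reverse=True)

for hit in accepted:
    print(hit.accession, hit.score, hit.coverage_percent)
```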

In silico assessment of anchovy proteins using BIOPEP database

Potential biological activity profile
The protein sequences of the identified proteins and their accession IDs were obtained from the UniProt database (http://www.uniprot.org/) in FASTA format. The BIOPEP database (http://www.uwm.edu.pl/biochemia/index.php/en/biopep) was used to predict the profiles of potential biological activities of the selected anchovy proteins using the "profiles of potential biological activity" option in the database (Minkiewicz et al., 2008). Through this option, the BIOPEP ID, name of peptides, potential activity of the peptide, number of peptides and location of bioactive peptides in the protein sequences were acquired. Meanwhile, the frequency of occurrence of fragments with a given activity (A) in the selected protein was taken as the evaluation parameter and calculated based on the equation:

$$A = \frac{a}{N}$$

where a = number of bioactive peptides and N = total number of amino acid (AA) residues in the protein chain. In addition, the total frequency of bioactive fragments (∑A) in each of the four sequences was also calculated.
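As an illustration of how the parameter A relates a fragment count to sequence length, the following Python sketch reads a sequence from FASTA text, counts every occurrence of a set of fragments and divides by the number of residues. It is a minimal sketch, not the BIOPEP implementation: the FASTA record and the fragment list are hypothetical placeholders, and counting overlapping occurrences is an assumption made here for the example.

```python
def read_fasta_sequence(fasta_text):
    """Return the residues of a single-record FASTA string, skipping the '>' header."""
    lines = [line.strip() for line in fasta_text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def count_occurrences(sequence, fragment):
    """Count occurrences of fragment in sequence, including overlapping ones."""
    count = 0
    start = sequence.find(fragment)
    while start != -1:
        count += 1
        start = sequence.find(fragment, start + 1)
    return count


def frequency_of_occurrence(sequence, fragments):
    """Return (a, A) where a is the fragment count and A = a / N."""
    a = sum(count_occurrences(sequence, fragment) for fragment in fragments)
    return a, a / len(sequence)


# Hypothetical FASTA record and placeholder fragments (not taken from BIOPEP).
toy_fasta = """>toy_protein hypothetical example
MKVLAAGIVP
LKAAPWGG
"""
placeholder_fragments = ["AA", "VP", "GG", "LKA"]

sequence = read_fasta_sequence(toy_fasta)
a, A = frequency_of_occurrence(sequence, placeholder_fragments)
print(f"N={len(sequence)}, a={a}, A={A:.4f}")
```

In this toy case the 18-residue sequence contains five matching fragments, so A is 5/18. For a real protein, the fragment list would be the set of peptides with one given activity, so that one A value is obtained per activity.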

In silico proteolysis
The sequences of anchovy proteins were subjected to in silico proteolysis using BIOPEP's 'enzyme/s action' tool. A total of 33 enzymes were selected for in silico proteolysis: chymotrypsin A, trypsin, pepsin (pH 1.3), proteinase K, pancreatic elastase, oligopeptidase V-8, protease (pH 4), thermolysin, chymotrypsin C, plasmin, cathepsin, clostripain, chymase, papain, ficin, leukocyte elastase, metridin, thrombin, pancreatic elastase II, stem bromelain, glutamyl endopeptidase II, oligopeptidase B, calpain 2, glycyl endopeptidase, oligopeptidase F, proteinase P1, Xaa-Pro dipeptidase, pepsin (pH>2), coccolysin, subtilisin, chymosin, ginger protease and V-8 protease (pH 7.8). The predicted degree of hydrolysis (DH %) for each enzyme used was determined. The efficiency of release of bioactive fragments was evaluated by the frequency of release of peptides with a given activity by a selected enzyme (A_E) and by the relative frequency of release of peptides with a given activity by a selected enzyme (W):

$$A_E = \frac{d}{N}$$

$$W = \frac{A_E}{A}$$

where d is the number of fragments with a given activity in the protein sequence that may be released by the enzyme, N is the number of amino acid residues in the protein chain, and A is the frequency of occurrence of fragments with the given activity in the protein sequence.
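The relationship between in silico cleavage and the release parameters can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the cleavage rule (cutting after every K or L), the toy sequence and the set of active fragments are hypothetical, and the real cleavage specificities stored in BIOPEP are more detailed than a single set of residues. A_E is taken as d/N and W as A_E/A, where A is the frequency of occurrence of active fragments in the intact sequence.

```python
def digest(sequence, cleave_after):
    """Cut the chain on the C-terminal side of every residue in cleave_after."""
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue in cleave_after:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


def count_occurrences(sequence, fragment):
    """Count occurrences of fragment in sequence, including overlapping ones."""
    count = 0
    start = sequence.find(fragment)
    while start != -1:
        count += 1
        start = sequence.find(fragment, start + 1)
    return count


def release_parameters(sequence, cleave_after, active_fragments):
    """Return (d, A_E, W) for one protein, one enzyme rule and one activity."""
    n = len(sequence)
    a = sum(count_occurrences(sequence, f) for f in active_fragments)
    released = [f for f in digest(sequence, cleave_after) if f in active_fragments]
    d = len(released)
    a_e = d / n                            # frequency of release
    w = a_e / (a / n) if a else 0.0        # relative frequency of release
    return d, a_e, w


# Hypothetical inputs: toy sequence, toy cleavage rule, placeholder active fragments.
toy_sequence = "MKVLAAGIVPLKAAPWGG"
toy_rule = {"K", "L"}
toy_active = {"MK", "VL", "AA"}

fragments = digest(toy_sequence, toy_rule)
d, a_e, w = release_parameters(toy_sequence, toy_rule, toy_active)
print(fragments)
print(f"d={d}, A_E={a_e:.4f}, W={w:.2f}")
```

Here the intact toy sequence contains four active fragments, but only two of them are cut out as free peptides, so W is 0.5. The fragment AA illustrates the point of the W parameter: it occurs twice in the sequence yet is never released by this rule because it stays embedded in longer fragments.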

Total soluble protein concentration and protein yield of selected anchovy sample.
Table 1 shows the total soluble protein concentration and yield of the anchovy sample (Encrasicholina devisi). The total soluble protein concentration of the anchovy determined by the Bradford assay was 18.44±0.73 mg/mL, whereas the protein yield from the extraction was 20.28 mg/g. The total soluble protein concentration of the anchovy was relatively low compared to that reported in a previous study by Ceruso et al. (2015), where the total soluble protein concentration obtained for M. galloprovincialis was 23.33 mg/mL. This may be due to poor compatibility between the extraction method and the anchovy sample. Thus, future research can focus on improving the extraction technique to increase the yield of protein extraction.

Protein identification through LC-MS/MS analysis and data analysis using PEAKS Studio
Protein identification using LC-MS/MS and PEAKS Studio identified 97 proteins in total, based on the registered sequences in the UniProtKB database. Table 2 shows a list of 20 of the 97 proteins, ranked by the confidence of the results according to the -10lgP from the LC-MS/MS analysis and the coverage (%) of the matched proteins in the database. Several major proteins were identified: myosin light chain had the highest coverage (8-21%), followed by V(D)J recombination-activating protein (6-10%), HSP90AA1 (3%), 60 kDa chaperonin (2%) and rhodopsin (1%).
The identification of myosin light chain with the highest confidence agrees with previous studies, which showed that myofibrillar proteins, consisting of actin and myosin, make up to 77% of total protein in anchovy (Stolephorus sp.) (Dewi, 2002). Choi et al. (2004) also demonstrated the presence of myosin heavy chain, along with actin and tropomyosin, in the myofibrillar protein of anchovy (Engraulis japonicus) using SDS-PAGE. The proteins were also identified from various species, including Engraulis sp., Amazonsprattus sp., Thryssa sp., Coilia sp., Anchoa sp. and Anchoviella sp.
However, among the identified proteins, only four major proteins were chosen for in silico evaluation (Table 3). The selected proteins were myosin light chain 1 (accession ID: Q9IB21), V(D)J recombination-activating protein 1 (accession ID: A0A2R4GAA0), 60 kDa chaperonin (accession ID: A0A2H4PU29) and heat shock protein 90AA1 (accession ID: A0A3G1CVR4). These proteins were selected based on their role as components of muscle and structural protein in the anchovy body. For example, myosin is known as the myofibrillar protein that constitutes 65-75% of total muscle protein in fish, along with other minor proteins (Medina and Pazos, 2010).

Table 1. Total soluble protein concentration and protein yield of anchovy sample extracted using TCA/Acetone precipitation method
Sample | Protein concentration (mg/mL) | Protein yield (mg/g)
Encrasicholina devisi | 18.44±0.73 | 20.28

In addition, the selection of the proteins for in silico evaluation was based on their molecular mass, with higher masses preferred. This preference was applied because proteins with a higher molecular mass probably have a higher probability of producing bioactive peptides after enzyme treatment than proteins with a lower molecular mass. Table 3 shows the protein ID, protein group, percentage of coverage, average mass and description of the selected proteins. A similar study was done by Huang et al. (2015), where a total of 7 tilapia proteins were identified by proteomic techniques, with LC-MS/MS analysis revealing structural proteins from tilapia co-products.
Myosin light chain 1 (MLC1), also known as myosin essential light chain, is a vital structural component of the actomyosin cross-bridge that also helps with muscle contraction and force development (Hernandez et al., 2021). There are two main types of essential light chains, the short and long isoforms. The N-terminus of the latter isoform contains an extended sequence that comprises repeats of Pro and Ala residues, as well as positively charged amino acids (Nieznańska et al., 2002). A study by Darewicz et al. (2016) showed that, after in silico evaluation, ACE inhibitory and antioxidant peptides were the most predominant fragments in another subunit of myosin, the myosin heavy chain.

Profile of the selected protein
The RAG complex is a multi-protein complex that mediates the DNA cleavage process during V(D)J recombination and comprises a catalytic component. V(D)J recombination assembles a wide repertoire of immunoglobulin and T-cell receptor genes during the development of B and T lymphocytes via rearrangement of various V (variable), D (diversity) and J (joining) gene segments (UniProt Consortium, 2019). The adaptive immune response depends heavily on V(D)J recombination. The human immune system is weakened when V(D)J recombination is absent, and chromosomal translocations and B- and T-cell malignancies arise when it is not appropriately regulated (Market and Papavasiliou, 2003).
Moreover, 60 kDa chaperonins are a ubiquitous family of sequence-related molecular chaperones that consist of oligomeric proteins with a subunit mass of about 60 kDa and are required for protein folding in both normal and stressed situations (Levy-Rimler et al., 2001). The 60 kDa chaperonin is also called 60 kDa heat shock protein and is usually found in eubacteria, mitochondria and chloroplasts. It has the ability to act as an intercellular signal in various biological processes such as immunity and inflammation (Maguire et al., 2002; UniProt Consortium, 2019).
Heat shock proteins (Hsps), sometimes referred to as stress proteins or extrinsic chaperones, are a group of highly conserved proteins found in all living organisms (Shi et al., 2016). Heat shock protein 90 (HSP90) is an important part of the protective mechanism of the stress response. HSP90s aid signal transmission, cell cycle control, genomic silencing and protein trafficking (Park and Kwak, 2014). Protein folding, repair of damaged proteins, transport of nascent proteins and immune presentation are a few of the biological roles of heat shock proteins (Shi et al., 2016). Heat shock protein 90 has been reported in the liver, ovary, muscle, brain and stomach tissue of Penaeus monodon, the black tiger shrimp (Jiang et al., 2009).

Profile of the potential biological activity and A parameter of selected anchovy proteins
The profile of potential biological activity was determined for each selected protein (Table 5). This profile is a parameter for the quantitative assessment of protein sequences as precursors of bioactive peptides (Iwaniak et al., 2005). The numbers of bioactive peptides predicted to be released from the proteins were generated by the BIOPEP database. As of 27 June 2021, 4325 peptides covering 56 bioactivities had been collected in the database. The biological activities presented in Table 5 include ACE inhibitor, DPP-IV inhibitor, DPP-III inhibitor, antioxidative, neuropeptide, regulating, antithrombotic, bacterial permease ligand, inhibitor, renin inhibitor, stimulating, activating ubiquitin-mediated proteolysis, immunostimulating, antiamnestic, anti-inflammatory, hypolipidemic, alpha-glucosidase inhibitor, CaMPDE inhibitor, HMG-CoA reductase inhibitor, embryotoxic, immunomodulating and celiac toxic.
A total of 22 predicted biological activities were found in the selected anchovy proteins. Among them, 60 kDa chaperonin had the highest number of predicted biological activities (21), whereas myosin light chain 1 had the lowest (14). 60 kDa chaperonin and heat shock protein 90AA1 generated the highest predicted numbers of ACE inhibitory peptides (284 and 264 fragments) and DPP-IV inhibitory peptides (395 and 409 fragments), respectively. Meanwhile, myosin light chain 1 turned out to be the poorest source of ACE inhibitory peptides and DPP-IV inhibitory peptides, with 93 and 125 fragments, respectively. According to Minkiewicz et al. (2011), the frequency of occurrence of bioactive peptides (A) indicates the potential bioactivity of the protein: the higher the value of parameter A, the higher the probability that bioactive peptides will be released.
In the case of anchovy proteins, DPP-IV inhibitory peptides had the highest A values (0.5647 to 0.7017), followed by ACE inhibitory peptides (0.3639 to 0.4931), DPP-III inhibitory peptides (0.0590 to 0.0944) and antioxidative peptides (0.0513 to 0.0855).
DPP-IV and ACE inhibitory peptides therefore had the two dominant frequencies of occurrence among the bioactive peptides. These theoretical values suggest that anchovy protein has a high potential as a source of DPP-IV inhibitory and ACE inhibitory peptides. Among the selected proteins, myosin light chain 1 and 60 kDa chaperonin generated the highest A values for DPP-IV inhibitory peptides and ACE inhibitory peptides, as shown in Table 6. The results in Tables 5 and 6 show that the profile of biological activities of the proteins and the frequency of occurrence of fragments with a given activity support the potential of anchovy as a source of bioactive peptides.

In silico proteolysis of anchovy protein for the production of peptides
The enzyme action tool of the BIOPEP database allows the prediction of bioactive peptides released by the action of different enzymes. Table 7 summarizes the peptide fragments that were released during simulated proteolysis of anchovy protein. The protein sequences were theoretically cleaved by 33 proteinases, but only 5 proteinases showed the ability to generate relatively numerous ACE inhibitory peptides and DPP-IV inhibitory peptides. Simulated proteolysis of the four selected proteins using five proteases (ficin, calpain 2, pancreatic elastase, stem bromelain and pepsin (pH>2)) generated between 9 and 51 dipeptide and tripeptide fragments with ACE inhibitory activity. For DPP-IV inhibitory peptides, between 9 and 79 dipeptide and tripeptide fragments were produced. Among the five proteases, pepsin (pH>2) was predicted to release the highest number of ACE inhibitory peptides and DPP-IV inhibitory peptides.
A previous study by Huang et al. (2015) theoretically determined that protein isolated from tilapia skin and frame generated abundant ACE inhibitory peptides through BIOPEP database analysis. The finding was consistent with Lin et al. (2017), in which tilapia by-products hydrolyzed by pepsin in vitro exhibited strong ACE inhibitory activities. According to Shevchenko et al. (2007), the enzyme trypsin is commonly used for the release of bioactive peptides in protein identification, while pepsin (pH>2) was able to hydrolyze the proteins to release a higher amount of bioactive peptides compared to other enzymes (Panjaitan et al., 2018).
As shown in Table 8, the degree of hydrolysis (DH%) for the anchovy proteins that generated ACE inhibitory peptides and DPP-IV inhibitory peptides was between 35.0515% and 77.83576%. Among the five enzymes, pepsin (pH>2) gave the highest DH value for the four anchovy proteins. This indicates the high hydrolytic ability of pepsin (pH>2) compared to the other enzymes.
The two most critical parameters that can be used to determine the possibility of bioactive peptides being generated from protein sequences by proteases are the A_E and W values (Minkiewicz et al., 2011). Based on the A_E and W values of each protein, pepsin (pH>2) had the highest values (A_E between 0.0674 and 0.885, and W between 0.1478 and 0.2010). Parameter W shows the probability of fragments with a specific activity being released from a protein with a specified frequency of occurrence. A higher value of parameter W indicates a higher probability and "effectiveness" of the release of bioactive fragments from intact protein sequences (Borawska-Dziadkiewicz et al., 2021).

Conclusion
The combination of a proteomic technique and an in silico approach can identify proteins from anchovy through LC-MS/MS and predict the potential bioactive peptides from these protein sequences using the BIOPEP database. Moreover, through in silico hydrolysis, the release of fragments with a given activity by a suitable protease can be predicted. Among the proteins, 60 kDa chaperonin and heat shock protein 90AA1 were the highest potential sources of bioactive fragments exhibiting all types of activity, specifically ACE inhibitory and DPP-IV inhibitory peptides. Meanwhile, pepsin (pH>2) released the highest number of ACE inhibitory peptides and DPP-IV inhibitory peptides compared to other proteases. This study provides a database and guidelines for further research. Therefore, coupled with appropriate techniques of protein extraction and identification, these anchovy proteins can be developed into high value-added products or bioactive peptide ingredients that can be used in the food, cosmetic and biomedical industries.
Data from the BIOPEP database were imported into Microsoft Excel 2016, where the predicted bioactive peptides for the identified proteins were summarized in Table 6. Table 4 lists the identified proteins with their name, accession number, sequences, amino acid residues and molecular mass.

Table 3. Selected proteins identified from LC-MS/MS analysis and PEAKS Studio database. OS = Organism name, OX = Organism identifier, GN = Gene name, PE = Protein existence, SV = Sequence version.

Table 4. Name, accession number, sequences, amino acid residue and molecular mass of anchovy protein used in in silico analysis.

Table 8. The values of parameters describing the predicted efficiency of release of bioactive fragments from selected anchovy proteins by in silico hydrolysis.

https://doi.org/10.26656/fr.2017.7(2).792 © 2023 The Authors. Published by Rynnye Lyan Resources