Morphometric Analysis and Genetic Relationship of Rasbora spp. in Sarawak, Malaysia

The genus Rasbora is one of the most species-rich genera among freshwater fishes, and cryptic diversity has been a major hindrance to species identification over the past four decades because its members are morphologically very similar. This study aimed to investigate this issue both morphologically and molecularly. A total of 23 morphometric parameters were used to differentiate 103 Rasbora fish samples collected from different regions of Sarawak, Malaysia, via Multivariate Stepwise Discriminant Function Analysis (SDFA). The cytochrome c oxidase subunit I (COI) gene was then used to further distinguish 33 of these fishes, followed by sequence and phylogenetic analysis. Our results identified pre-anal length as the strongest morphometric discriminant (100%) and showed that all eight Rasbora species tested are monophyletic except for R. sumatrana and R. caudimaculata, revealing possible cryptic Rasbora species. Further investigations are vital to enrich the data from this study for future Rasbora cryptic diversity and conservation studies.


INTRODUCTION
Rasbora fishes belong to the family Cyprinidae; they are small to moderate in size and inhabit the Asian region. Rasbora is the most species-rich genus in the cyprinid subfamily Danioninae (Lumbantobing 2014). Currently, a total of 150 Rasbora species have been described, of which 39 are distributed across Borneo and six across Sarawak (Fricke et al. 2019). Brittan (1984) identified eight Rasbora species groups, namely R. trifasciata, R. argyrotaenia, R. einthovenii, R. daniconius, R. lateristriata, R. caudimaculata, R. sumatrana-elegans and R. pauciperforata, each comprising several closely related species that share obvious features and are evolutionarily linked. Taxonomically, Rasbora is known as a catch-all group because each species has too few unique diagnostic characters and is difficult to characterise morphologically (Muchlisin et al. 2012). This explains why the genus exhibits cryptic diversity, a term describing two or more species that are morphologically indistinguishable and are therefore classified as a single species.
According to the International Union for Conservation of Nature (IUCN) Red List of Threatened Species, ten Rasbora species are currently categorised as "Near Threatened" and above, with two labelled critically endangered, two labelled endangered, four labelled vulnerable and the remaining four labelled near threatened (IUCN 2019). These fishes inhabit river streams, waterfalls and peat swamps, and are deemed important contributors to the diversity of the peat swamp ecosystem (Sule et al. 2018). However, they are mostly threatened by residential and commercial development, natural system modifications, agriculture and aquaculture, invasive and other problematic species, genes and diseases, energy production and mining, as well as pollution (IUCN 2019). For instance, R. tawarensis, one of the critically endangered Rasbora species on the IUCN Red List, is found only in Lake Laut Tawar, Aceh, Indonesia, and is severely affected by fishing and pollution (Lumbantobing 2019). Conservation measures such as gill net size regulation and bans on pesticides and chemical fertilisers have been implemented (Lumbantobing 2019). In Sarawak, Malaysia, an investigation of water quality downstream of the Bakun Dam revealed its impact on fish diversity; R. caudimaculata, R. borneensis, R. dusonensis and R. volzi are among the many fishes affected by poor water quality (Liew 2016). As of 2012, Sarawak was home to 35 endemic species (23.8% of the total found in Borneo) from the family Cyprinidae. At least 16 Rasbora species are endemic to Borneo, and 14 of them are distributed across the Rajang River in Sarawak and Danau Sentarum National Park (Sulaiman & Mayden 2012). This underlines the urgent need to initiate conservation efforts to preserve Rasbora fishes, especially in Sarawak, where awareness is low and research attention is lacking.
Therefore, accurate diagnosis of cryptic species complexes such as those of Rasbora fishes is crucial, given the steadily increasing destruction and disturbance of natural ecosystems that lead to species extinction. Furthermore, research that puts the genus Rasbora in the limelight is still in its infancy, with studies focusing on the ATP-binding cassette transporter gene family in R. sarawakensis (Lim et al. 2018), whole mitogenome sequencing (Miya 2009; Ho et al. 2014; Zhang et al. 2014; Kusuma et al. 2017; Chung et al. 2020; Lim et al. 2019) and ecotoxicology (Wijeyaratne & Pathiratne 2006). In addition, border control of economic activity requires precise determination of cryptic species in order to address invasive species issues.
Morphological analysis has been a major approach to species characterisation; however, it becomes challenging when distinguishing species complexes (groups of closely related species). The challenges include difficulty in recognising juvenile specimens (as morphological analysis depends on life stage and sex) and the overlooking of morphologically cryptic taxa (as the analysis relies on phenotypic characteristics). This raises questions about the accuracy of future species discoveries and complicates the identification of cryptic species. Morphological analysis, when combined with genetic analysis, is powerful in distinguishing sibling species (Rius & Teske 2013). This combination was used in the study of cryptic diversity in Pacifastacus leniusculus by Larson et al. (2012). The objectives of this study are to evaluate the cryptic diversity of Rasbora fishes from Sarawak via morphological and molecular approaches and to resolve their phylogenetic relationships with Rasbora fishes from other regions. Two approaches were used: morphometric analysis and analysis of the cytochrome c oxidase I (COI) gene, the latter to infer genetic relationships that complement the morphological identification. A total of 23 morphometric measurements were recorded, emulating those of Lumbantobing (2014), to determine statistically significant differences in the characteristics able to discriminate selected Rasbora species. In addition, phylogenetic analysis was conducted to resolve the relationships between the Rasbora fishes collected from Sarawak and those from other regions.

MATERIALS AND METHODS
Sample Collection
The sampling locations are listed in Table 1. The sites were selected randomly, subject to their accessibility by car and boat. To mark the location of each sampling site, the geographical coordinates were recorded using a global positioning system (GPS). Sampling was conducted under a permit issued by the Sarawak Forestry Department (NCCD.94047(Jld13)-178). Altogether, 103 Rasbora fishes were caught across all 12 targeted locations in Sarawak, Malaysia. Some of these were caught in slow-moving drains and in small ponds with leaves covering the water surface. Cast nets, scoop nets and fish traps were used to catch the Rasbora fishes. All samples were held in a portable fish bucket with oxygen supplied by a portable air pump. The samples were identified in the field whenever possible. The standard length and total length of each specimen were measured at the sampling site in millimetres (mm). After being anaesthetised with tricaine solution, the samples were preserved in 95% ethanol and brought to the laboratory for proper identification. Adult fishes were humanely sacrificed using Tricaine™ as the anaesthetic, with permission from the Universiti Malaysia Sarawak Animal Ethics Committee (reference number: UNIMAS/TNC (PI)-04.01/06-09(17)).

Morphometric Analysis
A total of 23 morphometric characteristics, emulating those of Lumbantobing (2014), were measured. The measurements were adjusted (1989) to avoid calculation bias, as the specimens collected varied widely in size and body length. Analysis of variance (ANOVA) was conducted to determine which of the measured characteristics were significant variables in the morphometric data. The data used for ANOVA were the original measurements. The adjusted measurement data were then subjected to Multivariate Stepwise Discriminant Function Analysis (SDFA).
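As a rough illustration of the ANOVA screening step, the sketch below computes a one-way ANOVA F statistic for a single morphometric variable across species groups. It is not the authors' procedure or software; the function name `one_way_anova_f` and the example lengths are hypothetical and are not data from this study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of measurement groups.

    Each group is a list of values of one morphometric variable for one species.
    """
    k = len(groups)                      # number of groups (species)
    n = sum(len(g) for g in groups)      # total number of specimens
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual specimens around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical pre-anal lengths (mm) for three species; illustrative only.
example_groups = [
    [31.0, 32.0, 33.0],
    [32.0, 33.0, 34.0],
    [35.0, 36.0, 37.0],
]
f_value = one_way_anova_f(example_groups)
print(f"F = {f_value:.2f}")
```

A large F value relative to the F distribution with (k - 1, n - k) degrees of freedom would mark the variable as differing significantly among species; the p-value lookup is left out of this sketch.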

Genomic DNA Extraction and PCR
Muscle tissues were isolated from a total of 33 adult fishes and stored in 95% ethanol. Genomic DNA was extracted using the CTAB method (Chung 2018). Using the primer pair designed by Ward et al. (2005), namely F1 (5'TCAACCAACCACAAAGACATTGGCAC 3') and R1 (5'TAGACTTCTGGGTGGCCAAAGAATCA 3'), a COI gene fragment of approximately 655 bp was amplified via polymerase chain reaction (PCR) in a Bio-Rad T-100 Thermal Cycler. Each 20 µL reaction contained 1.6 µL of 0.2 mM dNTPs, 2.0 µL of 1X TransTaq buffer, 0.4 µL each of 0.2 µM forward and reverse primers, 0.2 µL of 2.5 units of TransTaq DNA polymerase and 2.0 µL of genomic DNA extract. The cycling conditions were as follows: initial denaturation at 95°C for 2 min, followed by 35 cycles of amplification at 94°C for 30 s, 59.7°C for 30 s and 72°C for 30 s, and a final extension at 72°C for 1 min. PCR products were size-separated by electrophoresis on a 1.5% agarose gel, followed by purification and sequencing. Bidirectional sequencing was applied to obtain a full-length sequence. The forward and reverse sequences obtained were queried against the NCBI Nucleotide BLAST server for similarity searches. All trimmed sequences were then aligned using Clustal W. Intraspecific and interspecific variations were calculated in MEGA 6 to determine the variation within and between species.
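The within-species and between-species comparison described above can be sketched as follows. This is a minimal illustration, assuming a simple uncorrected p-distance; the text does not state which distance measure was chosen in MEGA 6. The function names and the toy sequences are hypothetical and are not data from this study.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    pairs = [
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def variation_ranges(records):
    """Return (min, max) percent distances within and between species.

    `records` is a list of (species_label, aligned_sequence) tuples.
    A range is None when no pair of that kind exists.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        distance = 100.0 * p_distance(s1, s2)
        (intra if sp1 == sp2 else inter).append(distance)
    return {
        "intra": (min(intra), max(intra)) if intra else None,
        "inter": (min(inter), max(inter)) if inter else None,
    }


# Toy aligned sequences for two placeholder species; illustrative only.
toy_records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGTTCGTGC"),
    ("species_B", "ACGTTCGTGC"),
]
ranges = variation_ranges(toy_records)
print(ranges)
```

Pooling all pairs into two lists keeps the sketch short; a per-species breakdown, as reported for individual Rasbora species, would group the intraspecific list by species label instead.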

Phylogenetic Analysis
A total of 33 COI sequences from the Rasbora fishes in this study, together with thirteen Rasbora COI sequences from the GenBank database (Table 2), were aligned using the Clustal W programme. Species relationships were illustrated by constructing a phylogenetic tree in MEGA 6 using the Maximum Likelihood (ML) method with the Kimura 2-parameter model indicated by the model test. The tree was evaluated with 1000 bootstrap replications to determine the relationships among species of the genus Rasbora.
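To make the Kimura 2-parameter model more concrete, the sketch below computes the pairwise Kimura 2-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. This only illustrates the substitution model; it is not the ML tree search or the bootstrap procedure performed in MEGA 6, and the function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences differing by a single transition (C -> T) at the last site.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(f"K2P distance = {d:.4f}")
```

The model treats transitions and transversions separately because transitions usually occur more often, so the corrected distance is slightly larger than the raw proportion of differing sites.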

RESULTS
Phylogenetic Relationships Inferred from Partial COI Gene Analysis
Sequencing of the partial COI gene yielded sequences of approximately 651 bp and confirmed the absence of indels and stop codons. Alignment of the partial COI gene from the 33 specimens and 13 additional Rasbora sequences from NCBI revealed 221 variable sites, of which 205 were parsimony-informative. No genetic diversity was detected within R. borapetensis, R. dusonensis and R. pauciperforata, each of which showed 0.0% intraspecific variation. Genetic variation ranged from 0.0% to 8.8% in R. caudimaculata, 0.3% to 0.8% in R. einthovenii, 0.0% to 0.5% in R. sarawakensis and 0.0% to 3.5% in R. sumatrana, indicating low genetic diversity within each of the Rasbora spp. The maximum likelihood phylogenetic tree (Fig. 3) revealed that six out of eight Rasbora species (R. borapetensis, R. dusonensis, R. argyrotaenia, R. einthovenii, R. pauciperforata and R. sarawakensis) are monophyletic. In other words, all Rasbora species formed monophyletic groups except for R. caudimaculata and R. sumatrana. A close relationship was observed between R. caudimaculata and R. sumatrana in the ML tree, with a bootstrap value of 100%. Low divergence between these two species was also found, with interspecific variation between R. caudimaculata and R. sumatrana ranging from 0.8% to 3.3%. The intraspecific variations of R. caudimaculata and R. sumatrana were 0.0% to 8.8% and 0.0% to 3.5%, respectively. R. sumatrana and R. caudimaculata were thus found to be genetically similar and closely related despite the distinguishable morphological appearances determined via morphometric analysis.

DISCUSSION
The discriminant analyses were carried out to determine the functions that contributed to clustering the data into distinct groups. Discriminant analysis is essential for deciding whether a characteristic is capable of differentiating an individual species from a seemingly indistinguishable subcategory of organisms (Kočišová & Mišanková 2014). In the SDFA of the morphometric data, Function 1 contributed the most to distinguishing Rasbora species, accounting for 93.8% of the variance with a Wilks' lambda score of 0.00. Wilks' lambda describes discriminatory power, with 0.0 denoting the best discriminatory power and 1.0 denoting no discriminatory power (Chen et al. 2011).
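The relationship between the eigenvalues of the discriminant functions, the percentage of variance per function and Wilks' lambda can be sketched as follows, using the standard identities that the variance share of function i is its eigenvalue divided by the sum of eigenvalues and that Wilks' lambda is the product of 1/(1 + eigenvalue) over the functions. The eigenvalues below are hypothetical and are not the values obtained in this study; the function names are likewise illustrative.

```python
from math import prod


def wilks_lambda(eigenvalues):
    """Wilks' lambda for a set of discriminant functions.

    Values near 0 indicate strong group separation; values near 1 indicate none.
    """
    return prod(1.0 / (1.0 + ev) for ev in eigenvalues)


def variance_explained(eigenvalues):
    """Percentage of between-group variance attributed to each function."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for two discriminant functions; illustrative only.
example_eigenvalues = [9.0, 1.0]
lam = wilks_lambda(example_eigenvalues)
shares = variance_explained(example_eigenvalues)
print(f"Wilks' lambda = {lam:.3f}, variance shares = {shares}")
```

A first function with a very large eigenvalue therefore produces both a high variance share and a Wilks' lambda close to zero, which is the pattern reported for Function 1.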
In Function 1, pre-anal length (PrAL) played the most noteworthy role in distinguishing Rasbora species, with the highest loading in the analysis, followed by head length (HL), head width (HW), prepelvic length (PrPvL), pre-dorsal length (PrDL), body depth (BD), dorsohypural distance (DHD), upper-caudal lobe length (UcLL), head depth (HD) and pelvic length (PvL). The dominant contribution of Function 1 was also reflected in its high cumulative percentage (93.8%), indicating that SDFA was capable of separating the Rasbora samples into distinct species. The analysis concluded with group prediction for the Rasbora fishes, in which 100% of specimens were correctly classified across all eight distinct species. Predicted group membership was based on the high degree of similarity of the characteristics calculated in the analysis (Kočišová & Mišanková 2014).
The maximum likelihood phylogenetic tree constructed in this study showed six monophyletic Rasbora clades out of the expected eight. Both differences and resemblances were detected between this tree and that of Liao et al. (2010), who constructed a phylogenetic tree based on 41 morphometric parameters. For example, R. einthovenii and R. argyrotaenia are closely located in Liao et al. (2010) but far apart in this study. Conversely, R. borapetensis was placed in a clade close to that of R. argyrotaenia in this study, whereas the two are far apart in Liao et al. (2010). R. einthovenii and R. caudimaculata were closely located in both trees (Liao et al. 2010). Similarities and differences were also observed in comparison with the Rasbora phylogenetic tree constructed by Kusuma et al. (2016) using four different DNA barcode genes: opsin, COI, Cytb and RAG1. For instance, R. borapetensis, R. dusonensis and R. argyrotaenia are closely located to one another in Kusuma et al. (2016) as well as in this study. One difference is that R. caudimaculata is located far from R. sumatrana in Kusuma et al. (2016), whereas that is not the case in this study.
The phylogeny of the genus Rasbora is only partially resolved in this study, with some Rasbora species found to be non-monophyletic. R. caudimaculata and R. sumatrana together formed a single monophyletic group rather than separate ones, which could possibly be explained by the presence of cryptic diversity among the Rasbora fishes. In addition, the R. caudimaculata sequences mostly matched R. sumatrana sequences from the GenBank database with very high similarity (94.0%) and an expect value of zero, suggesting that the two might be considered the same species. Hebert et al. (2003) highlighted that typical intraspecific variations are less than 3%. In this study, most of the species were correctly classified, with intraspecific divergences of less than 3%, except for R. caudimaculata and R. sumatrana with 8.8% and 3.5%, respectively. Their interspecific variation is relatively low, at most 3.3%, which contradicts the premise underlying barcode-based species identification, namely that intraspecific variation is significantly lower than interspecific variation within the barcoding region (Meyer & Paulay 2005). Thus, it is suggested that there is a lack of genetic differentiation at the barcode site between these two species (Rubinoff 2006). This result supports the challenges for barcoding identified by Mallet and Willmott (2003), as very little sequence variation occurs between these closely related species.
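The reasoning in this paragraph can be laid out as a small check, using only the percentages reported above: the maximum intraspecific variation of each species is compared with the 3% heuristic and with the minimum interspecific variation between the two species. The function name `barcode_gap_report` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
# Threshold (percent) from the "less than 3%" heuristic cited in the text.
THRESHOLD = 3.0

# Maximum intraspecific variation (percent) reported in this study.
intraspecific_max = {"R. caudimaculata": 8.8, "R. sumatrana": 3.5}

# Interspecific variation range (percent) between the two species.
interspecific_range = (0.8, 3.3)


def barcode_gap_report(intra_max, inter_range, threshold=THRESHOLD):
    """Flag species whose variation breaks the barcoding assumptions."""
    inter_min, _inter_max = inter_range
    report = {}
    for species, value in intra_max.items():
        report[species] = {
            # True when within-species variation reaches the 3% heuristic
            "exceeds_threshold": value >= threshold,
            # A barcode gap needs within-species variation below the
            # smallest between-species variation
            "has_barcode_gap": value < inter_min,
        }
    return report


report = barcode_gap_report(intraspecific_max, interspecific_range)
for species, flags in report.items():
    print(species, flags)
```

Both species exceed the threshold and neither shows a barcode gap, which is the numerical basis for the statement that COI alone does not separate them.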
In short, R. caudimaculata and R. sumatrana are distinguishable morphologically via SDFA, but they remain unresolved by sequence and phylogenetic analysis. A similar phenomenon has been observed in Garra imberba and its related species (Wang et al. 2014) as well as in Paragonimus bangkokensis and P. harinasutai (Ngoc Doanh et al. 2009). With this in mind, we exclude neither the possibility that cryptic species are present nor the possibility that the two are indeed the same species until further morphological and molecular investigations are carried out.

CONCLUSION
In this study, all 103 Rasbora fishes from Sarawak were classified into eight species using discriminant analyses, with pre-anal length providing the greatest discriminatory power. The ML phylogenetic tree, based on COI sequences from this study and from the GenBank database, revealed six monophyletic Rasbora species. R. sumatrana and R. caudimaculata did not form their own monophyletic clusters owing to their high sequence similarity, despite being distinguishable morphologically. Further research on species relationships among loosely related Rasbora species is necessary, using two or more markers from the nuclear genome, and the combination of morphological and genetic approaches should be retained to provide sufficient evidence for studies of Rasbora cryptic diversity.