Comparative mitochondrial phylogeography of water frogs (Ranidae: Pelophylax spp.) from the southwestern Balkans

The genus Pelophylax (water frogs) includes relatively common, widely distributed, and even invasive species, but also endemic taxa with small ranges and limited knowledge concerning their ecology and evolution. Among poorly studied species belong endemics of the southwestern Balkans, namely Pelophylax shqipericus , P. epeiroticus and P. kurtmuelleri . In this study, we focused on the genetic variability of these species aiming to reveal their phylogeographic patterns and Quaternary history. We used 1,088 published and newly obtained sequences of the mitochondrial ND2 gene and a variety of analyses, including molecular phylogenetics and dating, historical demography


Introduction
The southwestern part of the Balkan Peninsula is characterized by heterogeneity in topography, habitats, and microclimate (Griffiths and Frogley 2004;Reed et al. 2004;Savić 2008). This region is demarcated by two mountain massifs, the Dinarides and Hellenides, and extends from the Neretva River to Peloponnese (Savić 2008). The region is also characterized by steep mountains and deep valleys that form natural barriers for latitudinal dispersal, and thus, may affect species distribution and diversity (Podnar et al. 2014;Jablonski et al. 2016;Psonis et al. 2018). The current genetic diversity and species richness in the southwestern Balkans are results of the complex paleogeographical history of the region that was accompanied by radiation and speciation of animal taxa during the Miocene and the Pliocene. A series of paleoclimatic fluctuations during the Pleistocene have led to subsequent allopatric differentiation of populations that resulted in high intraspecific diversity (Podnar et al. 2014). The southwestern Balkans is well known as a centre of endemism (e.g., Radea et al. 2017;Sfenthourakis and Hornung 2018;Allegrucci et al. 2021;Mermygkas et al. 2021) including amphibians and reptiles. Endemic species or unique evolutionary lineages are reported in salamanders (Recuero et al. 2014;Pabijan et al. 2017), toads (Fijarczyk et al. 2011;Dufresnes, Mazepa et al. 2019a), frogs (Dufresnes et al. 2013;Jablonski et al. 2021), lizards Psonis et al. 2017;Kiourtsoglou et al. 2021;Strachinis et al. 2021), or snakes (Guicking et al. 2009;Musilová et al. 2010;Mizsei et al. 2017;Jablonski et al. 2019). The southwestern Balkans is inhabited by three endemic water frog taxa of the genus Pelophylax. Although we have preliminary views regarding their evolutionary relationships (e.g., Lymberakis et al. 2007;Plötner et al. 2012;Hofman et al. 2015), the range-wide intraspecific genetic variability and phylogeography have never been studied.
The last and taxonomically disputed species is the Balkan water frog P. kurtmuelleri (Gayda, 1940). Populations belonging to this species were described as Rana balcanica Schneider & Sinsch, 1992 based on bioacoustics and morphometry (Schneider and Sinsch 1992;Schneider et al. 1993). Rana balcanica is now a junior synonym of P. kurtmuelleri with the type locality "Arsen presso Petrelli" = Erzen river near Petrelë, Albania (Dubois and Ohler 1994). Although this species is considered to be endemic to the Balkans (Valakos et al. 2008), mitochondrial haplotypes and alleles of some genes affiliated with this taxon were recently discovered also in central and eastern Europe (Kolenda et al. 2017;Litvinchuk et al. 2020, Svinin et al. 2021. The species status of P. kurtmuelleri is controversial for a long time (Speybroeck et al. 2010(Speybroeck et al. , 2020, mostly due to its low mtDNA divergence and shared interspecific polymorphism with P. ridibundus (e.g., Lymberakis et al. 2007;Akın et al. 2010;Plötner et al. 2012;Hofman et al. 2015;Litvinchuk et al. 2020).
Considering the lack of comprehensive information on the genetic diversity and phylogeography of these three water frog taxa inhabiting the Balkan Peninsula, we decided to analyse a dense mitochondrial dataset to (1) describe their genetic diversity based on the mitochondrial ND2 gene, (2) reconstruct their Quaternary history, (3) compare these data with phylogeographical patterns of other amphibians and reptiles to infer general biogeographical scenarios in the southwestern Balkans, and (4) to deduce implications for conservation genetics and species protection.

Study species and sampling
We obtained a total of 996 samples of Pelophylax spp. from 230 localities across the Balkan Peninsula, with a focus on Albania and Greece (see Table S1 and Fig.  S1). We sourced DNA from drops of blood or toe clips of varying ages and sources, which had been stored in 96% ethanol. To identify the species in the field, we used the morphological traits described in Speybroeck et al. (2016) and Papežík et al. (2021). Identification of individuals from areas sympatric for two species was subsequently confirmed through microsatellite loci (for details see Supplementary Material 1).

Lab work and sequence alignments
Genomic DNA was extracted from blood or toe clips using NucleoSpin® Tissue kit (Macherey-Nagel, Düren, Germany) following the manufacturer's protocol. Complete NADH dehydrogenase 2 (ND2) gene 1,038 bp, was amplified using combinations of primer pairs L1 and H2 (Plötner et al. 2008). The following PCR program was used: 2 minutes denaturing step at 94°C, followed by 35 cycles consisting of denaturing for 30 seconds at 94°C, primer annealing for 20 seconds at 63°C and elongation for 1 minute at 72°C, with a final elongation step for 10 minutes at 72°C (modified from Plötner et al. 2008). Purification of PCR products was performed using the ExoSAP-IT enzymatic clean-up (USB Europe GmbH, Staufen, Germany) following the manufacturer protocol. PCR products were one-directionally sequenced with L1 primer using Macrogen Europe commercial services (Amsterdam, The Netherlands). The newly generated sequences were deposited in GenBank under accession numbers OP794049 -OP794075 for P. epeiroticus, OP794076 -OP794097 for P. shqipericus, OP794098 -OP794108 for P. ridibundus and OP820591 -OP820697 for P. kurtmuelleri (Table S1).
For genetic analyses, newly obtained sequences were combined with those already deposited in GenBank (Table  S1). New sequences were manually aligned, checked, and translated to amino acids to detect possible stop codons using Geneious Prime v. 2020.2.4 (Biomatters, Auckland, New Zealand). Five different datasets were prepared. Three ND2 datasets of P. epeiroticus, P. kurtmuelleri/P. ridi bundus, and P. shqipericus were prepared for the evaluation of their intraspecific genetic diversity. The fourth ND2 dataset, which includes the unique haplotypes of studied species, with sequences of other western Palaearctic water frog species (P. perezi, P. saharicus, P. lessonae, P. bergeri, P. cretensis, P. cerigensis and P. cf. bedriagae), and P. plancyi, P. nigromaculatus and Rana graeca as outgroups, was used for reconstruction of their phylogenetic relationships. Finally, the fifth ND2 dataset involves two sequences per phylogenetic lineage of each one of the studied taxa and sequences of the other western Palearctic water frogs was used for the estimation of the divergence times.

Phylogenetic analyses and DNA polymorphism
Phylogenetic relationships were inferred using maximum-likelihood (ML) and Bayesian interference (BI) by RAxML v. 8.1.12 (Stamatakis 2014) andMrBayes v. 3.2 (Ronquist et al. 2012), respectively. PartitionFinder 2 (Lanfear et al. 2017) was used to select the best-fit codon-partitioning schemes and substitution models using a greedy algorithm and the Akaike information criterion (AIC). The best-fit substitution model with each codon position treated separately for the BI and ML analysis was as follows: HKY+G (first position), TrN+I+G (second position), GTR+G (third position) for BI and GTR+G (first position), GTR+I+G (second position), GTR+G (third position) for ML. The best ML tree with the highest likelihood was selected from 20 generated trees. The clade support in the ML was assessed by 1,000 bootstrap pseudoreplicates. The BI analysis was set as follows: two separate runs with four chains in each run, 10 million generations sampled every 100 generations. The convergence of both runs was confirmed by the average standard deviation of split frequencies and potential scale reduction factor diagnostics (P < 0.01). The first 20% of trees were discarded as the burn-in after inspection for stationarity of log-likelihood scores of sampled trees in Tracer v. 1.7 (Rambaut et al. 2018). A consensus tree was constructed from the post-burn-in samples.
Molecular dating was performed using the calibration of the known split of P. cretensis that the most likely occurred at the end of the Messinian salinity crisis (~5.5-5.3 Mya) (Beerli et al. 1996;Lymberakis et al. 2007, Dubey and. Divergence times were estimated using 28 sequences of ND2, representing two sequences per phylogenetic lineage that were divergent more than 1% on p distances of the studied species (P. epeiroticus, P. kurtmuelleri, P. ridibundus, P. shqipericus) and represent thus deeper intraspecific structure. To provide well-resolved phylogeny, selected sequences of the genera Pelophylax and Rana from GenBank were used (see Table S2 for GenBank accessions numbers). BEAST 2 package (Bouckaert et al. 2014) was used to show time-measured phylogeny and its credibility intervals. The above-mentioned calibration point (5.4 Mya; 95% HPD 5.9-4.8) was tested together with the model of nucleotide substitution in PartitionFinder 2 under the AIC. Two distinct runs were carried out, each one with four independent chains of 20 million iterations. Initial two million iterations were excluded as burn-in. Results were inspected using Tracer to check the convergence and effective sample size. For the visualization of the final time-calibrated phylogeny we used TreeAnnotator included in BEAST 2 package.
Inter-and intraspecific genetic structure were also analyzed using Principal Component Analysis (PCA) implemented in 'adegenet' package (Jombart et al. 2008) as a part of R statistical environment, v. 4.1.3 (R Core Team 2022).
For a better presentation of the intraspecific evolution, the parsimony network analysis with the algorithm of TCS (Clement et al. 2000) implemented in PopArt v. 1.7 (http://popart.otago.ac.nz) was used.
The nucleotide diversity index (π), the number of haplotypes (h), the haplotype diversity (h d ), the number of segregating sites (S), and Watterson's theta per site (θW) were used to estimate the genetic diversity of each species and intraspecific lineages using DnaSP v. 6.00 (Rozas et al. 2017). The same software was used for average values of uncorrected p distances among species and intraspecific lineages.

Demography
Demographic history was analysed in BEAST 2, using Bayesian Skyline plots (BSP, Drummond et al. 2005). The strict molecular clock with the mean substitution rate of the initial value 0.016 substitution/site/Mya (based on the molecular dating calibration, see above) was set for each detected taxon or intraspecific lineage. Partition-Finder 2 (Lanfear et al. 2017) was used for the best-fit codon-partitioning schemes selection and substitution models under the (AIC). The best-fitting substitutional model was HKY for all tested alignments, i.e., P. epeiroticus and P. shqipericus, and P. kurtmuelleri and P. ridibundus, both tested separately and together as one evolutionary clade P. ridibundus/kurtmuelleri (see results). Each BSP analysis was run twice to check the consistency between the runs. The number of generations varied significantly among analyzed datasets, with values from 20 million to 300 million generations, and were sampled every 1,000 generations. The first 10% were discarded as a burnin. The final BSP, effective sample sizes (ESSs) of parameters >200 and convergence of parameter estimates across runs were prepared and checked with Tracer v. 1.7. with the maximum times as the median of the root height parameter. Historical population expansions in the whole dataset and within detected lineages were tested using Tajima's D (Tajima 1989), Fu's Fs (Fu 1997) and Ramos-Onsins, Rozas's R2 statistics (Ramos-Onsins and Rozas 2002), and pairwise mismatch distribution (MSD) analyses (Rogers and Harpending 1992) in DnaSP v. 6.00. The analyses were carried out with 10,000 coalescent simulations of Tajima's D, Fu's Fs and Ramos-Onsins and Roza's R2.

Species distribution modelling
With the combination of literature and our own field data, we altogether collected 101 records for P. epeiroticus (Sofianidou and Schneider 1989;Haxhiu 1994;Beerli et al. 1996;Sofianidou 1996;Guerrini et al. 1997;Mellado et al. 1999;Casola et al. 2004;Oruçi 2008;Akın et al. 2010;Hofman et al. 2015;Sagonas et al. 2019), 53 for P. shqipericus (Hotz et al. 1985(Hotz et al. , 1987Uzzell 1982, 1983;Haxhiu 1994;Schneider and Haxhiu 1994;Sinsch and Schneider 1996;Guerrini et al. 1997 Radojičić et al. 2015;Polović and Čađenović 2014;Hofman et al. 2015;Frank et al. 2018;Vucić et al. 2018) and 212 records for Balkan populations of P. kurtmuelleri (own field data). The presence coordinates for each species were filtered to remove potentially inaccurate records. Coordinates outside the Balkan Peninsula (sensu Džukić and Kalezić 2004) were discarded. Records outside of the 90% range of a kernel distribution fitted to the data were also discarded using the adehabi-tatHR package (Calenge 2006). Finally, a spatially balanced resampling was performed to account for potential spatial bias by iteratively (n = 1,000) selecting records according to the non-overlapping buffer of 0.05° retaining the maximum sample size. To build the SDMs we used the BIOCLIM variables as environmental response variables, obtained from the WorldClim v. 2.1 database for current conditions (derived from meteorological data recorded between 1970-2000) at a resolution of 2.5 arc minutes (~3.5×4.6 km raster cell resolution at the study area) (Fick and Hijmans 2017). For the LGM (the Last Glacial Maximum) past climate conditions, we used the downscaled climate data at the same resolution as for the current climate (2.5') from simulations with three Global Climate Models (GCMs): (i) CCSM4, (ii) MIROC-ESM and (iii) MPI-ESM-P, which were produced using the bias-corrected WorldClim 1.4 as baseline current climate-based data recorded between ~1960-1990~1960- (Hijmans et al. 2005. Preparation of environmental data and masking to the study area was done using the raster package (Hijmans 2022). Variable selection for each species was performed to reduce multicollinearity and variance inflation in the modelling by calculating the pairwise correlations among the variables, and among each pair of variables correlated above the threshold r>0.7. We excluded the variable with either the highest variance inflation factor (VIF) or the least significant or least informative bivariate relationship with the species presence/pseudo-absence data using the fuzzySim package (Barbosa 2015). We built SDMs for each studied Pelophylax species using the biomod2 package, which allowed us to build ensemble models from several different modelling algorithms, and to project that ensemble model in space and time (Thuiller et al. 2022). We created four presence/pseudo-absence datasets for each species by generating 5,000 random points as pseudo-absences. We used five linear and machine learning model algorithms to reduce uncertainties in different modelling approaches: Generalized Linear Models (GLM), Generalized Boosting Models/Boosted Regression Trees (GBM), Surface Range Envelop (SRE), Flexible Discriminant Analysis (FDA), Maximum Entropy (MaxEnt). We ran four modelling replicates for each presence/pseudo-absence dataset with a data split to 70% training and 30% testing data. To evaluate the individual model replicates we used the True Skills Statistic (TSS) metric, models were included in the ensemble model above TSS>0.7. Models building were done using current environmental conditions and ensemble models were projected for the LGM by projecting ensemble models for each GCM separately. Finally, the LGM projections for each species were combined by calculating the mean of the projections and were set to binary by the threshold of 0.8 suitability index. All steps of generating SDMs were done in R v. 4.1.3.

Phylogeny and phylogeography
Combining newly produced and available GenBank data, we obtained a dataset of 1,088 ND2 sequences (alignment length 1,038 bp) from four water frog taxa: P. epeiroticus (127 sequences), P. kurtmuelleri (780), P. ridibundus (112), and P. shqipericus (69). Both trees, ML and BI have identical topologies regarding the three major clades (P. epeiroticus, P. shqipericus and P. ridibundus/kurtmuelleri) with lnL = -7188.48 and -7744.64, respectively. Phylogenetic trees as well as haplotype networks and PCAs (Figs 1, S1) showed deep differentiation in both P. epeiroticus (genetic distance 2.1%) and P. shqipericus (genetic distance 1.6%) with two intraspecific lineages, named, due to their geographic positioning, Northern and Southern (Figs 2, 3). Moreover, possible third lineage of P. epeiroticus were detected in both phylogenetic network and PCA (Figs 1, 2). In contrast, the genetic differentiation within and between kurtmuelleri and ridibundus lineages (Fig. 4) was low and formed closely related clades in phylogenetic trees. Genetic differentiation of these clades was lower than between the Northern and Southern lineages in P. epeiroticus and P. shqipericus. (Table 1, Fig. 4). Thus, we will further call these three focal clades as P. epeiroticus, P. shqipericus and P. ridibundus/kurtmuelleri.
According to phylogenetic analyses and molecular dating (Fig. 5), studied taxa of the genus Pelophylax formed two main groups that diverged at 9.8 Mya (11.4-8.2 of 95% HPD). The first group was formed by species P. shqipericus, P. bergeri and P. lessonae, the second group by species P. epeiroticus, P. cretensis, P. bedriagae, P. cypriensis, P. ridibundus/kurtmuelleri. The Northern and Southern lineages in P. shqipericus diverged at 0.8 Mya (1.2-0.5 of 95% HPD), the Northern and Southern lineages in P. epeiroticus at 0.9 Mya (1.3-0.6 of 95% HPD), and the estimated split between kurtmuelleri and ridibundus lineages dated back to 0.6 Mya (0.9-0.4 of 95% HPD).  In P. epeiroticus 31 unique haplotypes in 127 sequences with h d = 0.87 and π = 1.14% (Table 1) were recognized. The Northern lineage is distributed in southern Albania, north-western Greece, down to the Ambracian Gulf (Fig.  2) and it is formed by 13 haplotypes (h d = 0.79 and π = 0.22%; Table 1). The Southern lineage is distributed from the Ioannina Lake, Ambracian Gulf, southwestern Greece, and along the north-western coast of the Peloponnese to Kaiafa Lake (Fig. 2). This lineage is represented by 18 haplotypes (h d = 0.72 and π = 0.23%; Table 1). Uncorrected mean p distance between the lineages is 2.09%.
Pelophylax shqipericus is represented by 69 sequences of ND2 with 25 unique haplotypes, h d = 0.91 and π = 0.92% (Table 1). Its Northern lineage is distributed around Lake Skadar and along the northern Adriatic coast of Albania (Fig. 3) with the southernmost locality detected in Nishaj. This lineage contains 11 haplotypes (h d = 0.77, π = 0.21%; Table 1). The Southern lineage is distributed from Velipojë, south to Orikum where is the southern edge of the species range (Fig. 3), although one individual of this lineage was detected also in Virpazar, Montenegro. The Southern lineage is represented by 14 haplotypes (h d = 0.93 and π = 0.43%; Table 1). Both lineages were found in sympatry at the following localities: Nishaj, Velipojë, Virpazar, and Torovicë. The uncorrected mean p distance between the two lineages is 1.59%.
In the P. ridibundus/kurtmuelleri clade 141 unique haplotypes were identified among 892 sequences, with h d = 0.92 and π = 0.63%. In the kurtmuelleri lineage, we identified 126 haplotypes among 780 sequences (h d = 0.91, π = 0.43%). We detected haplotypes of the kurtmuelleri lineage in Albania, Greece, Northern Macedonia, Bulgaria, Kosovo, Montenegro, Serbia, Croatia, Bosna and Hercegovina, Romania, Ukraine (collected in Kyiv and confirmed by J. Plötner, pers. comm. 2022, GenBank accession number AM900651), Germany, Lithuania, Latvia, Russia, Italy, and France. Only a weak phylogeographic structure was observed in the kurtmuelleri lin- Figure 3. The phylogenetic position, haplotype network, and distribution of detected lineages and haplotypes of Pelophylax shqipericus. Symbol sizes in the haplotype network reflect sequence frequencies. Small black circles are missing node haplotypes, each line connecting two haplotypes corresponds to one mutation step, if not otherwise indicated the number along the line. The shaded area represents the species range. The colour scheme of the haplotype networks and species ranges corresponds with the one used in PCAs (Fig. 1) and the molecular phylogeny (Fig. S1). eage (Figs 4, S2). In contrast, ridibundus lineage showed much lower genetic variability with only 15 haplotypes identified in 112 sequences, and with h d = 0.42 and π = 0.05%. Haplotypes of ridibundus lineage were detected in Greece, Bulgaria, Serbia, Croatia, Romania, Slovakia, Poland, Russia, Italy, and France (Fig. 4). Uncorrected mean p distance between haplotypes representing kurtmuelleri lineage and ridibundus lineage was 1.37%.

Demography
Bayesian skyline plot (BSP) for P. epeiroticus (Fig. 6) showed population expansion started ~25 Kya. On the other hand, the mismatch distribution (MSD) and the neutrality tests did not show significant evidence for population growth (Table 1). When demographic analyses were carried out for the Northern and the Southern lineage separately, the Northern lineage of P. epeiroticus showed population growth starting at ~25 Kya according to BSP (Fig. 6) but population stability according to MSD and neutrality tests. In the Southern lineage, BSP (Fig. 6) showed moderate expansion in population size (starting at ~75 Kya) supported also by MSD neutrality tests.
In P. shqipericus, population expansion based on BSP started approximately at ~25 Kya. MSD and neutrality tests, however, showed no significant population growth (Table 1). In the Northern lineage, BSP showed population expansion starting at ~35 Kya, however, MSD and neutrality test did not corroborate the expansion model. Similarly, BSP of the Southern lineage assumed recent expansion, starting at ~25 Kya, MSD and neutrality tests were non-significant.
Analysing both ridibundus/kurtmuelleri lineages together, the populations started the expansion ~30 Kya. The kurtmuelleri lineage alone started the expansion at ~30 Kya, followed by another steep expansion at ~12 Kya according to BSP (Fig. 6). This result is further supported by MSD and neutrality tests that both showed statistically significant evidence of population expansion (Table 1). The BSP of the ridibundus lineage showed population expansion starting ~17 Kya, which is consistent with MSD, where observed values mirrored expected values in the growing or declining population model (Fig. 6). In addi- Figure 4. Geographical range of the clade P. ridibundus/kurtmuelleri, haplotype network, distribution of detected lineages, and their haplotypes. For the detailed phylogeographic structure of kurtmuelleri haplotypes see Fig. S2. Symbol sizes in the haplotype network reflect sequence frequencies. Small black circles are missing node haplotypes, each line connecting two haplotypes corresponds to one mutation step. An uncertain contact zone between detected lineages/haplotypes is indicated by question marks. The black arrow represents the affiliation of the introduced population of the P. ridibundus/kurtmuelleri clade in southern Italy based on 517 bp only (MK116532). The colour scheme of haplotype networks and species ranges corresponds with the one used in PCAs (Fig. 1) and the molecular phylogeny (Fig. S1).   tion, also the neutrality test showed significant evidence of population expansion (Table 1).

Current and past distribution models
From the initial 366 occurrence data points (27 unique localities for P. epeiroticus, 20 for P. shqipericus and 198 unique localities for P. kurtmuelleri), 266 (58 for P. epeiroticus, 41 for P. shqipericus, and 167 for P. kurtmuelleri) occurrence data points were included in the final models. From the predictors after the variance inflation factor (VIF) analysis, we ended up with six, seven, and six, respectively, out of 19 initial predictors. For P. epeiroticus these were bio1 (annual mean temperature), bio2 (mean diurnal range -max/min temperature, monthly average), bio3 (isothermality), bio4 (temperature seasonality), bio8 (mean temperature of the wettest quarter), and bio13 (precipitation of wettest period) (Fig. S3). For P. shqipericus bio1, bio3, bio4, bio8, bio9 (mean temperature of the driest quarter), bio14 (precipitation of driest period), and bio15 (precipitation seasonality). For P. kurtmuelleri bio1, bio2, bio8, bio9, bio15, bio19 (precipitation of coldest quarter). The models under current climate conditions had robust evaluations metrics for both P. epeiroticus and P. shqipericus, but in the case of P. kurtmuelleri the algorithms not performed well (Fig. S3). The spatial projection of ensemble models corresponds well with the known range of the studied species (Figs 1, 2, 4). All three models of the LGM provided similar projections with the most suitable areas for P. epeiroticus and P. shqipericus in SW Balkans, and in the central and southern Balkans for P. kurtmuelleri. The LGM models for all taxa exhibit a potentially larger area of distribution than in the current model and observed occurrence (Fig. 7). The currently observed non-overlapping ranges were suitable for studying taxa also during the LGM according to the model predictions.

Discussion
Water frogs of the genus Pelophylax are among the most studied European amphibians (e.g., Plötner 2005). Although many studies focused on their unique reproduction mode associated with hybridogenesis, available data on their genetic diversity, or evolutionary history on a broad-geographic scale, are regionally limited, scattered in the literature, or missing. This is also a case of endemic Pelophylax taxa from the Balkan Peninsula, as they were studied only marginally from the range-wide phylogeographic point of view (Lymberakis et al 2007;Hofman et al. 2015;Vucić et al. 2018).

Historical biogeography
Our single gene molecular dating showed that the divergence of studied water frog taxa occurred between 9.8 and 2.5 Mya and was probably related to geological and climatic changes in the Balkans. During the Miocene, the opening of the mid-Aegean trench (~9-12 Mya) and the Messinian Salinity Crisis (MSC; ~5.3 Mya) were both well-discussed events as major factors of the evolution of local biota, including herpetofauna Lymberakis and Poulakakis 2010). These events lead to the isolation of Crete from the mainland, i.e., Peloponnese (Paulakakis et al. 2015) and the local biota which occurred there. Although some authors hypothesized the isolation of Crete already during the upper Miocene (~9 Mya) and used it as a calibration point for molecular dating , the isolation of the island associated with the end of the MSC is well reviewed (Dermitzakis 1990;Poulakakis et al. 2015) and used in similar studies associated to the evolution of Mediterranean vertebrates Dufresnes et al. 2018a;Spilani et al. 2019;Kiourtsoglou et al. 2021). We thus decided to select the end of MSC as a calibration point isolating P. cretensis from other congeners. Applying this calibration point, we showed that intraspecific divergence in P. shqipericus, P. epeiroticus and P. ridibundus/kurtmuelleri occurred in Pleistocene (0.9 to 0.6 Mya) and might have been connected to the Early-Middle Pleistocene transition (EMPT; 1.4-0.4 Mya). The EMPT was a period of significant climatic turnover in the western Palearctic Gibbard 2005, 2015;Maslin and Brierley 2015) leading to the increased amplitude of climatic oscillations during the Pleistocene. These changes resulted in an overall widespread temperature decrease, ice volume increase, sea-level fluctuations, aridification, and habitat changes (Kahlke et al. 2011), especially during the so-called "0.9 Mya event" (Strani et al. 2019). The climatic shift reached its peak at approximately 0.9 Mya with an extremely dry climate during both glacial and interglacial periods. At the late stages of the EMPT (0.9-0.4 Mya) intensive cold periods with lowland glaciation occurred, which were followed by warmer periods, with increasing humidity leading to a variety of habitats and the development of forested landscapes (Suc et al. 1995;Bertini 2000;Kahlke et al. 2011;Head and Gibbard 2015). The impact of EMPT on the faunal composition is well-known for mammals (e.g., Kahlke et al. 2011;Strani et al. 2019), but poorly studied in other vertebrates, such as amphibians (Blain et al. 2018).
Another paleoclimatic event that has significantly affected the genetic diversity and range dynamics of the biota has been the Last Glacial Maximum (LGM; ~21 Kya; Hewitt 1999Hewitt , 2011. The long-term in situ survival of populations in refugial areas has led to high genetic diversity and population differentiation. Subsequent rapid postglacial recolonization of the northern parts of Europe from refugia resulted in lower genetic diversity of the colonizing populations. This pattern of "the southern richness" and "the northern purity" (Hewitt 2004) is apparent also in studied water frogs. Based on the dispersal ability of a species and the opportunities for post-glacial expansion to the north, a lineage or species can be characterized either as a post-glacial re-colonizers (Hewitt 1999(Hewitt , 2000 or as a refugial endemics with limited expansion poten-tial (Bilton et al. 1998). While the Balkan species/populations of Pelophylax show higher genetic diversity, ridibundus lineage, which colonized most of Europe, seems to be genetically uniform according to our data (Figs 1-4). Similarly, the haplotype diversity of kurtmuelleri lineage in the Balkans seems enormous compared to its haplotype diversity found in northerly located European populations (Figs 4, S2). Pelophylax epeiroticus and P. shqipericus thus represent refugial endemics, while P. ridibundus/kurtmuelleri might be characterized as post-glacial re-colonizers.
Based on the distribution of the haplotype diversity in the Balkans, it could be hypothesized that the microrefugial model might be applied to the kurtmuelleri lineage. Some haplogroups of recent origin have limited distribution and therefore we can assume they could have survived the LGM in Dalmatia ("orange" haplogroup), Peloponnese ("pink" haplogroup), coastal Albania ("green" haplogroup) or Albania and western Greece ("red" and "blue" haplogroups). Unfortunately, the geographic pattern of several haplogroups is mixed, a fact that complicates the detection of microrefugia. The Pleistocene history of the most widely distributed "yellow" haplogroup remains also unclear (Figs 4, S2). If the "yellow" haplogroup had limited distribution during the LGM, it had to expand rapidly (as suggested by demographic analyses) not only to other parts of the Balkans but also to vast areas of western, eastern, central, and northern Europe.
It seems that during the LGM, the southwestern Balkans hosted favourable conditions for water frogs and their distribution did not significantly change (Fig.  7). Considering details from haplotype networks of P. epeiroticus, the south-western continental Greece and northern Peloponnese were probably the suitable areas for the Southern lineage and north-western Greece for the Northern lineage (Fig. 2). In P. shqipericus, we can estimate the location of these areas for the Northern lineage based solely on the haplotype network, lying in the Skadar Lake lowland (Fig. 3). The Skadar Lake system represents the area of local endemism and harbours unique lineages in several animal groups (Grabowski et al. 2014;Pešić and Gloër 2018;Jabłońska et al. 2020) and thus, we might expect that this area had an impact also on the evolutionary history of the Northern lineage. However, the suitable area of the Southern lineage remains questionable. Due to the limited range of the Southern lineage, the whole Adriatic coast of Albania might represent refugium itself for this population. Although this uncertainty might be also a result of sampling bias or extinction of populations with peripheral haplotypes. On the other hand, populations of the kurtmuelleri lineage diverged probably in several microrefugia scattered across the peninsula, especially in the southwestern part (Fig. S2), as mentioned above. Observed resolution may suggest historical population admixture that is unprecedented in the observed phylogeographic patterns of studied Balkan amphibians (e.g., Sotiropoulos et al. 2007;Fijarczyk et al. 2011;Dufresnes et al. 2013;Jablonski et al. 2021). It calls, however, for further research where the influence of the rapid natural dispersion/invasion with a possible human-me-diated introduction (see Dufresnes et al. 2018b) should be tested (see Conservation genetic implications). In the case of the ridibundus lineage, insufficient sampling covering only limited parts of the whole range and shallow mitochondrial structure (single haplogroup) through the species range suggest possible rapid colonization, but localization of glacial refugia makes rather impossible.
Except for the populations of the P. ridibundus/kurtmuelleri clade, the endemic species showed range stability or even retraction during the LGM (see also Fig. 7). In contrast, according to BSP of P. epeiroticus, both lineages showed population growth starting at approximately between 20-25 Kya. Based on neutrality tests, the Southern lineage assumes population expansion, whereas no sign of population expansion in the Northern lineage was found (see also Fig. 2). The Northern lineage is endemic to the area between southern Albania and the Ambracian Gulf, including Corfu Island which was several times connected and separated from the mainland during the last 1 Mya (Perissoratis and Conispoliatis 2003). Thus, we expect that this area could represent a long-term microrefugium of the Northern lineage. If we admit the possibility of expansion of the Southern lineage (according to BSP and neutrality tests), it would be probably along coastal areas of the Peloponnese and southern continental Greece up to Ioannina Lake. In the Ioannina Lake and Ambracian Gulf, both lineages were detected in sympatry. For P. shqipericus, population growth started approximately between 35 and 25 Kya in the Southern and Northern lineages, respectively. The statistical sign of the expansion is evident in the Southern lineage forming a contact zone with the Northern lineage in northern Albania and possibly Skadar Lake (see Fig. 3). The Northern lineage is distributed mostly in the vicinity of the Skadar Lake, including the type locality of the species in Virpazar. The Southern lineage covers most of the Adriatic coast of Albania, i.e., from Orikum to the border region between Albania and Montenegro. One individual of the Southern lineage was surprisingly recorded also further north, near Virpazar, Skadar Lake. If this record is a result of natural dispersion, denser sampling in the future should detect the mixed pattern of these two lineages around Skadar Lake. Our results suggest mostly pre-LGM population growth in both P. epeiroticus and P. shqipericus, with the evidence of further rapid post-glacial growth. As the expected times of population growth are close to the LGM period (21 Kya), results might reflect favourable conditions during the LGM for the frog population growth, although conditions differ significantly. On the contrary, southern parts of the Balkan were probably unaffected by glacials and might host favourable conditions for the population growth of water frogs even when environmental conditions in other parts of Europe were not suitable.
In the kurtmuelleri lineage, the first population growth started at approximately 30 Kya (possibly inside the Balkans only), following by the rapid expansion after the LGM (rest of the currently uncovered European range; Fig. 4 and S2). The ridibundus lineage also shows population growth and expansion corresponding approximately with the post-LGM period. The scenario of the expansion of the kurtmuelleri lineage is supported by the presence of kurtmuelleri haplotypes in the northern part of the peninsula and rarely also in central, northern, and eastern Europe. However, its frequency is decreasing northwards as is evident from own and published data (Herczeg et al. 2017;Kolenda et al. 2017;Litvinchuk et al. 2020). Kolenda et al. (2017) predict the occurrence of kurtmuelleri haplotypes and alleles in the geographic region between the northern Balkans and Poland. This is now confirmed by the presence of kurtmuelleri haplotypes in Ukraine and Germany (Figs 4 and S2). According to Litvinchuk et al. (2020), after the LGM, haplotypes associated with kurtmuelleri lineage possibly expanded north along lowlands in Romania, Serbia, or Hungary. During this expansion, haplotypes and alleles of the kurtmuelleri lineage dispersed to large distances across central, eastern, and northern Europe and met with ridibundus-specific haplotypes and individuals with different mtDNA and nDNA possibly hybridized. However, we cannot exclude this pattern could be also influenced by local human-mediated introductions, similar to western Europe or even northern Africa (e.g., Plötner 2005;Dubey et al. 2014;Dufresnes et al. 2017).

Contrasting genetic diversity
Comparisons among three water frog clades (P. epeiroticus, P. shqipericus and P. ridibundus/kurtmuelleri) of the southwestern Balkans revealed significant differences in their intraspecific genetic diversity. For example, both P. epeiroticus and P. shqipericus formed two, diverged and well-supported lineages (Figs 2, 3). Moreover, the presence of a possible third, separated lineage in P. epeiroticus (needing further attention) was detected particularly using PCAs in northern and central Greece (Fig.  1). However, due to the limited sample size of these individuals, we include them in the Southern lineage. The high nucleotide diversity was observed in both endemics, i.e., species with much smaller ranges compared to kurtmuelleri lineage (Figs 2, 3). However, it supports the hypothesis that the southwestern parts of the Balkans are sources of genetic diversity of endemic species/evolutionary lineages showing limited geographic dispersions (e.g., Marzahn et al. 2016;Psonis et al. 2018). For example, endemics to the southwestern Balkans, Triturus macedonicus or Anguis graeca, show a higher level of genetic diversity compared to other taxa of these genera from temperate Europe Wielstra and Arntzen 2020). This contrast could be connected to the position of areas of long-time survival where lineages or haplotypes evolved, as well as to the natural history of taxa and their capacity for ecological and geographical dispersions or adaptability. A similar pattern could be found also in widespread species in the Balkans (see Sotiropoulos et al. 2007;Pabijan et al. 2014;Jablonski et al. 2016, Wielstra andArntzen 2020). In P. kurtmuelleri, populations in northern and central Europe fall clearly into a single haplogroup ("yellow"), while in the Balkan Peninsula it forms a much more diverged pattern (Fig.   S2). Most of the members of the genus Pelophylax are well known to be associated with water bodies which is probably a limiting factor of their dispersion to the landscape (Mikulíček and Pišút 2012). For example, it is very rare to observe P. epeiroticus out of the water (Sofianidou 1996;Sofianidou and Schneider 1989). Moreover, at least P. epeiroticus has a limited vertical distribution that is up to 500 m a. s. l. (Sofianidou and Schneider 1989). Similarly, the range of P. shqipericus is also limited to lowland areas of Montenegro and Albania. In this sense ca 70 km wide distribution gap between both studied endemic clades in southern Albania is notable. In that gap, the Ceraunian Mountains (up to 2000 m a. s. l.), rising directly from the seacoast, are situated and support the hypothesis of environmental limitations for the population expansion of P. epeiroticus and P. shqipericus due to natural barriers. Thus, we expect that these specific traits may be behind the distribution and the intraspecific genetic difference observed in our data, i.e., limited dispersion and long-term in situ persistence of populations.
On the other hand, the kurtmuelleri lineage is known to be able to live for a certain time out of the water (Sofianidou and Schneider 1989) and is also observed in high elevations (e.g., more than 2,000 m; Szabolcs et al. 2017). It suggests a significantly better dispersion capacity of this lineage (cf. SDM of the kurtmuelleri lineage vs. both endemic taxa, Fig. 7). According to our data, the kurtmuelleri lineage has rich haplotype diversity (126 haplotypes vs. 31 in P. epeiroticus and 25 in P. shqipericus) that geographically present a complex pattern (Figs 4, S2). Overall genetic (nucleotide) variability of the lineage is, however, shallow (see also Vucić et al. 2018) with low nucleotide diversity reflecting isolation in short-lasting time and space windows.
Since the description of P. kurtmuelleri, the taxon was considered the Balkan endemics based on the morphometry (Schneider et al. 1993), bioacoustics (Schneider and Sinsch 1992;Schneider et al. 1993;Lukanov et al. 2015) and supposed inability to induce hybridogenesis in interspecific hybrids (Hotz et al. 1985;Berger et al. 1994;Sinsch and Eblenkamp 1994). On the other hand, P. kurtmuelleri shows only weak morphometrical (Papežík et al. 2021), and genetic differentiation from P. ridibundus in mtDNA (Lymberakis et al. 2007;Hofman et al. 2015;Vucić et al. 2018) as well as nDNA (SAI-1 gene; Kolenda et al. 2017;Papežík et al. 2021). The comparison of full mitogenome sequences showed the same conclusions (about 1% of divergence) where the species level distance among other taxa is more than 5% (Hofman et al. 2015). Thus, it is not surprising that the taxon is considered conspecific for a long-time with P. ridibundus (Speybroeck et al. 2010) and Balkans populations currently recognized under the name P. kurtmuelleri presumably representing a closely related lineage. This is also supported by our data as the genetic distance between these two lineages is less (1.4%) than the same distance between lineages detected inside both studied endemic species, P. epeiroticus (2.1%) and P. shqipericus (1.6%). Low levels of genetic divergence are reflected also in PCAs and molecular dating (see above and Figs 1 and 5). Our improved view thus supports the hypothesis that the independent evolution of the ridibundus and kurtmuelleri mitochondrial lineages seems to be young, starting probably in the Pleistocene (Lymberakis et al. 2007). We could conclude that the mitochondrial variability (our data and Hofman et al. 2015) does not support the species status of P. kurtmuelleri and its taxonomic distinctiveness from P. ridibundus. Nevertheless, the description of hybrid zones and the rate of genetic admixture between these evolutionary lineages should be in the scope of further research (see Dufresnes and Mazepa 2020).
Apart from the natural range, haplotypes and alleles of kurtmuelleri lineage were recorded in several European countries in the last decades (e.g., Herczeg et al. 2017;Kolenda et al 2017;Bruni et al. 2020;Litvinchuk et al. 2020). Kolenda et al. (2017) proposed three hypotheses for the origin of P. kurtmuelleri haplotypes in central and northern Europe: (1) recent introduction followed by hybridization of autochthonous and allochthonous populations, (2) incomplete lineage sorting or, (3) hybridization before or during the post-glacial expansion. Some of the above-mentioned reports represent well-documented introductions of individuals from non-native kurtmuelleri lineage, especially to western Europe (Dubey et al. 2014;Dufresnes et al. 2017Dufresnes et al. , 2018bBruni et al. 2020). In contrast, such introductions to multiple countries in central, northern, and eastern Europe are not reported. Also, ancestral polymorphism, characteristic by admixture of mtDNA between two taxa regardless of distance from secondary contact of both taxa should carry both types of mtDNA (Toews and Brelsford 2012). However, to the best of our knowledge, no individuals of kurtmuelleri lineage with ridibundus specific mtDNA are known. On the contrary, in the case of hybridization and introgression, one of the involved taxa should keep its mtDNA, while the rate of this mtDNA in another taxon decreases with distance from secondary contact (Toews and Brelsford 2012). This seems to be a case of kurtmuelleri lineage haplotypes and alleles as they seem to be quite rare in central, northern, and eastern Europe. Our data thus rath-er support the natural colonization of temperate Europe along lowlands in eastern Balkans and eastern Europe as proposed by Kolenda et al. (2017) and Litvinchuk et al. (2020).

Conservation genetic implications
Populations of amphibians are in decline on a global scale with various factors influencing their extinction (reviewed by Hayes et al. 2010). Even though genetic factors affect population extinction on a long-term scale (Frankham 2005), the loss of genetic diversity can increase extinction risk by insufficient adaptability to environmental changes and by fitness reduction (Spielman et al. 2004;Allentoft and O'Brien 2010). Here, we emphasize that the southwestern Balkan region is an important centre of genetic diversity of amphibians in the Palearctic with implications for conservation.
In the southwestern Balkans, P. epeiroticus and P. shqi pericus represent endemic species with a limited range of several thousand square kilometers (IUCN SSC Amphibian Specialist Group 2020a,b; Jablonski et al. 2021). Both species are known to be threatened by habitat loss and degradation, and increasing levels of pollution of water bodies due to human overpopulation (IUCN SSC Amphibian Specialist Group 2020a,b; Pafilis and Marangou 2020; Saçdanaku et al. 2020;Crnobrnja-Isailović et al. 2022). This applies mostly to P. shqipericus which is present in the Adriatic coastland and lowlands of Albania that were transformed from swamps and marshes into the desiccated extensive agricultural landscape during the last decades (Haxhiu 1994;Frank et al. 2018;Saçdanaku et al. 2020). Moreover, both species are also collected for consumption by restaurants, the food industry, and by local people, even during the breeding season (Fig. S4.;Frank et al. 2018, Pafilis andMarangou 2020). Due to over-collecting for frog legs trade, P. shqipericus was even considered for listing in CITES (Gratwicke et al. 2010). Pelophylax epeiroticus is also still consumed in local restaurants. However, the assessment of both species in the IUCN Red List category was recently lowered to Near Threatened in P. epeiroticus and Vulnerable in P. shqipericus, respectively (Dufresnes 2019; IUCN SSC Amphibian Specialist Group 2020a,b). With respect to detected endemic genetic diversity discovered in both species and their possible increasing vulnerability, we appeal to a re-evaluation of their IUCN Red List assessment. Similarly, both species are listed in Appendix III of the Berne Convention, and they received only a lower level of protection but lack specific law protection in Albania and Greece. Thus, our study may also serve as a breakpoint for species conservation and management. This is especially true for Albania, where both endemic species occur. Limited ranges of both species together with the genetic intraspecific variability may lead to the increased threat of irreversible loss of their genetic diversity. Or such loss would reduce intraspecific diversity and could lead to an increased risk of possible local extinctions. Habitat alteration and anthropogenic pressure are strong factors in areas where these taxa occur. Thus, the formation of specific areas where these taxa and their genetic variability will be protected is necessary. Therefore, this new view should be considered by the conservation legislation, especially in Albania and Greece.