Benchmarking of structural variant detection in the tetraploid potato genome using linked-read sequencing

It has recently been shown that structural variants (SV) can have a higher impact on gene expression variation than single nucleotide variants (SNV) in different plant species. Additionally, SV were associated with phenotypic variation in several crops. However, compared to the established SV detection based on short-read sequencing, fewer approaches have been described for linked-read based SV calling. We therefore evaluated the performance of six linked-read SV callers compared to an established short-read SV caller based on simulated linked-reads in tetraploid potato. The objectives of our study were to i) compare the performance of SV callers based on linked-read sequencing to short-read sequencing, ii) examine the influence of SV type, SV length, haplotype incidence (HI), as well as sequencing coverage on the SV calling performance in the tetraploid potato genome, and iii) evaluate the accuracy of detecting insertions by linked-read compared to short-read sequencing. We observed high break point resolutions (BPR) when detecting short SV and slightly lower BPR for large SV. Our observations highlighted the importance of short-read signals provided by Manta and LinkedSV to detect short SV. Manta and NAIBR performed well for detecting larger deletions, inversions, and duplications. Detected large SV were weakly influenced by the HI. Furthermore, we illustrated that large insertions can be assembled by Novel-X. Our results suggest using the short-read and linked-read SV callers Manta, NAIBR, LinkedSV, and Novel-X with at least 90x linked-read sequencing coverage to ensure the detection of a broad range of SV in the tetraploid potato genome.


Introduction
Structural variants (SV) are commonly defined as genomic rearrangements between individuals or haplotypes that are larger than 49 bp [19]. SV can occur as deletions, insertions, duplications, inversions, or translocations in the genome. SV were more strongly associated with gene expression variation compared to single nucleotide variants (SNV) in human [9] and were also associated with transcript abundance in crops such as maize and tomato [2,57]. Additionally, SV were associated with phenotypic variation in several plant species such as wheat and rice [32,39,56]. In potato, copy number variation at a limited number of loci was associated with the level of gene expression [23].
Due to the technical improvements of DNA sequencing and novel algorithms [19], it is nowadays possible to detect and characterize SV on a genome-wide level. SV detection based on short-read sequencing is well established in human genomics [4,29] and was also evaluated and used recently for plant genomes [16,17]. However, the reliable detection of SV based on short-read sequencing is challenging due to the necessity of confidently mapped read-pairs [14]. Additionally, repetitive regions are associated with the occurrence of SV [21], where split and paired-end reads can have a low mapping quality due to multi-mapping [14]. These issues can be avoided by using long-read sequencing [11]. However, this approach in turn is associated with high costs and therefore, it is less efficient in breeding-related applications.
Recently, linked-read sequencing was proposed [50,51]. For linked-read sequencing, paired-end short reads are derived from 50 to 100 kb DNA molecules [12], which is at least as long as the read length of most long-read sequencing approaches (cf. [53]). During the library preparation process, around ten molecules are partitioned into droplets where each DNA fragment (500 bp) derived from these molecules is tagged with a 16 bp long barcode. Due to the random partition of molecules, the likelihood of assigning the same barcode to two molecules from nearby regions in the genome is very low [12]. Thus, linked-read sequencing provides long-range information like long-read sequencing [19] and shares the high accuracy and low costs of short-read sequencing [51]. However, compared to the established SV detection based on short-read sequencing, fewer approaches have been described and evaluated for linked-read based SV calling.
Eight linked-read SV callers have been described to date, namely LongRanger [58], GROC-SVs [46], NAIBR [12], ZoomX [55], LinkedSV [14], Novel-X [35], VALOR2 [24], and LEVIATHAN [37]. LongRanger identifies paired-end reads with overlapping barcodes between distant loci. GROC-SVs works similarly to LongRanger with the addition of SV reconstruction using local assemblies. NAIBR exploits discordant paired-end read and split molecule signals in a probabilistic model. ZoomX uses molecule coverage to identify large genomic rearrangements in the human genome. LinkedSV uses short-read signals such as read depth, discordance of paired-end reads, and local assembly to detect short deletions. In addition, this tool uses fragments with shared barcodes between two genomic locations and enriched fragment endpoints near break points to detect larger SV [14]. Novel-X assembles unmapped reads associated with barcodes and maps the resulting contigs to the reference sequence. VALOR2 identifies submolecules using split molecule signals based on barcode information and filters SV candidates using read depth and paired-end read signals. LEVIATHAN first identifies the number of shared barcodes in specific regions; discordant paired-end and split read signals are then used to filter SV candidates (for review see [19]).
With the exception of LEVIATHAN, all of the above mentioned SV callers have so far only been evaluated for SV detection in the human genome. LEVIATHAN was also evaluated for SV detection in the butterfly (H. numata) genome [37]. To our knowledge, no study is available where SV detection using linked-read sequencing is evaluated for plant species despite the differences between the plant and human genome with respect to genome size, repeat content, or ploidy. Furthermore, no earlier study evaluated SV calling for an autotetraploid genome or examined the effect of the haplotype incidence (HI) on SV detection. Additionally, the detection of SV in the tetraploid potato genome is of high interest, due to the potential usage of SV as genetic markers in genome-wide association studies [36] or genomic selection [47,54] to increase the gain of selection in this important crop species.
Therefore, the objectives of our study were to i) compare the performance of SV callers based on linked-read sequencing to that of short-read sequencing, ii) examine the influence of SV type, SV length, HI, as well as sequencing coverage on the SV calling performance in the tetraploid potato genome, and iii) evaluate the accuracy of detecting insertions by linked-read compared to short-read sequencing.

Simulation preparation and genome mutation
We used Mutation-Simulator (version 2.0.3) [30] to simulate deletions, duplications, inversions, and insertions in the first and second chromosome of the dAg1_v1.0 potato reference sequence [15] which is a consensus sequence of the two haplotypes of a diploid clone derived from the commercially important potato variety Agria. We considered five SV length categories for each of the above mentioned SV types (A: 50-300 bp; B: 0.3-5 kb; C: 5-50 kb; D: 50-250 kb; E: 0.25-1 Mb). Mutation-Simulator was used with the mutation rates of 7.0 × 10^-6 (~ 800-1000 SV) for the SV length categories A - C, 7.0 × 10^-7 (~ 90 SV) for D, and 3.5 × 10^-7 (~ 45 SV) for E.
In a first step, simulations on a homozygous level were performed where the SV were present in all four haplotypes (4/4) of the simulated potato genome. In addition to the homozygous level, we simulated heterozygous SV with HIs of one to three (if SV occurs in one, two, or three haplotypes). To do this, a custom python script was used to prepare heterozygous SV for simulations, where the SV was only present in one of the four haplotypes (1/4). Which of the four haplotypes received the SV was randomly determined for each SV. The same procedure was used to simulate SV in two out of four (2/4) as well as three out of four (3/4) haplotypes. For each heterozygous SV simulation, the total number of simulated SV corresponded to that of the above described homozygous simulation of the specific SV type and SV length category combination. Simulations for each SV type * SV length category * HI combination were replicated five times.
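The random assignment of SV to haplotypes described above can be illustrated with a short sketch. This is not the custom script used in the study; the function name, the SV identifiers, and the use of a fixed seed are assumptions made for illustration only.

```python
import random

PLOIDY = 4  # tetraploid potato: four haplotypes, indexed 0-3


def assign_haplotypes(sv_ids, incidence, seed=None):
    """Randomly pick, for every SV, which `incidence` haplotypes carry it."""
    if not 1 <= incidence <= PLOIDY:
        raise ValueError("haplotype incidence must be between 1 and 4")
    rng = random.Random(seed)
    assignment = {}
    for sv_id in sv_ids:
        # sample without replacement so a haplotype is never chosen twice
        assignment[sv_id] = sorted(rng.sample(range(PLOIDY), incidence))
    return assignment


# Hypothetical SV identifiers; HI 2/4 means two of the four haplotypes carry the SV.
sv_ids = [f"DEL_{i}" for i in range(5)]
hi_2_of_4 = assign_haplotypes(sv_ids, incidence=2, seed=1)
hi_4_of_4 = assign_haplotypes(sv_ids, incidence=4, seed=1)
```

With an incidence of four, every SV is assigned to all haplotypes, which corresponds to the homozygous (4/4) case.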
In addition to the simple simulations explained above, where the SV types, SV length categories, and HIs were simulated separately, we performed complex simulations (Fig. 1). In these complex simulations, different SV types, SV length categories, and HIs were simulated together to mimic more closely experimental potato genome sequences. Additionally, 80,000 single nucleotide variants (SNV) and 600 short insertions and deletions (INDELs, 2-49 bp) were included. The numbers of SV for each SV type (464 deletions, 464 insertions, 124 duplications, 108 inversions) and SV length category were chosen based on the average number of SV observed in experimental data for 100 tetraploid potato clones (Baig et al., in preparation). For each SV type and SV length category, 25% of SV were simulated for each of the four different HIs. The complex simulations were replicated 20 times.
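As a small worked example of the 25% split, the sketch below distributes the stated number of SV per SV type evenly over the four HIs. It is a simplification: the study applied the split per SV type and SV length category, but the per-category counts are not given here, so only the per-type totals are used. The function name is hypothetical.

```python
# SV counts per type as stated for the complex simulations
SV_COUNTS = {"deletion": 464, "insertion": 464, "duplication": 124, "inversion": 108}
HIS = ("1/4", "2/4", "3/4", "4/4")


def split_by_hi(counts, his=HIS):
    """Distribute each SV count as evenly as possible over the haplotype incidences."""
    result = {}
    for sv_type, n in counts.items():
        base, rest = divmod(n, len(his))
        # any remainder is handed to the first incidences, one SV each
        result[sv_type] = {hi: base + (1 if i < rest else 0) for i, hi in enumerate(his)}
    return result


per_hi = split_by_hi(SV_COUNTS)
total_sv = sum(SV_COUNTS.values())
```

All four totals are divisible by four, so each HI receives exactly a quarter of the SV of each type.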

Linked-read simulation and mapping
LRSim (version 1.0) [33] was used to simulate linked-reads with the following parameters (-f 50 -t 20 -m 10) with a sequencing coverage of 45x, 90x, 135x, and 180x resulting in a sequencing coverage per haplotype of about 11x, 22x, 34x, and 45x, respectively. The mean molecule size was set to 50 kb, the molecules per partition to 10 and the number of partitions to 20,000 as recommended by Luo et al. [33] for Arabidopsis thaliana, which has a similar genome size as the first two chromosomes of the dAg1_v1.0 reference sequence [15]. Linked-reads were mapped against the non-mutated dAg1_v1.0 reference sequence with LongRanger wgs (version 2.2.2).
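The per-haplotype figures follow from dividing the total coverage by the ploidy. A minimal sketch, assuming reads are spread evenly over the four haplotypes (the function name is illustrative):

```python
PLOIDY = 4  # tetraploid potato


def coverage_per_haplotype(total_coverage, ploidy=PLOIDY):
    """Average coverage per haplotype when reads are spread evenly over haplotypes."""
    return total_coverage / ploidy


per_haplotype = {total: coverage_per_haplotype(total) for total in (45, 90, 135, 180)}
for total, value in per_haplotype.items():
    print(f"{total}x total -> {value:.2f}x per haplotype")
```

The exact quotients are 11.25x, 22.5x, 33.75x, and 45x, which the text reports as approximate values.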
In the next step, the detected SV were filtered. A SV call was only kept if it passed the built-in filters of the respective SV caller. SV calls which were annotated as "BND" were filtered out. SV calls which covered regions in the reference sequence consisting of N's were filtered out as well. Additionally, for some SV callers additional filter criteria were applied: for LongRanger, SV calls with the annotation "UNK", which is defined as unknown SV type, were not considered. Additionally, for LinkedSV and Manta where each inversion was called twice, only one inversion entry was kept to avoid incorrect statistics. For NAIBR, the orientation of novel adjacencies was used as SV type annotation.
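The filter rules above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: calls are represented as plain dictionaries, a call is taken to have passed the built-in filters when its `filter` field equals "PASS", and the two entries of a doubly reported inversion are assumed to share identical coordinates. All field names and example records are hypothetical.

```python
def overlaps(start, end, regions):
    """True if the interval [start, end] overlaps any (region_start, region_end)."""
    return any(start <= r_end and r_start <= end for r_start, r_end in regions)


def filter_calls(calls, n_regions):
    """Apply the filter rules from the text to a list of SV call records."""
    kept, seen_inversions = [], set()
    for call in calls:
        if call["filter"] != "PASS":          # assumed encoding of built-in filters
            continue
        if call["svtype"] == "BND":           # break-end annotations are dropped
            continue
        if call["caller"] == "LongRanger" and call["svtype"] == "UNK":
            continue                          # unknown SV type
        if overlaps(call["start"], call["end"], n_regions.get(call["chrom"], [])):
            continue                          # call covers a stretch of N's
        if call["caller"] in ("LinkedSV", "Manta") and call["svtype"] == "INV":
            key = (call["caller"], call["chrom"], call["start"], call["end"])
            if key in seen_inversions:        # second entry of the same inversion
                continue
            seen_inversions.add(key)
        kept.append(call)
    return kept


def make(caller, svtype, start, end, flt="PASS"):
    return {"caller": caller, "svtype": svtype, "chrom": "chr01",
            "start": start, "end": end, "filter": flt}


n_regions = {"chr01": [(1000, 2000)]}         # hypothetical N stretch
calls = [
    make("Manta", "DEL", 100, 400),
    make("Manta", "BND", 100, 400),
    make("LongRanger", "UNK", 3000, 60000),
    make("NAIBR", "DEL", 950, 1500),
    make("Manta", "INV", 5000, 9000),
    make("Manta", "INV", 5000, 9000),
    make("LinkedSV", "DEL", 20000, 20300, flt="LowQual"),
]
kept = filter_calls(calls, n_regions)
```

In this toy input only the first Manta deletion and one copy of the Manta inversion remain.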
M. Weisweiler and B. Stich

Evaluation of SV calling
We calculated the sensitivity (1), which is also called statistical power in other studies, the precision (2), which corresponds to 1 - false discovery rate, and the F1-score (harmonic mean of the precision and sensitivity) (3) as
Sensitivity = TP/(TP + FN) (1)
Precision = TP/(TP + FP) (2)
F1-score = 2 × Precision × Sensitivity/(Precision + Sensitivity) (3)
for all combinations of SV types * SV callers * HIs, where TP was the number of true positive SV, FP the number of false positive SV, and FN the number of false negative SV. Before calculating the above described evaluation criteria, the break point resolution (BPR) for each SV length category was estimated for all SV callers based on 135x sequencing coverage for all SV types. Based on this analysis, the following BPR thresholds were chosen to allow a fair comparison between the SV callers (Supplementary Table S1). For SV length category A, a TP SV had break points that did not differ by more than 10 bp from those of the simulated SV and the SV length did not differ by more than 10 bp. For the SV length category B, a TP SV had break points and length differences compared to the simulated SV of ≤ 50 bp. For the SV length category C, a TP SV had break points and length differences compared to the simulated SV of ≤ 160 bp. For duplications of the SV length categories D and E, a TP SV had break points and length differences compared to the simulated SV of ≤ 250 bp. For deletions and inversions of the SV length category D, ≤ 550 bp and ≤ 800 bp were chosen as threshold, respectively. For deletions and inversions of the SV length category E, ≤ 250 and ≤ 550 bp were used, respectively. For insertions, the start of a TP insertion had a break point that differed by ≤ 10 bp from the start of the simulated insertion to allow a fair comparison between Manta and Novel-X due to the absence of an insertion length for Manta. Additionally, for Novel-X, called insertions were also evaluated considering two break points as it was done for deletions to determine the precision of the detected insertion length. The sequence similarity between detected and simulated insertions was evaluated. This was realized by pairwise alignments using stretcher from the EMBOSS package (version 6.6.0.0) [42].
Fig. 1. Overview of the workflow of this study including used bioinformatic tools (left) in the simple (center) and complex (right) simulations. Detailed information of the workflow can be found in the material and methods section and at https://github.com/mw-qggp/SV_simulation_potato.
For each TP SV, the called SV had to be annotated as the considered SV type. For deletions and duplications called by LEVIATHAN, the SV type annotation was ignored in a second evaluation (LEVIATHAN (IG)), because pre-simulations have shown that a bug in the algorithm of LEVIATHAN makes it difficult to distinguish between deletions and duplications. To determine the final sensitivity and precision values, as well as the final F1-scores for the simple and complex simulation scenarios, the median across the five (simple) as well as 20 (complex) replications was calculated. In contrast to the simple simulations, we only evaluated the performance of SV callers for the SV length categories C, D, and E for the complex simulations. For the detection of insertions in the complex simulations, all SV length categories were evaluated together because detected insertions could not be separated by the SV length category for Manta.
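The evaluation described in this section can be sketched as follows. This is a minimal illustration, not the evaluation code of the study: the greedy one-to-one matching of calls to simulated SV, the dictionary layout, and all example values are assumptions. The break point thresholds are those stated in the text for deletions, inversions, and duplications; insertions, which were evaluated on the start position only, are not covered.

```python
from statistics import median


def bpr_threshold(svtype, category):
    """Break point and length tolerance (bp) per SV type and length category."""
    if category in ("A", "B", "C"):
        return {"A": 10, "B": 50, "C": 160}[category]
    table = {("DUP", "D"): 250, ("DUP", "E"): 250,
             ("DEL", "D"): 550, ("INV", "D"): 800,
             ("DEL", "E"): 250, ("INV", "E"): 550}
    return table[(svtype, category)]


def is_match(call, truth, max_diff):
    """Same type and chromosome; both break points and the length within max_diff bp."""
    call_len = call["end"] - call["start"]
    truth_len = truth["end"] - truth["start"]
    return (call["svtype"] == truth["svtype"]
            and call["chrom"] == truth["chrom"]
            and abs(call["start"] - truth["start"]) <= max_diff
            and abs(call["end"] - truth["end"]) <= max_diff
            and abs(call_len - truth_len) <= max_diff)


def evaluate(calls, truths, max_diff):
    """Return (sensitivity, precision, F1-score) following equations (1)-(3)."""
    used, tp = set(), 0
    for call in calls:
        for i, truth in enumerate(truths):
            if i not in used and is_match(call, truth, max_diff):
                used.add(i)   # each simulated SV can be matched only once
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(truths) - tp
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, precision, f1


# Hypothetical toy data: two simulated deletions of category A and two calls.
truths = [{"svtype": "DEL", "chrom": "chr01", "start": 1000, "end": 1200},
          {"svtype": "DEL", "chrom": "chr01", "start": 5000, "end": 5150}]
calls = [{"svtype": "DEL", "chrom": "chr01", "start": 1004, "end": 1198},
         {"svtype": "DEL", "chrom": "chr01", "start": 9000, "end": 9100}]
sens, prec, f1 = evaluate(calls, truths, bpr_threshold("DEL", "A"))

# Final value per scenario: median across replicates (hypothetical F1-scores).
final_f1 = median([0.90, 0.95, 1.00, 0.85, 0.92])
```

In the toy data one call matches a simulated deletion within 10 bp and the other matches nothing, giving one TP, one FP, and one FN.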

Results
Six linked-read and one short-read SV caller (Table 1) were evaluated based on linked-read sequencing with respect to their precision, sensitivity, and F1-score to detect different SV types with different SV lengths and HIs in the tetraploid potato genome using computer simulations.

BPR of SV callers
In a first step, the BPR of each SV caller was determined for the detection of homozygous (4/4) deletions (insertions for Novel-X) for each SV length category based on a 135x sequencing coverage. Deletions were chosen as SV type and 135x as sequencing coverage, because all SV callers, except VALOR2 and LEVIATHAN, have been developed to detect deletions of all SV length categories.
We observed considerable differences among the BPR of the different SV callers (Fig. 2). Across all examined SV length categories, Manta and LEVIATHAN reached the maximum precision of SV detection with the highest BPR of ≤ 10 bp. In contrast, the BPR of LongRanger and VALOR2 were the lowest.
The trends observed for the BPR of the other SV types corresponded well to those observed for deletions (Supplementary Figs. S1, S2). The main exception was VALOR2, where a BPR was observed for large inversions that was even lower than the BPR of deletions.

SV detection for different SV length categories
First, we focused on the detection of SV based on a sequencing coverage of 135x which corresponds to that of an experimental study with about 100 tetraploid potato clones (Baig et al., in preparation).
All SV callers, except Novel-X, were able to detect deletions for at least one SV length category. For the SV length categories A and B, the highest F1-scores averaged across the four HIs (hereafter designated as average F1-score) were observed for Manta with 98.3% followed closely by LinkedSV (short) (95.9%, 95.6%, Fig. 3
The performance of detecting inversions showed a similar trend to that observed for deletions. For the SV length categories A and B, the short-read SV caller Manta performed well with high average F1-scores (90.0%, 98.9%) (Supplementary Fig. S3 III, Supplementary Tables S11, S12), whereas linked-read SV callers, especially LEVIATHAN (91.4%), showed high average F1-scores for larger inversions of the SV length category C. Additionally, the average precision values were very high for LinkedSV (99.4%) and NAIBR (98.3%) (Supplementary Table S13). An even better performance of linked-read SV callers was observed for the SV length categories D and E (Supplementary Tables S14, S15), especially for NAIBR and LEVIATHAN.
With the exception of VALOR2, the same SV callers which could detect inversions were able to detect duplications. As it was observed for deletions and inversions, Manta was the best SV caller to identify duplications for the SV length category A with an average F1-score of 66.2% (Supplementary Fig. S4 III), which was considerably lower than the values for calling deletions (98.3%) and inversions (90.0%). This is caused by a low sensitivity (58.6%) rather than by a low precision (82.2%) (Supplementary Table S16). LEVIATHAN (IG) was the only linked-read SV caller which could detect duplications of the SV length category B, but its average F1-score, sensitivity, and precision values (6.4%, 3.5%, and 52.6%, respectively) were considerably lower than those observed for Manta (97.7%, 95.7%, 99.8%) (Supplementary Table S17). For the SV length category C, Manta performed well with an average F1-score of 97.2%. LEVIATHAN (IG) followed slightly behind with an average F1-score of 84.4%. LongRanger showed a considerably lower F1-score of 34.4% because of the low sensitivity (21.8%) (Supplementary Table S18). In contrast to the SV length category C, NAIBR and LinkedSV were able to detect duplications of the SV length category D (Supplementary Table S19). Manta, NAIBR, and LongRanger performed well with average F1-scores ranging from 88.9 to 92.6%. For the SV length category E (Supplementary Table S20), the highest average F1-scores were observed for Manta (85.2%) and NAIBR (85.3%).
Fig. 4. F1-score, which is the harmonic mean of the precision and sensitivity, observed in the simple simulations, for the detection of insertions of the five structural variant (SV) length categories.
Manta and Novel-X were the only two SV callers that were able to detect insertions. Manta as short-read SV caller could detect the break point of the insertion start position but could not assemble the inserted sequence. Therefore, the performance of Manta and Novel-X was compared based on the detection of one break point at the insertion start position. For the SV length category A, Manta showed considerably higher F1-scores (94.5-99.5%) for all four HIs compared to Novel-X (45.7-87.6%) (Fig. 4 III). The precision of Novel-X to detect insertions of the SV length category A was high, with values between 98.2 and 98.9%, but the sensitivity was low (29.6-78.7%) (Supplementary Table S21). For the SV length categories B and C, Novel-X performed with F1-scores between 97.3 and 98.6% better than Manta (86.7-99.2%) for almost all four HIs. In addition to the comparison of Manta and Novel-X, the performance of Novel-X was also evaluated as it was done before for the other SV types to determine the precision to assemble the inserted sequence. With the exception of the SV length category E, the evaluation of Novel-X based on two break points has shown similar F1-scores compared to the evaluation based on only one break point (Supplementary Tables S22-S25).

SV detection based on different sequencing coverages
Apart from the influence of the SV type and SV length on the SV calling performance, we examined the influence of the sequencing coverage. To do so, four different sequencing coverages, namely 45x, 90x, 135x, and 180x were considered.
The performance of the short-read SV callers in detecting deletions increased with increasing sequencing coverage (Fig. 3, Supplementary Tables S6 - S10). This was especially true for the detection of deletions of the SV length categories A and B. For example, the F1-score of Manta increased from 81.1% (45x) to 98.1% (180x) for the detection of deletions of the SV length category A and the HI 1/4. The difference for this scenario was even higher for LinkedSV (short) with an increase of 50.3%. This strong influence of the sequencing coverage on the F1-score was not observed for the detection of inversions and duplications of the SV length categories A and B.
Linked-read SV callers, especially NAIBR and LinkedSV (linked), performed more independently from the sequencing coverage than short-read SV callers. The only exception was the detection of insertions. The average F1-scores of Novel-X increased considerably with an increasing coverage.

SV detection assuming different HIs
We also examined the role of HIs on the performance of SV detection. In most of the simulation scenarios, a higher F1-score was observed for the simulations of the HI 1/4 and 4/4 compared to 2/4 and 3/4 scenarios. This was especially true for the SV length categories D and E for all SV types and for the SV callers Manta and NAIBR. Exceptions to this trend were the performance of LinkedSV (linked) and LEVIATHAN (IG) for the detection of deletions and duplications of the HI 1/4 and NAIBR for the detection of deletions and inversions of the SV length category C. Further, Novel-X showed a higher F1-score to detect insertions of the SV length category A for the HI 2/4 and 4/4 compared to 1/4 and 3/4.
Interestingly, the performance of VALOR2 was more independent from the HI compared to the other SV callers.

Uniquely detected SV by different SV callers
In addition to considering all simulated SV for the evaluations, we also performed evaluations of the SV that were uniquely detected by one SV caller. Manta showed a high number of uniquely detected SV compared to the linked-read SV callers (Fig. 5). Additionally, the total number of detected SV was also the highest for Manta compared to all other SV callers. The uniquely detected SV by Manta had high precisions between 95% and 100% for the different SV types (Fig. 6). In addition, high median values were also observed for LinkedSV (short) for deletions (87.5%) and for Novel-X for insertions (87.6%). The precisions of the uniquely detected SV for the other linked-read SV callers were considerably lower, with values below 20%, but their number was also much lower, with values between one and 20, compared to those of Manta.
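One way to identify uniquely detected SV and compute their precision is sketched below. How calls were compared between SV callers is not stated in the text, so the rule used here (same SV type and chromosome, both break points within a tolerance) is an assumption, as are the function names, the tolerance value, and the example records.

```python
def same_event(a, b, tol):
    """Assumed rule: same type and chromosome, both break points within tol bp."""
    return (a["svtype"] == b["svtype"] and a["chrom"] == b["chrom"]
            and abs(a["start"] - b["start"]) <= tol
            and abs(a["end"] - b["end"]) <= tol)


def unique_calls(calls_by_caller, tol):
    """Calls of each caller that no other caller reports."""
    unique = {}
    for caller, calls in calls_by_caller.items():
        others = [c for other, cs in calls_by_caller.items()
                  if other != caller for c in cs]
        unique[caller] = [c for c in calls
                          if not any(same_event(c, o, tol) for o in others)]
    return unique


def precision(calls, truths, tol):
    """Share of calls that match a simulated SV; None if there are no calls."""
    if not calls:
        return None
    tp = sum(any(same_event(c, t, tol) for t in truths) for c in calls)
    return tp / len(calls)


def sv(start, end, svtype="DEL"):
    return {"svtype": svtype, "chrom": "chr01", "start": start, "end": end}


truths = [sv(1000, 7000), sv(20000, 20200)]          # hypothetical simulated SV
calls_by_caller = {
    "Manta": [sv(1002, 7001), sv(20001, 20199)],     # second call is unique and true
    "NAIBR": [sv(1005, 6995), sv(50000, 58000)],     # second call is unique and false
}
unique = unique_calls(calls_by_caller, tol=10)
unique_precision = {caller: precision(calls, truths, tol=10)
                    for caller, calls in unique.items()}
```

In this toy example the shared deletion is excluded for both callers, leaving one unique call each.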

Evaluation of SV detection using complex simulations
In addition to the simple simulations explained before, where the combinations of SV types, SV length categories, as well as HIs were simulated separately, we performed complex simulations including all features of the simple simulations together to mimic experimental potato genome sequencing data.
In general, the F1-scores observed in the complex simulations agreed well with the results of the simple simulations (Supplementary Tables S2 - S5). For the detection of the different SV types, Manta and NAIBR showed sensitivity and precision values up to 100.0% for most of the SV length categories for all sequencing coverages. In contrast to the simple simulations, LongRanger (linked) showed lower sensitivity values for the detection of larger deletions.

Discussion
Due to tremendous improvements of sequencing technologies and bioinformatic tools, genome-wide SV detection became possible [19]. Algorithms based on short-read and long-read sequencing were developed to detect SV. However, despite well established SV detection based on short-read sequencing in the human genome [4,29], low precision and a lack of detecting large SV as well as assembling insertions were reported [5,19,22,35]. In contrast, SV calling based on long-read sequencing overcomes these issues but results in higher operational costs, large DNA input requirement, as well as lower sample throughput [19]. We therefore benchmarked in a plant genome context SV callers which were developed to detect SV based on linked-read sequencing, as the latter has the potential to exploit signals of short-read sequencing and long-range information [12]. Despite 10x Genomics having discontinued its support for linked-read sequencing, many current studies are available in which 10x linked-reads are used [45,49]. More importantly, linked-read sequencing is still offered by BGI as single tube long fragment reads (stLFR) [50]. Two previously described linked-read SV callers were not considered in our study, due to discontinued support and algorithm similarity to LongRanger (GROC-SVs) [46] or the functional restriction to human genomes (ZoomX) [55].
Fig. 5. Number of SV calls shared among SV callers where SV calls across all SV types, SV length categories, haplotype incidences, sequencing coverages, and repetitions were considered.

Simple vs. complex simulations
In general, the high sensitivity and precision values observed in the simple simulations were confirmed by the complex simulations. Therefore, only the results of the simple simulations are discussed in the following sections. Furthermore, in both simple and complex simulations, maximum precision values of 100% were frequently observed for all SV types and SV length categories. This finding suggests that different SV types and SV lengths occurring simultaneously do not have a negative influence on the detection of each other. Therefore, the high precision values observed in our complex simulations can also be expected in experimental data of tetraploid potato varieties.

SV detection based on short-read vs. linked-read signals
The linked-read sequencing data simulated in our study can be used to evaluate SV detection based on short-read and linked-read signals. In contrast to using linked-read SV callers, linked-read signals are, except for the mapping of the reads, simply not considered by the short-read SV callers to call SV. Therefore, we used Manta as short-read SV caller to evaluate the detection of SV using short-read signals based on linked-read mapping.
We observed high precision and sensitivity values for the SV detection using the short-read SV callers Manta and LinkedSV (short) (Fig. 3, Supplementary Tables S6 - S10). Our observations are supported by recent comprehensive SV calling evaluation studies in humans [4,29]. However, our figures are in contrast to the low precision of around 15% and sensitivity values between 30 and 70% which have been frequently reported for the detection of SV based on short-read sequencing in the context of the human genome [13,44,45,48]. One reason might be that the latter studies evaluated SV callers that were developed ten years ago such as Pindel [7] or BreakDancer [1]. The latter SV callers only exploit one single short-read signal whereas the tools available nowadays use a combination of read depth, paired-end reads, and split reads to increase the sensitivity and precision [52]. An additional reason for the high precision and sensitivity observed in our study might be the improved accuracy of read mapping by considering the linked-read information for that step of the analysis [34].
Fig. 6. Precision of uniquely detected SV by each SV caller. Only SV types and SV length categories of all scenarios (haplotype incidences, sequencing coverages, repetitions) for the simple simulations were considered which can be detected by the particular SV caller.
In our study, the F1-score of the short-read SV caller Manta was always equal or higher compared to the linked-read SV callers NAIBR or LinkedSV, caused by a lower sensitivity of the linked-read SV callers. In contrast, the precision was high for short- and linked-read SV callers. However, only Manta, LinkedSV (short), and Novel-X showed high precision values for uniquely detected SV (Figs. 5, 6). The low precision values of uniquely detected SV for linked-read SV callers are due to the low number of uniquely detected SV (Fig. 6) by those. In contrast, the high precision of linked-read SV callers considering all simulated SV can be explained by the usage of short-read signals and barcode information which was also previously reported in human [45]. Due to the usage of additional information provided by linked-read sequencing, linked-read SV callers should be able to increase the sensitivity. However, the lower sensitivity of linked-read SV callers compared to Manta indicates that linked-read SV callers cannot use all information provided by linked-read sequencing. A reason for this might be the relatively recent history and the corresponding low level of elaboration of linked-read compared to short-read SV calling algorithms [45]. In contrast, Fang et al. [14] compared the performance of linked-read SV callers to the short-read SV callers Lumpy [31] and Delly [41] and showed that the F1-score was higher for NAIBR and LinkedSV than for Delly and Lumpy. This observation can be explained by the fact that Manta showed a better performance in detecting SV in human [4,29] and barley [52] than Delly and Lumpy. However, our finding indicates that further improvements are possible for linked-read SV callers. Furthermore, the combination of short-read signals and long-range information based on molecule signals is expected to increase the precision of SV detection. Therefore, until improved linked-read SV callers are available, we suggest the combined usage of both short-read and linked-read SV callers, based on linked-read sequencing data, to maximize the sensitivity while retaining a high precision.

Influence of SV length on SV detection and performance of SV callers
In order to properly interpret the observed numbers of detected SV of different SV lengths and SV types in experimental studies, detailed knowledge about the sensitivity and precision of SV callers for different SV length categories is required.
Except for insertions, linked-read SV callers were not able to detect SV of the SV length category A (50-300 bp) and B (0.3-5 kb) or the performance was on a low level (e.g. LEVIATHAN) (Fig. 3, Supplementary Fig. S3, S4). In contrast, Manta as short-read SV caller as well as the short-read algorithm of LinkedSV performed well for these SV length categories. The examined linked-read SV callers were developed for the detection of large SV (≥ 10 kb) [14,58] and the focus was not on the detection of short SV. However, NAIBR and LEVIATHAN were able to detect SV between 1 and 5 kb in the human genome, but they showed a low sensitivity [12,37]. This finding is in agreement with our results for LEVIATHAN. The reason for the discrepancy of SV detection by NAIBR remains elusive. An obvious reason for the low performance of linked-read SV callers to detect short SV in our study is that the principle of SV detection based on linked-read barcode information is not suitable here. The specific signals of linked-read SV calling such as overlapping barcodes or split molecules cannot be used because of the short distance between the two break points of a short SV. Therefore, these SV can only be detected based on short-read signals such as discordant paired-end reads, split reads, or unusual read depth.
The sensitivity and precision of the examined linked-read SV callers to detect SV of the SV length categories C - E (5 kb - 1 Mb) for all SV types were considerably higher compared to the SV length categories A and B (Supplementary Tables S6 - S25). In addition, Manta also performed well for large SV for all SV types. Our results were supported by a previous study in human, where a high precision of NAIBR and LinkedSV and a considerably lower precision of LongRanger for the detection of large SV was reported [14]. The high precision values to detect large deletions and inversions in the human genome reported for VALOR2 [24] could be supported by our results as well (Supplementary Tables S9, S10, S14, S15). However, these come at the cost of a lower sensitivity and a considerably lower BPR compared to that of the other SV callers (Fig. 2, Supplementary Fig. S1, S2).

Influence of sequencing coverage on SV detection
First, we assessed the influence of the sequencing coverage on the performance of short-read algorithms based on linked-read sequencing data. The strongest differences were observed for calling deletions of the SV length category A (Fig. 3, Supplementary Table S6) when increasing the sequencing coverage from 45x (~11x per haplotype in potato) to 90x (~22x per haplotype), where the sensitivity increased by 23.3% for Manta and 45.6% for LinkedSV (short). This trend was also observed for the other SV length categories, albeit less pronounced. Further, the performance of short-read algorithms increased only marginally when increasing the sequencing coverage to 135x and 180x. Our observations are in accordance with the results of Cameron et al. [4], who reported a higher sensitivity for short-read SV callers at higher levels of sequencing coverage. In detail, these authors reported that the sensitivity increased only marginally above 30x (15x per haplotype), whereas it decreased considerably below 30x. These findings can be explained by the fact that short-read sequencing with higher coverage results in an increased number of short-read signals such as discordant paired-end and split reads [29]. This in turn results in a higher sensitivity.
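The per-haplotype coverages quoted above follow from dividing the total sequencing coverage by the ploidy. A minimal sketch of this arithmetic, with a hypothetical function name:

```python
def coverage_per_haplotype(total_coverage: float, ploidy: int = 4) -> float:
    """Average coverage per haplotype; ploidy 4 corresponds to tetraploid potato."""
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    return total_coverage / ploidy


if __name__ == "__main__":
    # Coverage levels examined in the study (tetraploid).
    for total in (45, 90, 135, 180):
        print(f"{total}x total -> {coverage_per_haplotype(total):.2f}x per haplotype")
    # Diploid human reference point from Cameron et al. [4].
    print(f"30x total -> {coverage_per_haplotype(30, ploidy=2):.2f}x per haplotype")
```

With these values, 90x in potato (22.5x per haplotype) lies above the 15x per haplotype threshold reported for human, whereas 45x (11.25x per haplotype) lies below it.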
In contrast to the SV detection based on short-read signals, the influence of sequencing coverage on the performance of linked-read SV callers seems to be marginal (Figs. 3, 4, Supplementary Fig. S3, S4). The good performance of linked-read SV callers independent of the sequencing coverage can be explained by additional signals contained in linked-read sequencing data sets, which are created during the library preparation process. When exploiting linked-read sequencing for SV detection, the vicinity of SV break points provides more signals due to the longer anchor sequences given by the molecule signals. In contrast, for short-read sequencing, only reads whose sequence covers the break points can be considered. Therefore, a reduction of the sequencing coverage results in fewer short-read signals, which has more severe consequences for SV detection based on short-read signals than for SV detection based on linked-read signals.
In contrast to the trend described above, we observed two exceptions where the sequencing coverage influenced the SV detection of linked-read SV callers. First, the detection of insertions by Novel-X is strongly influenced by the sequencing coverage (Fig. 4). An insufficient coverage leads to difficulties in reassembling the anchor sequences of the detected insertions, and thus the break points of the insertions cannot be determined [35]. Second, SV detection for the SV length category C of the HI 1/4 scenario by LEVIATHAN (IG) was strongly influenced by the sequencing coverage, e.g. for deletions (40.1%) (Supplementary Table S8) or inversions (20.4%) (Supplementary Table S13). An explanation for the weak performance of LEVIATHAN (IG) in calling SV for the HI 1/4 scenario at 45x sequencing coverage could be that, after considering the barcode information, short-read signals such as discordant paired-end or split reads are used to process candidate SV [37]. As explained above, such short-read signals benefit from an increased coverage.

Influence of HI on SV detection in a tetraploid genome
We examined the performance of SV callers using different HIs for the tetraploid potato genome.
As expected, the performance of all SV callers was better for simulation scenarios with a HI 4/4 than for the other HI scenarios. However, the observed performance for the HIs 2/4 and 3/4 was worse than that for the HIs 1/4 and 4/4 (Figs. 3, 4, Supplementary Fig. S3, S4). The reason for this observation remains elusive, and additional research is needed in the field of polyploid SV calling.
Approaches for SV genotyping based on short-read sequencing have been described for diploid genomes [18], even though SV genotyping is more complex [3] than the well-established SNV genotyping based on read depth signals [40]. Recently, it has been shown that SNV genotyping is more error-prone for polyploid than for diploid genomes, which calls for caution when interpreting polyploid genotype calls and for further improvements [10]. Considering the need for improvements of diploid SV genotyping [6,27] and the issues of polyploid SNV genotyping [10], polyploid SV genotyping will be one of the big challenges in crop research.
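Why read depth based genotyping is harder in a tetraploid than in a diploid genome can be illustrated with a deliberately naive sketch. It assigns the allele dosage whose expected variant read fraction (k/ploidy) is closest to the observed fraction. This is an illustrative assumption only and not the method of any of the cited genotypers, which use statistical models instead of simple rounding.

```python
def nearest_dosage(alt_reads: int, total_reads: int, ploidy: int = 4) -> int:
    """Naive dosage estimate: the k in 0..ploidy whose expected variant read
    fraction k/ploidy is closest to the observed fraction."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    fraction = alt_reads / total_reads
    return min(range(ploidy + 1), key=lambda k: abs(fraction - k / ploidy))


if __name__ == "__main__":
    # Hypothetical read counts at one variant site.
    alt, total = 7, 20  # observed variant read fraction of 0.35
    print("tetraploid dosage:", nearest_dosage(alt, total, ploidy=4))
    print("diploid dosage:   ", nearest_dosage(alt, total, ploidy=2))
```

In a tetraploid genome, neighbouring dosage classes differ by an expected read fraction of only 0.25, compared to 0.5 in a diploid genome, so the same sampling noise in read counts leads to misassigned dosages more often.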

Assembling insertions using linked-read sequencing
An obvious drawback of SV calling using short-read sequencing is the inability to detect larger insertions (≥ 0.3 kb) [20,25,26,43], caused by the limited anchor size due to the short insert size of the sequencing library and the corresponding incapacity to span larger repetitive regions in the genome [35]. Manta is able to determine the SV length for insertions up to ~1 kb. In contrast, SV calling based on linked-read sequencing can, in principle, detect larger insertions. To date, however, only one linked-read algorithm (Novel-X) has been developed for the detection of insertions.
As this algorithm revealed high sensitivity and precision values for detecting insertions in our study (Fig. 4, Supplementary Tables S21-S25), we evaluated the assembled length of the insertions. When considering both break points to determine the length of the insertions, high sensitivity and precision values were observed for Novel-X. Furthermore, we observed sequence similarities of 100% between five simulated and detected insertions for each SV length category. This observation is in accordance with Meleshko et al. [35], who reported similar values for the human genome. These observations illustrate the potential of linked-reads and especially of Novel-X to detect and assemble insertions.
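A comparison between a simulated and an assembled insertion sequence could be sketched as follows. How the similarity was computed in the study is not stated in this section; the sketch uses Python's standard-library `difflib.SequenceMatcher` as a simple stand-in for an alignment-based identity, and the sequences are hypothetical.

```python
from difflib import SequenceMatcher


def percent_similarity(simulated: str, assembled: str) -> float:
    """Similarity in percent, computed as 2 * matches / total length.
    Stand-in measure only; not an alignment-based sequence identity."""
    matcher = SequenceMatcher(None, simulated.upper(), assembled.upper(), autojunk=False)
    return 100.0 * matcher.ratio()


if __name__ == "__main__":
    # Hypothetical insertion sequences.
    simulated = "ACGTTGCAAGGCTTACGATC"
    assembled = "ACGTTGCAAGGCTTACGATC"
    print(f"{percent_similarity(simulated, assembled):.1f}%")
```

For real insertions of several kb, a dedicated pairwise aligner would be the more appropriate choice, as `SequenceMatcher` is neither designed for biological sequences nor efficient on long inputs.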

Computational performance of SV callers
To compare the computational performance of the different SV callers, we examined the resources needed by the SV callers at 180x sequencing coverage in the complex simulations for two potato chromosomes (Table 2). We observed a short CPU time and a low memory requirement for Manta compared to the considerably higher values for the linked-read SV callers. High memory peaks as observed for LEVIATHAN could lead to issues when SV calling is performed on a whole-genome level for species with large genomes.

Conclusion
We observed high precision and sensitivity values across different sequencing coverages for SV detection in the potato genome. Our observations highlighted the importance of the short-read signals used by Manta and LinkedSV to detect short SV, whereas Manta and NAIBR performed well for detecting larger deletions, inversions, and duplications. We illustrated that large insertions can be assembled by Novel-X using linked-read sequencing, which is thus superior to the detection of insertions based on short-read sequencing. The BPR was similar for the different SV types, with the highest BPR observed for Manta and LEVIATHAN. The HI influenced the performance of all SV callers, with the highest precision and sensitivity values observed for the HI 4/4 scenario. Finally, the short-read algorithms were more strongly influenced by the sequencing coverage than the linked-read SV callers, except for Novel-X, for which a sequencing coverage of at least about 22x per haplotype should be used to detect insertions.

Table 1
Properties of structural variant (SV) callers.
III), and with a considerable difference by LongRanger (short) (23.4%, 22.5%). Linked-read SV callers without an implemented short-read algorithm were not able to detect deletions of the SV length categories A and B (Supplementary Tables S6, S7). Larger deletions could be identified by linked-read SV callers (Supplementary Tables S8-S10).

Table 2
Resources used by SV callers in the case of 180x sequencing coverage and in complex simulations. For details, see Material and Methods.
M. Weisweiler and B. Stich