A new GTSeq resource to facilitate multijurisdictional research and management of walleye Sander vitreus

Abstract Conservation and management professionals often work across jurisdictional boundaries to identify broad ecological patterns. These collaborations help to protect populations whose distributions span political borders. One common limitation to multijurisdictional collaboration is consistency in data recording and reporting. This limitation can impact genetic research, which relies on data about specific markers in an organism's genome. Incomplete overlap of markers between separate studies can prevent direct comparisons of results. Standardized marker panels can reduce the impact of this issue and provide a common starting place for new research. Genotyping‐in‐thousands (GTSeq) is one approach used to create standardized marker panels for nonmodel organisms. Here, we describe the development, optimization, and early assessments of a new GTSeq panel for use with walleye (Sander vitreus) from the Great Lakes region of North America. High genome‐coverage sequencing conducted using RAD capture provided genotypes for thousands of single nucleotide polymorphisms (SNPs). From these markers, SNP and microhaplotype markers were chosen, which were informative for genetic stock identification (GSI) and kinship analysis. The final GTSeq panel contained 500 markers, including 197 microhaplotypes and 303 SNPs. Leave‐one‐out GSI simulations indicated that GSI accuracy should be greater than 80% in most jurisdictions. The false‐positive rates of parent‐offspring and full‐sibling kinship identification were found to be low. Finally, genotypes could be consistently scored among separate sequencing runs >94% of the time. Results indicate that the GTSeq panel that we developed should perform well for multijurisdictional walleye research throughout the Great Lakes region.


| INTRODUCTION
Effective conservation of biological diversity requires collaborative research to inform conservation or natural resource planning.
In many cases, this involves working across political boundaries and merging datasets generated in different laboratories to identify broad ecological patterns undetectable at a more regional scale (Jay et al., 2016;Margerum, 2008). Unfortunately, merging independent datasets is often impeded when studies do not share a common methodology (de Groot et al., 2015;Fairweather et al., 2018;Hunter et al., 2020). This can be an issue for genetic studies, which frequently generate marker sets de novo for each experiment (e.g., genotyping-by-sequencing; restriction site-associated DNA sequencing [RAD-seq]) or use laboratory-specific protocols or marker panels (e.g., microsatellite genotyping) that result in genotype scoring discrepancies when datasets are merged (Goh et al., 2017;Pasqualotto et al., 2007). Without standardized methods and marker panels, genetic data generated from independent laboratories can be difficult or impossible to merge, limiting opportunities for collaboration and hampering the incorporation of molecular resources into natural resource planning.
Establishing standardized marker panels is important because genetic data provide insight into population biology and connectivity, recruitment dynamics, assessments of historical demography, and population-specific mortality, which can take place across a large geographical area (Allendorf et al., 2010; Benestan et al., 2016). Therefore, collaboration among researchers is often necessary to extend population genetic research beyond a local scale (Ruzzante et al., 1999). Historically, standardized marker panels for nonmodel species have mostly included microsatellite panels, or more recently, TaqMan assays, which require extensive laboratory validation to ensure genotype accuracy (Ellis et al., 2011; Hui et al., 2008; Seeb et al., 2007). Data collected using these types of resources have enabled managers to work collaboratively to inform policies structured around a species or population boundary, rather than a political or jurisdictional boundary (Homola et al., 2019; White et al., 2021). The development of new marker panels for common study organisms that are less reliant on intensive laboratory validation than microsatellite panels could benefit many species.
Standardized resources may particularly benefit the conservation of mobile species that frequently cross political boundaries (e.g., border waters of the Laurentian Great Lakes [Hildebrand et al., 2002] or transboundary conservation regions such as the Kavango-Zambezi Transfrontier Conservation Area (KAZA) in Africa or the Amazon River basin in South America; Mena et al., 2020; Stoldt et al., 2020). Species in these transboundary regions are often managed by multiple agencies that conduct research separately but must work collaboratively to protect the entire population. Sequencing-based genotyping panels, such as genotyping-in-thousands (GTSeq), are becoming an increasingly accessible approach for nonmodel organisms (Campbell et al., 2015; Meek & Larson, 2019). Because this approach uses DNA sequencing, which provides exact nucleotide arrangements, the resulting genotypes can be more easily and consistently compared among studies than other PCR-based assays.
Other approaches such as microsatellite DNA markers, which require manual allele calling, are more vulnerable to human error and laboratory variability, making inter-laboratory comparisons more difficult. The adoption of amplicon sequencing panels by laboratories with a purview of conducting research in major transboundary regions can help to facilitate collaboration and generate data that can be used for large-scale meta-analyses or long-term monitoring of population dynamics and genetic diversity (Hayward et al., 2022; McCane et al., 2018). However, published GTSeq panels are still unavailable for most species and can be time-consuming to develop and implement.
Many of the developed GTSeq panels are for species of fisheries management interest, such as Pacific salmon (e.g., Chang et al., 2021; McKinney et al., 2020) and trout (Bohling et al., 2021). Another species with a recently developed GTSeq panel is walleye (Sander vitreus; Bootsma et al., 2020). Walleye is a highly mobile predatory species of fish native to North America, with an expansive endemic range spanning most of the United States and Canada (Figure S1-S8; Billington et al., 2011). There are many applications for a genetic panel for walleye, including tracking hatchery outplants, genetic-informed domestication of aquaculture strains, population genetics, and genetic stock identification (GSI) of natural populations. The GTSeq panel developed by Bootsma et al. (2020) was created specifically for walleye in inland lakes in the Mississippi River basin of Wisconsin and Minnesota (Bootsma et al., 2020). However, allele frequencies and genetic diversity differ between Mississippi River basin and Great Lakes walleye populations. Therefore, there has been some concern that an additional marker panel may be necessary to inform the conservation and management of walleye populations with broader Great Lakes ancestry. Walleye stocks support extensive recreational and commercial harvest managed by numerous First Nation and tribal communities, Canadian provincial agencies, and eight American states surrounding the Great Lakes. Walleye can swim hundreds of kilometers per year, which means that walleye produced in one jurisdiction contribute to fishing opportunities in other jurisdictions (Brenden et al., 2015; Hayden et al., 2014; Matley et al., 2020). With so many sources of walleye recruitment and mortality, tracking walleye productivity in the Great Lakes has been a priority (Wills et al., 2020).
Genetics is one effective method to track walleye productivity and stock connectivity; however, previous work has relied on microsatellite panels or large single-use genotyping-by-sequencing studies (Chen, Euclide, et al., 2020; Garner et al., 2013), neither of which provides the compositional consistency necessary to merge datasets produced in different laboratories.
Here we describe the multi-omic development and outline applications of a new GTSeq panel developed from 29 walleye spawning populations in the Great Lakes. The objectives of our study were to: (1) develop a general-use GTSeq panel that includes genetic diversity from major walleye stocks in state, provincial, and tribal management jurisdictions in the Great Lakes, (2) evaluate the effectiveness of the panel to conduct mixed-stock analysis and pedigree/kinship analysis throughout the Great Lakes and within each lake, and (3) quantify genotype call variation among laboratories.

| Study system and genetic diversity survey
The Laurentian Great Lakes are centrally located in the walleye species range and contain numerous interconnected stocks of walleye that colonized the lakes following the last ice age from three different glacial refugia: the Mississippian, Atlantic, and Missourian (Stepien et al., 2009; Stepien & Faber, 1998). Walleye spawn on rocky reefs and in rivers throughout all five of the Great Lakes and are believed to exhibit moderate to strong natal spawning site fidelity. Regionally, walleye spawning stocks range from highly productive naturally reproducing stocks, such as those in the West Basin of Lake Erie, to recovering or recovered stocks supported by fish hatcheries, such as those in northwestern Lake Superior (Vandergoot et al., 2010; Wilson et al., 2007).
Sometimes both naturally reproducing and recovering stocks can be found within the same lake, such as the Ontario Grand River stock in Lake Erie (MacDougall et al., 2007). Therefore, to comprehensively survey walleye genetic diversity in the Great Lakes, it was important to include samples from as many known active walleye spawning stocks throughout the Great Lakes as possible. Samples of walleye fin clips and DNA were compiled from existing collections at collaborating institutions or collected for the purpose of this study during routine spawning stock assessments. All samples were collected between the years 2000 and 2019 from mature individuals during the spawning season at one of 29 known spawning sites (Table 1).
Special attention was paid to sampling locations in Lake Erie where walleye abundance is high and genetic differences among spawning sites are low (Chen, Euclide, et al., 2020;Stepien et al., 2012) and to known stocking sources or receiving populations (i.e., Oneida Lake that is the stocking source for Lake Ontario and Lake Gogebic that was stocked with the ancestral Saginaw Bay walleye stock).
An initial genetic survey was conducted for walleye from the Great Lakes using a subset of 45 of the compiled samples (8-10 individuals from each Great Lake). Putative loci and genotypes were identified de novo using RAD-sequencing. In brief, single nucleotide polymorphisms (SNPs) were identified by conducting PstI RAD-sequencing (Ali et al., 2016). The program STACKS v2.3 (Rochette et al., 2019) was used to identify and genotype SNPs using the de novo pipeline, and the resulting SNPs were then filtered based on minor allele frequency (MAF). Following marker identification and de novo genotyping, an early draft of the walleye genome was obtained from the Great Lakes Genomics Center at the University of Wisconsin-Milwaukee. The alignment position of each locus on the draft genome was therefore identified using bowtie2 version 2.2.4 (Langmead & Salzberg, 2012) and used as a filter to limit the linkage disequilibrium among panel loci by removing loci in close proximity to one another (personal communication Aurash Mohaimani, Angela Schmoldt, and Rebecca Klaper, Great Lakes Genomics Center; Table S1-S8). A more detailed description of the RAD-sequencing methods is outlined in Appendix S1.
Following MAF and alignment position filters, 129,281 SNPs remained. Sequences for the 100,000 SNP loci with the highest MAF and heterozygosity were submitted to ArborBioscience (Ann Arbor, MI) for capture bait development to create a Rapture panel. Capture baits were successfully designed for 99,636 loci (80 nucleotide baits with 2× tiling). Sequencing libraries (maximum of 96 individuals per library) were then constructed for 1289 walleye spanning 29 walleye collection locations (Figure 1; Table 1) and bait captured following the approach outlined in Ali et al. (2016). These data were processed using STACKS v2.3 (Rochette et al., 2019) and quality filtered using the populations step in STACKS v2.3 and VCFtools 2.3 (Danecek et al., 2011) to remove 220 individuals and 296,336 SNPs with poor genotyping rates (Table S2). Following filters, 44,261 of the baited loci with a genotype rate >70% across 1069 individuals were retained; these loci were sequenced to an average depth of coverage of 19X.

| Marker quality screening for GTSeq panel development
Microhaplotypes were identified and genotyped for all 44,261 SNP loci in the datafile using a whitelist containing marker IDs for each locus and the populations module of STACKS v2.3. The resulting microhaplotype-VCF file was then filtered using VCFtools to remove loci with >20% missing data (Danecek et al., 2011). Because the genotyping rate of microhaplotypes was lower than that of individual SNPs, to maintain a consistent set of loci in the final dataset, any microhaplotype removed due to missing data was replaced with the genotype of the individual SNP call with the highest minor allele frequency from the original 44,261 SNP datafile. Locus diversity was summarized with custom R scripts that used the DiveRsity and Adegenet R packages (Jombart, 2008; Keenan et al., 2013). Loci were sequentially removed as possible GTSeq panel candidates based on inbreeding coefficient (−0.2 < F_IS < 0.2), SNP position (17 < SNP position < 140 on the forward read), and number of alleles per locus (<11). These filters removed 27,723 loci, leaving 16,538 as possible GTSeq panel candidates.
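The three sequential locus filters described above can be sketched in a few lines of Python. This is only an illustration, not the custom R scripts used in the study: the `Locus` record and the function names are hypothetical, and the sketch assumes that the ranges given in parentheses are the values that were retained.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    """Hypothetical per-locus summary; field names are illustrative only."""
    locus_id: str
    f_is: float         # inbreeding coefficient
    snp_position: int   # SNP position on the forward read
    n_alleles: int      # number of alleles observed at the locus


def passes_filters(locus: Locus) -> bool:
    """Return True when a locus satisfies all three candidate thresholds."""
    return (
        -0.2 < locus.f_is < 0.2
        and 17 < locus.snp_position < 140
        and locus.n_alleles < 11
    )


def filter_candidates(loci: list[Locus]) -> list[Locus]:
    """Keep only the loci that remain possible GTSeq panel candidates."""
    return [locus for locus in loci if passes_filters(locus)]


example = [
    Locus("keep", 0.05, 60, 4),
    Locus("high_fis", 0.35, 60, 4),     # fails the F_IS window
    Locus("edge_snp", 0.00, 10, 2),     # SNP too close to the read start
    Locus("many_alleles", 0.00, 60, 12),
]
retained = filter_candidates(example)
```

Applying the filters jointly, as here, retains the same loci as applying them one after another; only the per-filter removal counts would differ.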

| Marker selection scenarios
Our objective was to create a GTSeq panel containing 400 to 600 genetic markers. The number of potential markers was narrowed from 16,538 into five sets of 600 markers containing different numbers of markers that expressed high heterozygosity or allele frequency variance. High allele frequency differences (F_ST; Weir & Cockerham, 1984) are important for applications such as GSI (e.g., Ozerov et al., 2013), while high heterozygosity can be important for kinship analysis (e.g., Baetscher et al., 2018; Blouin, 2003). The five marker sets were then subjected to GSI and kinship analysis simulations, and the panel mixture that performed well for both GSI and kinship was selected. GSI is frequently used in fisheries management to define management units, to track movement, and to assess contributions of different stocks to a mixed harvest, while kinship analysis is the basis of many management-focused activities, such as close-kin mark-recapture (CKMR) and parentage-based tagging (Bravington et al., 2016; Schwartz et al., 2007). GSI simulations were conducted using Rubias (Moran & Anderson, 2019) and kinship simulations were conducted using CKMRsim following nearly identical protocols as outlined in Bootsma et al. (2020). The eight reporting units used for GSI simulations were defined based on prior knowledge of the system (i.e., existing jurisdictional and geographical breaks in the system) and included: Lake Ontario, the Ontario Grand River in Lake Erie, the East Basin of Lake Erie, the West Basin of Lake Erie, Lake Huron, Lake Michigan, the St. Mary's River, and Lake Superior (Figures S2 and S3

| Panel primer design
We selected 3X the number of loci in the 450:150 ratio of high F_ST to high microhaplotype heterozygosity for primer design to account for the loss of markers due to poor primer design. Primers were then designed for each marker using Primer3 v. 2.3 (Untergasser et al., 2012; Table S3). When more than one SNP was present at a locus, primers were designed to target as many SNPs as possible, but preference was given to the SNP with the highest minor allele frequency. However, if targeting the SNP with the highest minor allele frequency excluded three or more SNPs, primers were redesigned to exclude the highest minor allele frequency SNP and instead target the group of 3+ SNPs, thereby retaining the microhaplotype. Of the markers investigated, quality primer pairs were designed for 793 markers. Nine markers were removed due to potential off-target amplification or identical forward and reverse primers. Diversity statistics were then used to select 600 markers from the remaining 784 markers to retain 450 markers originally selected based on SNP F_ST and 150 markers originally selected based on microhaplotype H_E.
The panel of 600 markers was then re-assessed for GSI and parentage using the same protocols as the preliminary screening to ensure that it performed similarly to the original FST_450_mHE150 panel (Figures S4 and S5). Once satisfied, 6-bp plate and sample adapters were added to forward and reverse primer sequences, and oligonucleotides for all 1200 primers were ordered from Integrated DNA Technologies (IDT, Coralville, Iowa).
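The marker counts in the primer design steps above can be checked with simple arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, and it assumes that "3X" means three times each marker class in the 450:150 ratio.

```python
def design_candidates(n_fst: int = 450, n_he: int = 150, oversample: int = 3) -> dict:
    """Number of loci sent to primer design under an oversampling factor."""
    return {
        "high_fst": n_fst * oversample,
        "high_he": n_he * oversample,
        "total": (n_fst + n_he) * oversample,
    }


def primers_to_order(n_markers: int) -> int:
    """Each marker needs one forward and one reverse primer."""
    return 2 * n_markers


candidates = design_candidates()          # loci submitted for primer design
after_screening = 793 - 9                 # quality primer pairs minus removed markers
ordered = primers_to_order(600)           # oligonucleotides for the 600-marker panel
```

Under these assumptions the oversampling leaves ample room for attrition: 784 markers with usable primers remained from which to choose the 600-marker panel.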

| Panel PCR optimization
The optimal multiplex combination of primer pairs was determined by conducting four sequential library preparation and sequencing runs on MiSeq Micro flow cells (paired-end 150 bp; 300 cycles).
The sample size for Moon River in Lake Huron was small (N = 14) after removing individuals with a low genotyping rate; therefore, samples were combined with Tittabawassee samples to create a single Lake Huron reporting group. Additionally, because walleye from Lake St. Clair and western Lake Erie are known to mix with Lake Huron walleye (Brenden et al., 2015), samples from Lake St. Clair and the West Basin of Lake Erie were also included in the Lake Huron reporting group. Collections within all other lakes were analyzed separately. Pairwise Weir and Cockerham's F_ST was calculated for among-lake and within-lake reporting units as a metric of population structure (Meirmans, 2020; Weir & Cockerham, 1984). Amplicon reads for each sample were then used to score genotypes for each locus based on the number of probe reads for each SNP and a maximum likelihood algorithm described in McKinney et al. (2018) that accounts for variance in allele dosage.
The consistency in sequencing output (i.e., amplification and subsequent sequencing of targeted genetic markers) among libraries prepared separately at the UWM, GLSC, and OMNRF laboratories was evaluated by comparing the number of reads containing a primer sequence to the number of reads containing a probe sequence for a given marker (i.e., the exact 30-bp sequence flanking a known SNP or microhaplotype). Individual coverage was calculated as the total number of reads containing sequence data for both the primer and probe for a given marker divided by 500 (the total number of markers included). For each marker and individual, the data were analyzed as a proportion of primer reads to probe reads (hereafter referred to as the primer:probe proportion). The consistency in this proportion among datasets was evaluated using pairwise Pearson's correlations of marker-specific primer:probe proportion. Consistent amplification of the GTSeq panel was expected to result in a strong positive relationship and a high correlation coefficient (r^2). The relative differences in the individual or marker sequencing variance among preparations are described using the standard deviation in primer:probe proportion.
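A minimal sketch of the marker-level comparison described above is given below. It is not the pipeline used in the study. The function names are hypothetical, and the proportion is assumed here to be probe-containing reads divided by primer-containing reads; the text does not state the direction of the ratio explicitly.

```python
import math


def primer_probe_proportion(primer_reads: int, probe_reads: int):
    """Assumed definition: fraction of primer-matching reads that also match the probe."""
    if primer_reads == 0:
        return None  # marker did not amplify; no proportion can be computed
    return probe_reads / primer_reads


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equal-length lists of marker proportions."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)  # undefined if either list has zero variance


# Made-up proportions for the same four markers in two laboratories.
lab_a = [0.90, 0.80, 0.50, 0.70]
lab_b = [0.85, 0.75, 0.55, 0.65]
r = pearson_r(lab_a, lab_b)
```

Only markers with a defined proportion in both datasets can enter a pairwise correlation, so the number of markers compared may differ between laboratory pairs.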
Genotypes were defined as "congruent" between two datasets if the same alleles were scored in both cases of a pairwise assessment between laboratories for a given individual and locus. In other words, if individual-X contained an AG heterozygote score in both the UWM and GLSC datasets, the genotype was considered "congruent" between these datasets. Congruency in scored genotypes among separate sequencing runs was evaluated in a pairwise fashion. First, individuals that lacked genotype calls at 50% or more of the GTSeq markers were removed from the analysis. Then, the percent of identical genotype calls (e.g., a call that is scored as a heterozygote in both datasets being compared) was calculated for individuals.
One-way Analysis of Variance (ANOVA) was used to test whether the average percent of congruent genotypes differed between laboratory pairs. We hypothesized that depth of coverage may influence genotype call accuracy, and therefore also tested whether average individual total read count across all three sequencing runs influenced percent congruency using an ANOVA.
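The pairwise congruence calculation described above can be illustrated as follows. This is a sketch under stated assumptions rather than the code used in the study: genotypes are represented as hypothetical dictionaries mapping marker names to allele tuples (with `None` for a missing call), the 50% genotyping-rate filter is applied to each dataset, and congruence is computed over markers called in both datasets.

```python
def genotype_rate(calls: dict, n_markers: int) -> float:
    """Fraction of panel markers with a genotype call for one individual."""
    return sum(1 for g in calls.values() if g is not None) / n_markers


def percent_congruent(calls_a: dict, calls_b: dict, n_markers: int):
    """Percent of shared calls that carry the same alleles in both datasets."""
    # Individuals lacking calls at 50% or more of markers are excluded.
    if genotype_rate(calls_a, n_markers) <= 0.5 or genotype_rate(calls_b, n_markers) <= 0.5:
        return None
    shared = [
        m for m, g in calls_a.items()
        if g is not None and calls_b.get(m) is not None
    ]
    if not shared:
        return None
    # Allele order is ignored, so ("A", "G") and ("G", "A") are congruent.
    same = sum(1 for m in shared if sorted(calls_a[m]) == sorted(calls_b[m]))
    return 100.0 * same / len(shared)


# One individual genotyped at a hypothetical four-marker panel in two laboratories.
uwm = {"m1": ("A", "G"), "m2": ("C", "C"), "m3": ("T", "T"), "m4": None}
glsc = {"m1": ("G", "A"), "m2": ("C", "T"), "m3": ("T", "T"), "m4": ("A", "A")}
result = percent_congruent(uwm, glsc, n_markers=4)
```

Because alleles are compared as strings, the same function applies to biallelic SNPs and to microhaplotype alleles written as short sequences.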

| Panel selection
All five tested panel-marker combinations performed similarly for GSI to eight putative reporting units and kinship assignment of parent-offspring and full-sibling pairs (Table 2). The FST_600_mHE0 panel performed the best for GSI (mean assignment accuracy = 92.7%) but worst for kinship analysis (full-sibling FPR (FNR = 0.01) = 3.8 × 10^−15). Based on these results, we chose one of the intermediate panels (FST_450_mHE150), which appeared to perform moderately well for both GSI (mean assignment accuracy = 91.1%) and kinship (full-sibling FPR (FNR = 0.01) = 2.3 × 10^−18).

TABLE 2 Mean estimated assignment accuracy across eight reporting units (Lake Ontario, the Ontario Grand River in Lake Erie, the East Basin of Lake Erie, the West Basin of Lake Erie, Lake Huron, Lake Michigan, the St. Mary's River, and Lake Superior) and the estimated false-positive rate (FPR) of full-sibling assignment used to compare between five potential panels at an accepted false-negative rate (FNR) of 0.01.

(Table 1). The G_IS was close to zero in all collections (overall G_IS = −0.008; 95% CI = −0.013 to −0.004), and a maximum of 10% of loci departed significantly from HWE at any given collection (α = .05). No loci were significantly out of HWE once a Bonferroni correction was applied (α = .0001).
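The Bonferroni-adjusted threshold reported above is consistent with dividing the nominal α by the number of markers in the panel. The sketch below assumes one HWE test per marker (500 tests per collection); the function names and the example p-values are hypothetical.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def loci_out_of_hwe(p_values: dict, alpha: float = 0.05) -> list:
    """Loci whose HWE p-value falls below the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [locus for locus, p in p_values.items() if p < threshold]


# Assumption: one test per marker in the 500-marker panel.
adjusted = bonferroni_alpha(0.05, 500)

# Made-up p-values for two loci; the corrected threshold here is 0.025.
flagged = loci_out_of_hwe({"locus_a": 0.03, "locus_b": 0.00001})
```

A locus that is nominally significant at α = .05 can therefore remain well above the corrected threshold, which is why the 10% of loci flagged before correction does not conflict with no loci being flagged afterwards.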

| Among-lake genetic stock identification
Average pairwise F_ST among Great Lakes was 0.083; the smallest distance was between Lake St. Clair and Lake Erie (F_ST = 0.008) and the largest was between Lake Erie and Lake Superior (F_ST = 0.169; Table S7). Average assignment accuracy of GSI to lake was 95% or greater for Lake Ontario (100%), Lake Erie (99%), Lake Michigan (97%), and Lake Superior (95%). Average assignment accuracy was lower for Lake Huron (76%), with misassignments of individuals to Lake Michigan (9.8%), Lake St. Clair (6.4%), Lake Superior (3.5%), and Lake Erie (1.7%). Average assignment accuracy was the lowest for the Clinton River in Lake St. Clair (10%), with misassignments of individuals to Lake Erie (68%), Lake Michigan (10%), and Lake Huron (6.5%).

| Within-lake genetic stock identification
To ensure that the final GTSeq panel could be used effectively within smaller jurisdictions throughout the Great Lakes, we estimated local GSI using mixture analysis and kinship within each of the Great Lakes. Within-lake F_ST was 0.083 when averaged across all reporting group pairwise comparisons (Table S7). Among-group pairwise F_ST was lowest within Lake Erie (average F_ST = 0.049) and highest within Lake Michigan (average F_ST = 0.097). Greater than 98.7% of individuals were assigned to at least one collection with a pofZ > 0.7.
Of individuals with a pofZ score > 0.7, 80% were correctly assigned to their true collection location (Figure 2). Fox River in the Lake Michigan basin had particularly low GSI accuracy (mean = 54%). This was largely due to 35% of individuals being misassigned to the Wolf River, which is connected to the Fox River through Lake Winnebago (pairwise F_ST = 0.091). Average assignment accuracy at other collections was higher than 90% but did vary among consecutive leave-one-out simulations.

| Within-lake kinship assignment
To evaluate how well the GTSeq panel performed for kinship analysis, we compared estimates of false-positive pairwise relationship assignments for full-sibling, parent-offspring, and half-sibling relationships simulated from allele frequency distributions within each lake. False-positive rates for full-sibling and parent-offspring relationships were less than 1 × 10^−11 at an acceptable false-negative rate of 0.01 (Figure 3). This indicates that the ability to distinguish between unrelated pairs and full-sibling or parent-offspring pairs was high. False-positive rates differed slightly among lakes and were highest in lakes Erie and Huron, and lowest in Lake Ontario.
However, in all cases, we concluded that the maximum false-positive rate for full-sibling and parent-offspring pairs should be sufficiently low for most applications. The false-positive rate for distinguishing true half-siblings from unrelated pairs was substantially higher and ranged from 1 × 10^−2 to 3 × 10^−2 (FNR = 0.01) to 6 × 10^−4 to 6 × 10^−4 (FNR = 0.1). About 1 to 3 out of every 100 observations can be expected to be false positives when an FNR threshold of 0.01 is used.

FIGURE 2 The estimated genetic stock identification accuracy for each within-lake reporting unit (x-axis) for the final GTSeq panel containing 500 SNP and microhaplotype markers. Reporting units are colored according to their corresponding Great Lake. Each point represents the proportion of individuals correctly assigned with a pofZ score of >0.7 to a given reporting unit in a single leave-one-out 100% mixture simulation (N = 99).

FIGURE 3 The change in false-positive detection rates (i.e., the rate of true-unrelated pairs being identified as full-sibling [FS], half-sibling [HS], or parent-offspring [PO] pairs) for 10 false-negative rates (0.01-0.1; i.e., the rate of true full-sibling, half-sibling, or parent-offspring pairs being identified as unrelated pairs) estimated separately for each lake. Note that the y-axis differs between plots.

The primer:probe proportion of each marker was positively correlated among runs from different laboratories, suggesting that marker amplification and sequencing performed similarly between sequencing replicates (Figure S7). The correlation was weaker between OMNRF and UWM or GLSC (r(434) = .73, p < .001 and r(434) = .72, p < .001) than between GLSC and UWM (r(468) = .89, p < .001). Amplification and sequencing performance of individuals was less consistent among laboratories than markers (Figure S8). Individuals were successfully genotyped for 90% of markers at UWM (SD = 10.5%) and GLSC (SD = 6.3%) and 63% at OMNRF (SD = 9.9%). The average individual congruence between shared genotype calls among laboratories ranged from a low of 94% between UWM and OMNRF to a high of 97% between UWM and GLSC (  (Figure 4).
Genotype congruence was not influenced by individual coverage (ANOVA: F(1, 263) = 0.14, p = .7), suggesting that depth of coverage may not be a principal factor influencing genotype congruence.

| DISCUSSION
Interjurisdictional natural resource research and conservation rely on an ability to integrate data and de-centralize work pursuing research objectives. The genotyping-in-thousands sequencing (GTSeq) panel that we created for walleye provides an efficient and consistent method of collecting genetic data on walleye of Great Lakes lineages for fisheries research and management purposes.
We demonstrate that SNP and microhaplotype genotypes from the 500 markers included in our GTSeq panel could be used to: (1)

| Predicted performance for fisheries applications
Identification of the geographical source of a sample of unknown origin has important implications for both management (Valenzuela-Quiñonez, 2016) and conservation biology (Zhang et al., 2020). By targeting genetic markers with high diversity and among-collection allele frequency variability, we created a multi-use GTSeq panel that should perform adequately for most walleye GSI studies in major Great Lakes jurisdictions. Stock identification and structure is a key management objective for several major walleye population assemblages throughout the Great Lakes including Lake Erie, Saginaw Bay, Lake Huron (Brenden et al., 2015), Green Bay, Lake Michigan (Dembkowski et al., 2018), and Lake Superior (Homola, unpublished data). Our analysis shows that the panel should perform sufficiently well in each of these regions to assign individuals to specific spawning reefs/sites, as in Lake Superior, or to groups of sites, such as the "West Basin" vs. "East Basin" of Lake Erie, with >90% accuracy. Importantly, this means that this single marker panel could be used to facilitate mixed-stock assignment and recovery programs for walleye in many different areas.
Data collected from these regional studies could be shared to identify long-distance migrants and larger spatial patterns in movement and gene flow.
Future generation and sharing of new data by researchers using this GTSeq panel could help to improve GSI and kinship assignment accuracy. Increasing the number of sites and samples included in population baselines increases the accuracy of population allele frequency estimates (Wood et al., 1987). In our study, sample sizes of our baseline dataset were variable but generally included greater than 30 individuals from a given spawning population.  (Seeb et al., 2007;Stott et al., 2010). Therefore, the present panel should be viewed as the starting place that will be improved with ongoing collaboration and continued optimization.
One of the major benefits of including microhaplotype loci in panel construction is that they provide multiallelic markers that can facilitate kinship and pedigree analysis (Baetscher et al., 2018). Our data demonstrated that microhaplotypes did contain higher genetic diversity than biallelic SNPs, which contributed to accurate kinship assignment for walleye throughout the Great Lakes. However, microhaplotypes also contained higher inter-laboratory scoring errors.
These data could provide new opportunities to assess the abundance of local walleye populations using genetic techniques such as closekin mark-recapture (CKMR) and rarefaction, which benefit from multiallelic markers (Bravington et al., 2016;White et al., 2022).
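The diversity advantage of multiallelic markers noted above follows directly from the standard definition of expected heterozygosity, H_E = 1 − Σ p_i², where p_i is the frequency of allele i. The short sketch below uses made-up allele frequencies and a hypothetical function name; it is an illustration and not an analysis of the panel data.

```python
import math


def expected_heterozygosity(freqs: list[float]) -> float:
    """H_E = 1 - sum(p_i^2) for allele frequencies that sum to one."""
    if any(p < 0 for p in freqs) or not math.isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must be non-negative and sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# A biallelic SNP reaches its maximum H_E when both alleles are equally common.
snp_max = expected_heterozygosity([0.5, 0.5])

# A microhaplotype with four equally common alleles exceeds that ceiling.
microhap = expected_heterozygosity([0.25, 0.25, 0.25, 0.25])
```

A biallelic SNP can never exceed H_E = 0.5, whereas a locus with k equally frequent alleles reaches 1 − 1/k, which is one reason multiallelic markers carry more information per locus for kinship inference.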
Prior to the application of the present panel to kinship studies, there are several reasons why additional assessments of kinship for target populations will be necessary. First, the false-positive rates of detection for half-siblings were substantially higher than for parent-offspring and full-sibling identification in simulations.
Misassignment of half-siblings can be an issue for CKMR when full-sibling and parent-offspring pairs may be uncommonly encountered in sample sets (Waples & Feutry, 2022). Second, our analysis focused on determining false-positive rates of misassigning an unrelated pair as a related pair. However, the majority of misassignments are likely to occur between different types of related pairs (e.g., misassigning half-siblings as full-siblings). Third, about a quarter of the SNP markers in the panel appear to be in moderate linkage disequilibrium with at least one other locus in the panel. Given the large physical distance between markers based on alignment to the draft walleye genome, we suggest that much of this linkage is the result of population structure and not physical linkage among loci.
Nonetheless, power assessments of kinship assignment can become inflated when linked loci are included (Huang et al., 2004). Thus, researchers should conduct their own power assessments and linkage disequilibrium assessments using samples collected from their study area to determine the statistical power of the panel prior to large-scale application.
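One simple way to see why a per-pair false-positive rate matters at scale is to multiply it by the number of pairwise comparisons in a sample. The sketch below is a back-of-the-envelope illustration with hypothetical function names; it assumes every pair of sampled individuals is compared and that nearly all pairs are truly unrelated.

```python
import math


def n_pairs(n_individuals: int) -> int:
    """Number of unordered pairs among n sampled individuals."""
    return n_individuals * (n_individuals - 1) // 2


def expected_false_positives(n_individuals: int, false_positive_rate: float) -> float:
    """Expected count of unrelated pairs wrongly called related."""
    return n_pairs(n_individuals) * false_positive_rate


# With 1,000 individuals there are 499,500 pairwise comparisons.
half_sibling = expected_false_positives(1000, 1e-2)   # half-sibling rate at FNR = 0.01
full_sibling = expected_false_positives(1000, 1e-11)  # upper bound reported for FS and PO
```

Under these assumptions, a rate that looks small per pair can still produce thousands of spurious half-sibling pairs in a moderately sized sample, while the full-sibling and parent-offspring rates remain negligible.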

| Interjurisdictional collaboration
Most fisheries management and research activities in the Great Lakes are decentralized and decisions are based on data produced from each lake's surrounding jurisdictional fisheries agencies. Therefore, the creation of a standardized resource is only the first step towards unifying walleye research and stock monitoring throughout the Great Lakes region (Stott et al., 2010). Long-term collaboration among laboratories will be required to ensure that data produced separately are consistent and comparable. We demonstrated that most genotype calls were consistent among independent sequencing runs; however, discrepancies can be expected. For example, sequencing data produced from OMNRF contained fewer reads that could be assigned to any of the target markers, and this led to a lower overall genotyping rate for individuals in this dataset. We were unable to identify the reason for the lower sequencing quality obtained from the OMNRF laboratory; however, we predict that it is likely associated with slight differences

TABLE 3 Among-laboratory genotype congruence statistics for individuals with a genotype rate greater than 50% for all types of markers (All), microhaplotypes (mhaps), and single nucleotide polymorphisms (SNPs).

(Seeb et al., 2007; Stott et al., 2010) and have begun to be used for GTSeq panels (Bohling et al., 2021; Hayward et al., 2022). However, the appropriate use of positive and negative controls should help account for batch effects in future studies.
The need for standardized resources that facilitate interjurisdictional research is a constant across natural resource conservation and management. Here we respond to that need by developing a new genetic resource that will facilitate population structure and connectivity research on one of the most important fisheries in the Great Lakes region of the United States and Canada, walleye. Our panels and necessary resources have been made publicly available through this publication (Dryad: https://doi.org/10.5061/dryad.xd2547dmg). We showed that the GTSeq panel provides high assignment accuracy for major walleye stocks in each of the Great Lakes, low false-positive kinship assignment for full-sibling and parent-offspring pairs, and >94% genotype congruence among separate sequencing runs. We hope that future studies using this resource will continue to improve panel performance and add to ongoing collaboration to the benefit of walleye fisheries in North America.

ACKNOWLEDGMENTS
We thank numerous past and current university, state, tribal, and

OPEN RESEARCH BADGES
This article has earned Open Data and Open Materials badges. Data and materials are available at https://doi.org/10.5061/dryad.xd2547dmg.

DATA AVAILABILITY STATEMENT
Datasets, including primer sequences, capture bait fasta file, GTScore input files, and all of the sample metadata and GTSeq genotyping data used to generate the figures and tables included in the main body of the text, will be made available on Dryad upon manuscript acceptance.

BENEFIT-SHARING STATEMENT
An international research collaboration was developed with scientists from the nations and states providing genetic samples, and many of those collaborators have been included as co-authors. The results of this study are being shared openly with all agencies involved in walleye management and made accessible to the broader scientific community through this publication. Our group is committed to scientific partnerships and to developing a more inclusive and open space for research.