Application of Morphometric Analysis to Identify Alewife Stock Structure in the Gulf of Maine

Abstract Alewife Alosa pseudoharengus is an anadromous clupeid fish of long-standing ecological and socioeconomic importance along the Atlantic coast of North America. Since the 1970s, Alewife populations have been declining throughout the species' range. A number of hypotheses have been proposed to explain the decline, but a lack of basic information on population demographics inhibits hypothesis testing. In this study, we evaluated the use of morphometric analysis to discriminate among spawning stocks of Alewives collected from 24 sites in Maine and one site in Massachusetts. We first identified 10 morphometric measurements that were not influenced by the freezing and thawing process, and then used principal component and discriminant function analyses to develop stock-structure classification models from these 10 measurements. Classification models assigned Alewives to Maine or to the single Massachusetts site with 100% accuracy. In addition, classification models correctly classified pooled sampling sites from the extreme western and eastern parts of Maine with 64% accuracy. Morphometric analysis may therefore provide an easily accessible, comparatively fast, and inexpensive method to discriminate marine-captured Alewives spawned in areas separated by major biogeographic regions, large geographic distances (hundreds of kilometers), or both, and thus help inform questions about stock composition at these spatial scales for assessment surveys and bycatch events.


cormorants Phalacrocorax auritus, Largemouth Bass Micropterus salmoides, Bluefish Pomatomus saltatrix, and ospreys Pandion haliaetus (Fay et al. 1983; Yako et al. 2000; Dalton et al. 2009; Glass and Watts 2009). Alewives also provide socioeconomic benefits to a variety of stakeholders. Historically, communities harvested Alewives as a food source. More recently, Alewives have been used as spring bait in the New England fishery for American lobster Homarus americanus, food for local consumption, fish oil, fertilizer, domestic animal feed, and bait for recreational fishing (Bigelow and Schroeder 1953; Fay et al. 1983; Dalton et al. 2009). Some communities have capitalized on Alewife spawning runs as tourist attractions, generating additional revenue for local economies (e.g., Creamer 2010).
Commercial catch of Alewives and Blueback Herring Alosa aestivalis (collectively known as "river herring") has declined, starting with a sharp drop in the 1970s and a more recent drop to very low levels since the mid-1980s (Fay et al. 1983; Schmidt et al. 2003). Because of this decline, the U.S. National Marine Fisheries Service listed river herring as a Species of Concern (NMFS 2009). A moratorium on river herring fisheries has been implemented in five states (Massachusetts, Rhode Island, Connecticut, Virginia, and North Carolina) (ASMFC 2009). In addition, the dramatic decline, together with insufficient data to identify and assess its potential causes, led the Atlantic States Marine Fisheries Commission to close all river herring fisheries in 2012, although fisheries in states with sustainable harvest plans will be allowed to remain open (ASMFC 2009). Conservation groups also petitioned in 2011 to have river herring listed as threatened under the Endangered Species Act (NOAA 2011). A number of hypotheses have been proposed to explain the decline and lack of recovery, including restricted habitat access due to dams, habitat degradation caused by pollution, increased predation, overfishing, and bycatch (McCoy 1975; Hartman 2003; Saunders et al. 2006; Hall et al. 2010). The exact cause or combination of causes is still uncertain.
From March (southern end of distribution) to June (northern end), adult Alewives migrate up freshwater streams and rivers to spawn in lakes and ponds (Pardue 1983; Walsh et al. 2005). After fertilization, eggs hatch within 2-15 d depending on temperature (Pardue 1983). Juveniles remain in freshwater for 3 to 7 months (Richkus 1975) until they out-migrate to the ocean from late summer to late fall (Iafrate and Oliveira 2008; Gahagan et al. 2010). Alewives are believed to return to their natal river systems and lakes to spawn (Thunberg 1971). This behavior, over time, may lead to unique characteristics based on the influence of local environments on early life stages (Beacham et al. 1988; Taylor 1991). Such characteristics provide an opportunity to test for stock structure based on natal origin if adaptations to local river and lake conditions are expressed as measurable differences in phenotypic traits (Barnett-Johnson et al. 2008).
Understanding stock structure is an important consideration in developing fisheries management plans. Disregarding stock structure can lead to a variety of problems, including loss of genetic diversity (Smith et al. 1991), changes in biological characteristics such as reduced fish size (Ricker 1981), overfishing of less productive stocks (Graham 1982), and inaccurate predictions of how management strategies may affect a stock. However, very little is known about the stock structure of Alewife (Fay et al. 1983). The Atlantic States Marine Fisheries Commission decision to close the fishery in 2012 implies the need for better assessments of individual spawning groups, and thus knowledge of river herring stock structure (ASMFC 2009).
A variety of techniques have been used to differentiate between stocks of fish. For example, genetics has been used to distinguish between stocks of American Shad A. sapidissima (Nolan et al. 1991) and between landlocked populations of Alewife (Ihssen et al. 1992). Morphometrics have been used successfully to identify stock structure in a number of fish species including Pacific Herring Clupea pallasii, Rainbow Smelt Osmerus mordax, and Yellowtail Flounder Limanda ferruginea (e.g., Meng and Stocker 1984; Cadrin and Silva 2005; Lecomte and Dodson 2005). In this study, we evaluated morphometrics as a tool to discriminate among Alewife spawning groups in the Gulf of Maine. We hypothesized that Alewife morphometric characteristics are established in their natal habitat, and that a fine-scale stock structure therefore exists that allows differentiation at the lake scale. Alternatively, morphometric characteristics may be influenced by factors that work at larger geographic scales. If there are measurable differences in morphometric characteristics, body shape may provide a means to discriminate among stocks at the scales of lakes, watersheds, or regions. This would suggest that morphometric analyses could be applied to marine-captured Alewives to determine how different stocks associate with one another in the open ocean and to address critical management questions such as the stock composition of bycatch.

METHODS
Samples of spawning anadromous Alewives were collected from 24 rivers and lakes in Maine within the Gulf of Maine watershed from April to June 2010 (Table 1; Figure 1). An additional site, the Nemasket River, Massachusetts, was sampled as an "outgroup." In general, 100 fish were targeted at each site for each sampling event. Alewives were caught using dip nets, seine nets, fyke nets, cast nets, and trammel nets in both riverine and lacustrine locales. However, a majority of fish were captured at harvest points with the assistance of municipal harvesters or management authorities, typically below natal lake outlets. Samples were placed in a cooler on ice and processed within 2 d of capture.
For each fish, fork length (FL) and total length (TL) were measured to the nearest millimeter and total mass was recorded to the nearest gram. After a standardized digital image was recorded, fish were dissected to confirm species identification (Alewife or Blueback Herring) based on pigmentation of the peritoneum (Bigelow and Schroeder 1953). Sex was recorded and gonads were removed and weighed to the nearest gram. Gonad stage was assigned on a development scale (Maier 1908). The sagittal otoliths were removed, mounted in two-part Buehler EpoHeat epoxy resin, and cured in a drying oven for 3 h at 60 °C. To estimate age, mounted otoliths were examined under a dissecting microscope with the sulcus facing up, the rostrum aligned with the 12 o'clock position, and the annuli counted at the 7 o'clock region. The otoliths were read whole, and the right otolith for each fish was used whenever possible. Two readers examined the otoliths: a primary reader, and a second reader who verified 50% of the age estimates (adapted from Burke et al. 2008). Differences in assigned ages were resolved through a consensus process.
FIGURE 1. The 25 sites sampled for spawning Alewives in 2010, including the separation between the Maine sites and the Massachusetts site (top panel) and the 24 sampling sites in Maine (bottom panel). The two-way geographic divide is identified by symbol shading and the three-way geographic divide is identified by symbol shapes.
Images for morphometric analyses were taken using a Nikon Coolpix S700 camera mounted on a frame 50 cm above the processing table. Each fish was placed underneath the camera on a plastic grid. Fifteen landmarks were used on each fish, and 10 of the landmarks were marked by pins prior to photo documentation (Armstrong and Cadrin 2001). From the 15 landmarks, 31 measurements were recorded (Figure 2; Table 2) using tps-Dig2 (http://life.bio.sunysb.edu/morph/).
A calibration picture was taken at the beginning of each series of digital images to correct for possible image distortion. The measurements SL, head height 1 (HH1), head diagonal 1 (HD1), and HD2 (Table 2) were excluded from further analyses because of inconsistencies in determining the location of landmark 2 (Figure 2), leaving 27 measurements.
Because samples from marine environments are usually frozen before they are analyzed (e.g., samples from observer programs, assessment surveys), we first tested for the effect of freezing on morphometric measurements. We then used the measurements unaffected by freezing to produce a classification model. To test the effect of freezing, we recorded routine length and weight measurements and took digital images for morphometric analyses of freshly captured fish from one of our study sites (Hadley Lake). The fish were then frozen at −20 °C for 40 d, thawed, and reprocessed. Landmarks were repinned and a second set of digital images was taken for morphometric analyses. We used a paired t-test to test for differences in each of the 27 morphometric measurements between fresh and frozen fish. We used α = 0.05 to test for significance but did not correct for multiple comparisons because we were interested in identifying those morphometric measurements that did not significantly differ between the two treatments. Although we may have artificially excluded some measurements because of the increased chance for a type I error, our approach resulted in a more conservative list of measurements to be used in developing classification models. All analyses were conducted in JMP (version 9, SAS, Cary, North Carolina).
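The fresh-versus-frozen screening step can be sketched as follows. This is a minimal illustration in Python, not the JMP procedure used in the study: the function names, the measurement labels `M1` and `M2`, and the numeric values are hypothetical, and the caller supplies the two-sided critical value of Student's t for the relevant degrees of freedom instead of the code computing a P-value.

```python
import math
from statistics import mean, stdev

def paired_t(fresh, frozen):
    """Return (t statistic, degrees of freedom) for paired fresh/frozen values."""
    if len(fresh) != len(frozen) or len(fresh) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(fresh, frozen)]
    se = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    return mean(diffs) / se, len(diffs) - 1

def unaffected_by_freezing(measurements, t_crit):
    """Keep the measurements whose |t| stays below the two-sided critical value."""
    kept = []
    for name, (fresh, frozen) in measurements.items():
        t, _df = paired_t(fresh, frozen)
        if abs(t) < t_crit:  # no significant fresh-vs-frozen difference
            kept.append(name)
    return kept

# Hypothetical values (mm) for two made-up measurements on four fish.
example = {
    "M1": ([10.0, 12.0, 14.0, 16.0], [9.0, 11.0, 14.0, 14.0]),
    "M2": ([20.0, 22.0, 25.0, 27.0], [18.0, 20.1, 22.9, 25.0]),
}
T_CRIT_DF3 = 3.182  # two-sided 5% critical value of Student's t for 3 df
print(unaffected_by_freezing(example, T_CRIT_DF3))
```

Because no correction for multiple comparisons is applied, a measurement is dropped as soon as its own test is significant, which is the conservative behavior described above.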
Morphometric analyses were conducted using log_e-transformed data for those metrics where no significant differences were found between fresh and frozen fish. Initially, principal component analysis (PCA) was used to examine which combinations of measurements were most responsible for the variance in the data. Because the first principal component (PC1) explained 54% of the variance in the data and was associated with overall fish size, we removed the effect of fish size from the log_e-transformed data using Burnaby's size correction method (Burnaby 1966) as follows:

$$Y = X\left[I - b\,(b'b)^{-1}\,b'\right],$$

where Y is the size-adjusted data, X is the n × p data matrix, n is the total number of samples, p is the number of morphometric measurements, I is an identity matrix of rank p, b is a matrix with each column equal to PC1 of the covariance matrix for each individual sampling group, and b′ is the transpose of matrix b. The procedure proposed by Burnaby (1966) eliminates the effects of growth from multivariate data by projecting data points onto a subspace that is orthogonal to the growth vector (Klingenberg 1996).
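The size correction can be illustrated with a short NumPy sketch. It is an assumption-laden example, not the study's implementation: the function names and the synthetic two-group data are hypothetical, and a pseudo-inverse is swapped in for the plain inverse so that nearly collinear growth vectors do not cause numerical trouble.

```python
import numpy as np

def first_pc(x):
    """Unit eigenvector for the largest eigenvalue of the covariance of x (rows = fish)."""
    cov = np.cov(x, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, -1]

def burnaby_adjust(x, groups):
    """Project x onto the subspace orthogonal to every group's PC1.

    Sketch of Y = X [I - b (b'b)^-1 b'], with pinv used in place of the inverse.
    """
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    # One growth vector (PC1) per sampling group, stacked as the columns of b.
    b = np.column_stack([first_pc(x[groups == g]) for g in np.unique(groups)])
    p = x.shape[1]
    projector = np.eye(p) - b @ np.linalg.pinv(b.T @ b) @ b.T
    return x @ projector, b

# Synthetic log_e-scale data: two hypothetical groups, four measurements each.
rng = np.random.default_rng(0)
groups = np.repeat(np.array(["A", "B"]), 30)
loadings = np.where(groups[:, None] == "A",
                    [1.0, 1.0, 1.0, 1.0], [1.0, 1.2, 0.8, 1.0])
size = rng.normal(0.0, 0.3, size=(60, 1))     # log-size factor, dominates PC1
shape = rng.normal(0.0, 0.02, size=(60, 4))   # small shape variation
log_x = 3.0 + size * loadings + shape

y, b = burnaby_adjust(log_x, groups)
print(np.abs(y @ b).max())  # numerically zero: nothing is left along the growth vectors
```

The projection only leaves information behind when the columns of b span fewer than p dimensions, so the sketch uses two groups and four measurements.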
We examined the size-adjusted data for the following groupings: sex, age, gonad stage, two-way geographic divide (based on whether a sampling site was west or east of Penobscot Bay), and three-way geographic divide (close to Penobscot Bay, far east of Penobscot Bay, and far west of Penobscot Bay) (see Figure 1). Principal component analysis was initially used to explore possible patterns in the data. We then developed a classification model by applying a linear discriminant function analysis that calculates the Mahalanobis distance from each individual sample to each group's multivariate mean. The accuracy of the classification model was tested by randomly selecting 75% of the data to build the model and then using the remaining 25% to independently test the ability of the model to correctly classify these observations. The maximum chance criterion and the proportional chance criterion (Schlottmann 1989) were used to determine whether the prediction equation was better than random chance. The maximum chance criterion assumed that all the samples in the 25% used to test the model are from the single largest group in the 75% that were used to produce the model. The proportional chance criterion assumed that the 25% are randomly distributed in the same proportions as the 75% group.
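The hold-out validation and the two chance criteria can be sketched as below. This is an illustrative Python/NumPy example under stated assumptions, not the JMP analysis: it uses a pooled within-group covariance, equal prior probabilities, and a 75/25 split drawn within each group; all function names, the "east"/"west" labels, and the simulated data are hypothetical. The maximum chance criterion is taken as the share of the largest group in the model-building set, and the proportional chance criterion as the sum of squared group shares.

```python
import numpy as np

def stratified_split(labels, train_frac, rng):
    """Boolean mask selecting train_frac of the rows within every group."""
    train = np.zeros(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train[idx[: int(round(train_frac * len(idx)))]] = True
    return train

def fit_group_model(x, labels):
    """Group means and (pseudo-)inverse of the pooled within-group covariance."""
    classes = np.unique(labels)
    means = np.array([x[labels == c].mean(axis=0) for c in classes])
    centered = x - means[np.searchsorted(classes, labels)]
    pooled = centered.T @ centered / (len(x) - len(classes))
    # pinv tolerates the rank-deficient covariance left by a size projection.
    return classes, means, np.linalg.pinv(pooled)

def predict(x, classes, means, inv_cov):
    """Assign each row to the group with the smallest squared Mahalanobis distance."""
    diff = x[:, None, :] - means[None, :, :]  # shape (n, groups, p)
    d2 = np.einsum("ngp,pq,ngq->ng", diff, inv_cov, diff)
    return classes[d2.argmin(axis=1)]

def chance_criteria(train_labels):
    """Return (maximum chance, proportional chance) from the model-building set."""
    _, counts = np.unique(train_labels, return_counts=True)
    shares = counts / counts.sum()
    return float(shares.max()), float((shares ** 2).sum())

# Simulated size-adjusted data: 120 "east" and 80 "west" fish, 3 measurements.
rng = np.random.default_rng(1)
labels = np.repeat(np.array(["east", "west"]), [120, 80])
group_means = np.array([[0.0, 0.0, 0.0], [0.06, -0.06, 0.0]])
x = group_means[(labels == "west").astype(int)] + rng.normal(0.0, 0.03, size=(200, 3))

train = stratified_split(labels, 0.75, rng)
classes, means, inv_cov = fit_group_model(x[train], labels[train])
predicted = predict(x[~train], classes, means, inv_cov)
accuracy = float((predicted == labels[~train]).mean())
c_max, c_pro = chance_criteria(labels[train])
print(f"hold-out accuracy={accuracy:.2f}, C_max={c_max:.2f}, C_pro={c_pro:.2f}")
```

With unequal group sizes the maximum chance criterion is always at least as large as the proportional chance criterion, which is why a model can beat the latter without beating the former.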

RESULTS
Fresh versus Frozen
A total of 69 fish from Hadley Lake were used to test for differences between fresh and frozen measurements of Alewives.

PCA Exploration
A total of 2,714 fish from 25 sites were used in the analysis, of which 1,548 were male, 1,155 were female, and 11 were of unknown sex. The age of the fish ranged from 3 to 6 years, with 377 fish estimated to be age 3, 1,748 fish age 4, 507 fish age 5, 42 fish age 6, and 40 fish of undetermined age. There was 85% agreement between the two otolith age readers. The gonad stages ranged from 3 to 7 on the development scale, with 4 fish at stage 3 (developing), 135 fish at stage 4 (developed), 2,070 fish at stage 5 (gravid), 464 fish at stage 6 (ripe and running), 36 fish at stage 7 (spent), and 5 fish of unknown stage.
The PCA on the log_e-transformed, size-adjusted morphometric data showed that PC1 accounted for 90% of the variance in the data and was mostly correlated with two groups of measurements: BL2, OPH, CL2, AFL, and BD4 versus BD2, BD3, and OPEC2. Principal component 2 accounted for 8% of the variance and was mostly correlated with two groups of measurements: AFL and BH2 versus CL1 (Table 3). The PCA showed a very strong separation by sampling site. Specifically, fish from the Nemasket River, Massachusetts, were separated from all other sampling sites in Maine (Figure 3). When fish from the Nemasket River were excluded from PCA exploration (i.e., only using sites from Maine), we did not find any patterns by sex, age, gonad stage, two-way geographic divide, or three-way geographic divide (Figure 4).

Discriminant Function Analysis
A discriminant function analysis was run between sites located in Maine and the site located in Massachusetts using a randomly selected subset of fish (75%) from each state. Classification from the resultant model correctly predicted the state of origin for all of the remaining 25% of the fish not used to develop the model. This was significantly better than random chance for both proportional chance (P < 0.001) and maximum chance criteria (P < 0.001). We then developed two more classification models to determine the extent to which fish from the Nemasket River separated from different subsets of Maine samples, using 75% of the samples from each group to develop each model and the remaining 25% to validate each model. The first model attempted to discriminate fish from the Nemasket River, eastern Maine, and western Maine. Sites located east of Penobscot Bay were considered eastern Maine and sites located west of Penobscot Bay were considered western Maine. The model correctly classified 58% of the samples to their region, which was significantly better than random chance for the proportional chance criterion (P < 0.001) but not for the maximum chance criterion (P = 0.759; Table 4). However, none of the samples from the Nemasket River were misclassified and none of the samples from the two Maine groups were misclassified as Nemasket River. The second classification model attempted to discriminate among Nemasket River and four major watersheds in Maine (Kennebec, East Machias, Penobscot, St. George). The model correctly allocated 47% of the samples to their origin and was significantly better than random chance for both proportional chance (P < 0.001) and maximum chance criteria (P = 0.003; Table 5). Again, none of the samples from the Nemasket River were misclassified and none of the samples from the four Maine watersheds were misclassified as being from Nemasket River.   
Because of the pronounced difference between Alewives from Maine sites and those from the single Massachusetts site, we conducted seven discriminant function analyses on Alewives from Maine only. The models were developed using 75% of randomly selected samples from each group, and the remaining 25% were used to validate the models. Five of the analyses (by sex, age, gonad stage, two-way geographic divide, and three-way geographic divide) yielded models that were no better than random chance. However, the sixth model, based on spawning sites, correctly classified 15% of the samples to their site. Of the 24 sites used in the model, seven sites (Benton Falls, Little River, North Pond, Orland River, Sennebec Pond, Somes Pond, Webber Pond) had ≥20% accuracy while the rest had accuracies of <20%. Results from the sixth model were significantly better than both the proportional chance criterion (P < 0.001) and the maximum chance criterion (P < 0.001). The seventh model examined whether Alewives from the extreme east or extreme west of Maine could be distinguished from one another. Eight sites were used for this model: four from the extreme eastern parts of Maine (Gardner Lake, Hadley Lake, Little River, and Meddybemps) and four from the extreme western parts of Maine (Androscoggin River, Nequasset Lake, Presumpscot River, and Saco River). This model correctly classified 63.9% of the samples to their region and was significantly better than both the proportional chance criterion (P < 0.001) and the maximum chance criterion (P = 0.028; Table 6).

DISCUSSION
The goal of this study was to develop classification models using morphometrics to determine the stock structure of Alewives from the Gulf of Maine region. For the classification models to be useful to managers, they needed to be based on measurements that were stable through the freezing and thawing process because samples from marine environments are typically frozen for later processing. We identified 10 measurements that were robust to freezing. Based on these 10 measurements, our results suggest that there is a strong and distinguishable difference between Alewives from Maine and Alewives from the single site we sampled in Massachusetts. There is a strong geographic divide between these sites; all the sites from Maine drain into the Gulf of Maine while the Nemasket River drains into Narragansett Bay and then Rhode Island Sound. If Alewives from these two regions remain separated during the marine phase of their life cycle, they probably experience different environmental conditions. For example, in 2011, the monthly average water temperature at 3 m depth in Penobscot Bay was 2.7 °C lower than the monthly average water temperature at 3 m depth in Narragansett Bay (http://www.gomoos.org/gnd/). The growth rate of Alewives can vary depending on certain factors such as prey availability and temperature (Henderson and Brown 1985). These differences could cause morphometric characteristics to vary between the two groups, allowing a morphometric classification model to be robust.
Using the 10 metrics, we developed two models that showed distinguishable differences among Alewife spawning groups from within Maine. The first model, which was based on sampling sites, was significantly better than random chance, but the accuracy was only 15% and thus not likely useful for classifying Alewives of unknown origins. The second model, which was based on sites in the extreme east and extreme west of Maine, was 63.9% accurate, suggesting that Alewives from neighboring lakes were more similar than Alewives from distant lakes. Although this model was not as accurate as the model differentiating between Maine and the one site in Massachusetts, it does suggest that spatial differentiation is detectable within Maine at scales of hundreds of kilometers, but more local processes may blur the ability to distinguish stocks at smaller spatial scales (if they exist). For example, restoration stocking may "smooth out" morphometric differences. As part of its restoration and management plan, the Maine Department of Marine Resources (DMR) has intercepted and transplanted Alewives on their spawning runs to lakes and ponds where they were depleted or extirpated (Rounsefell and Stringer 1945; Maine DMR, unpublished data). There is strong evidence that offspring of transplanted Alewives return to the ponds where their spawning parents were stocked (Rounsefell and Stringer 1945). If there is a genetic component to the phenotypic expression of morphometric characteristics, genetic exchanges between watersheds may reduce the likelihood of morphometric differentiation that could be used to define a stock (Begg and Waldman 1999; Jørgensen et al. 2008).
Another factor that could account for the low classification rates at smaller spatial scales is that not all Alewives return to their natal lake or pond to spawn. Even though research has suggested that Alewives do return to their natal lake (Thunberg 1971), the level of homing fidelity is not known. Messieh (1977) suggested that Alewives may stray away from their natal lakes, especially to adjacent areas during upstream spawning migrations. Similar to the stocking scenario described above, the implications are that phenotypic expression of genetic differences would be reduced, which would thus reduce the likelihood of morphometric differentiation. This possible factor is supported by the distinguishable difference between the extreme eastern and extreme western parts of Maine. Straying to adjacent areas during spawning migrations may blur local morphometric differentiation, but our results suggest that pooled local spawning sites can be distinguished to a certain extent from other pooled local spawning sites that have a large enough geographic divide between them.
Variation in morphometric measurements due to natal origins could be negated by the environmental effects of being at sea. After hatching, Alewives spend 3 to 7 months in freshwater (Richkus 1975) before migrating to the ocean at a TL of 30-80 mm (Iafrate and Oliveira 2008), and thus spend the majority of their life in salt water. They remain at sea until they become sexually mature at 3 or 4 years of age (typically, TL > 250 mm; Walton 1979; Fay et al. 1983) and return to freshwater to spawn (Loesch and Lund 1977; O'Neill 1980). If Alewives from Maine sites experience similar ocean conditions in the Gulf of Maine, differences in growth from variation in environmental factors would be small and development of morphometric differences negligible. Once out in the open ocean, morphometric variation caused by natal origins, specific watersheds, or other levels could be smoothed out due to trait homogenization.
Based on the 10 measurements that were not altered by freezing, we could discriminate between Alewives from Maine and a single site in Massachusetts, as well as between spawning groups of Alewives from the extreme western and the extreme eastern parts of Maine. Our results suggest that the 10 measurements are useful in determining the origins of Alewives at regional scales of hundreds of kilometers. Thus, it appears that morphometric analysis may provide an easily accessible, comparatively fast, and inexpensive method to test for stock identification across regions. Our findings provide a starting point for a morphometric evaluation across major biogeographic regions or from potentially mixed sources (e.g., marine bycatch). More samples will be required from other Massachusetts streams, as well as spawning runs from more southerly and northerly locations, to fully implement a regional differentiation model. Although we did not use all 27 measurements from freshly caught fish (i.e., nonfrozen fish), future analyses of these data may provide better discrimination among spawning groups at a finer scale (across and within watersheds), which may provide more ecological insights. Also, additional stock-structure techniques such as meristics or genetics, in combination with morphometrics, may provide a more powerful tool to fully evaluate and discriminate stock structure at scales that are below the detection limit of the 10 morphometric variables used here.