Introduction

Narwhals (Monodon monoceros) occur year-round north of 60°N in the eastern Canadian Arctic and around West and East Greenland, Svalbard, and Franz Josef Land (Laidre et al. 2008). In summer, narwhals spend approximately two months in ice-free bays and fjords of the High Arctic and northern Hudson Bay (including Repulse Bay), and they overwinter offshore in deep, ice-covered habitats along the continental slope (Heide-Jørgensen and Dietz 1995). Several populations undertake extensive annual migrations (Innes et al. 2002; Heide-Jørgensen et al. 2002, 2003; Dietz et al. 2008). Narwhals from Canada and West Greenland have strong site fidelity to the winter pack ice of Davis Strait and Baffin Bay, occurring mainly in areas along the continental slope with high gradients in bottom temperature, predictable but sparse amounts of open water (<5%), and relatively high densities of Greenland halibut (Reinhardtius hippoglossoides) (Laidre et al. 2004). It is not known where narwhals from some summering areas (e.g., northern Hudson Bay, East Greenland, and Svalbard) spend the winter. Sightings in the Greenland Sea suggest this is an important wintering area for narwhals from East Greenland and/or Svalbard (Gjertz 1991; Dietz et al. 1994), and some narwhals, presumably from Repulse Bay and northern Hudson Bay, winter in eastern Hudson Strait (Richard 1991).

Narwhals winter in northern and central Baffin Bay and off central West Greenland, and they occur in large numbers off Uummannaq in November. The summer range of these narwhals includes coastal waters of the Canadian Arctic Archipelago and northwestern Greenland. Main summering areas in Canada are Peel Sound, Prince Regent Inlet, Admiralty Inlet, and the Eclipse Sound area (Richard et al. 2010). The main summering areas of West Greenland are Melville Bay and Inglefield Bredning and those of East Greenland are Scoresby Sound and the Ammassalik area (Heide-Jørgensen et al. 2010). Narwhals that summer in northern Hudson Bay and Repulse Bay are assumed not to be part of the Baffin Bay population (Richard 1991).

Dizon et al. (1992) suggested that stocks should be delineated on the basis of information on distribution and movements as well as population responses, genotypic differences, and phenotypic differences. The advantage of using life history and morphological traits in studies of population structure is that these are often related to fitness and respond to selection and thus may reveal differentiation not evident in neutral genetic markers (Swain et al. 2005).

Satellite tracking of narwhals has been used to help define stocks and to examine the meta-population structure of several small subpopulations that showed high fidelity to certain summering grounds (Heide-Jørgensen et al. 2003). Narwhals tagged and tracked from three summer aggregations in West Greenland and the eastern Canadian High Arctic generally remained within those isolated aggregations through the summer and then moved to specific wintering areas. Narwhals tagged in Melville Bay and Eclipse Sound moved to a common wintering ground in the middle of northern Davis Strait, while narwhals tagged near Somerset Island moved to a distinct area in southern Baffin Bay (Dietz et al. 2008). The narwhals made only local movements on their wintering grounds (Laidre et al. 2004), and tracking of two individuals over 14 months showed that they returned to the summering areas where they had been tagged (Heide-Jørgensen et al. 2003). Narwhals from Eclipse Sound visited several East Baffin fjords during their fall migration, within the range of hunters from East Baffin communities. Some exchange of narwhals between Canada and West Greenland seems to occur. For example, a male from Admiralty Inlet visited Disko Bay in 2006 (NAMMCO 2009), and a female from Uummannaq visited the Somerset Island area in 2009 (Heide-Jørgensen et al. unpublished).

Dietz et al. (2004) studied the patterns of heavy metals, organochlorines, and stable isotopes in narwhal tissue samples from West Greenland and found some evidence of differences between localities but also evidence of inter-annual differences at the same locality. Those results could not be used to differentiate stocks.

A genetic study of mitochondrial DNA showed that narwhals have very low genetic diversity, which makes stock differentiation from genetics alone problematic, although stocks in East and West Greenland can be differentiated (Palsbøll et al. 1997). High Arctic narwhals appeared genetically distinct from narwhals in Repulse Bay, and narwhals from Grise Fiord were weakly differentiated from animals elsewhere in Canada (de March et al. 2003).

Hunters have reported that they observe differences in the physical appearance of narwhals, possibly indicative of stocks (Remnant and Thomas 1992; Thomsen 1993; NAMMCO 2005). Phenotypic differences between groups in the wild may reflect genetic differentiation, environmental differences, or a combination of the two (Thompson 1991). Geographic variation in morphology has been used in many studies to define populations, stocks, or subspecies of cetaceans (e.g., Perrin 1975; Christensen et al. 1990; Amano and Miyazaki 1992; Mikkelsen and Lund 1994; Perrin et al. 1994; Westgate 2007). The purpose of this study was to examine geographic variation in cranial morphology of narwhals from Baffin Bay and East Greenland and to determine whether there was sufficient variation to inform discussions of stock structure.

Materials and methods

Skulls

We used 152 narwhal skulls collected from Eclipse Sound in Canada, Inglefield Bredning, Uummannaq and Disko Bay in West Greenland, and Scoresby Sound in East Greenland (Fig. 1). The skulls were collected between the 1800s and 2006, mostly from the Inuit subsistence hunt (ten skulls from Eclipse Sound were from scientific collections; Mansfield et al. 1975). Ninety-four (62%) of the skulls were collected within the last 20 years. Several of the skulls had unknown collection dates.

Fig. 1 Localities related to narwhal (Monodon monoceros) distribution in Greenland and the eastern Canadian Arctic

The occurrence of a single spiraling tusk that emerges from the left side of the upper jaw is a characteristic feature of the narwhal (Reeves and Tracey 1980). This tusk is a male sexual trait, although females sometimes have it. Individuals are occasionally ‘double-tusked’, with tusks growing from both sides of the jaw. Even more rarely, the tusk grows only on the right side.

We initially used the presence/absence of a tusk to determine the sexes of the skulls. Any skull with the left part of the rostrum destroyed and recorded as a male was assumed to be a male. The sexes of 23 of the skulls from Scoresby Sound were also confirmed by DNA analyses (Palsbøll et al. 1997). The sexes of 42 skulls from all areas in West Greenland and Canada were further confirmed from sex organs. Overall, the sample consisted of 79 males and 73 females (Table 1). One male without tusks, identified from two testes, was caught in Uummannaq in 1993.

Table 1 Number of skulls of narwhals (Monodon monoceros) from localities in eastern Canada and Greenland used in the analyses

The ages of the skulls were not known. Several skulls were partly destroyed by bullet wounds during the harvest or when the tusk had been removed from the maxillary bone. Therefore, not all variables used in the analyses could be scored on every skull. The skulls are located at the Zoological Museum, Copenhagen; Natural History Museum, University of Oslo; and Museum of Nature, Ottawa.

Non-metric variables

A set of eight non-metric variables (NMV) was scored on each skull. Five of these variables had been used by other investigators (Perrin et al. 1994; Mikkelsen and Lund 1994; Gao and Gaskin 1996), but three (NMV 1, 2, and 3) were defined specifically for this study. Bilateral variables were scored on the left side. If the left side was missing, the variable was scored on the right side where possible. The eight non-metric variables are defined in Table 2 (Fig. 2).

Table 2 Non-metric variables scored in skulls of narwhals (Monodon monoceros) from localities in eastern Canada and Greenland
Fig. 2 Rear (a), dorsal (b), and lateral (c) views of the narwhal skull with metric variables indicated

Metric variables

Thirty-two metric variables (MV) were measured to the nearest 0.1 mm on all available skulls according to Perrin (1975) and Mikkelsen and Lund (1994). Bilateral variables were measured on both sides of the skull. Because many of the skulls were damaged, it was not possible to measure all variables on all skulls. In the analyses, we used the left side for all comparisons; however, in cases where the left side was missing, we used measures from the right side. We made provisional analyses with eight, 12, and 15 metric variables but selected eight variables for the final analysis in order to keep the number of variables lower than the sample size in each cell (Tabachnick and Fidell 2007), to maximize the number of skulls with all variables present, and to minimize the number of estimated values (see below) for each skull. The eight metric variables are defined in Table 3 (Fig. 3).

Table 3 Metric variables scored in skulls of narwhals (Monodon monoceros) from localities in eastern Canada and Greenland
Fig. 3 Rear (a), dorsal (b), and lateral (c) views of the narwhal skull and mandible with position of non-metric variables indicated

Statistical methods

Most non-metric variables are independent of the size of the skull and the sex of the animal. However, we tested this assumption given that the variables were being applied to a new species (Wiig and Andersen 1988). The relationship between each non-metric variable and the size of the narwhal skull was tested with Kendall’s rank correlation, separately for each sex. We selected condylobasal length (MV1) to represent the size of the skull. The variables were coded as absent or present and tested for sex dependence. For variables that were independent of sex, the sexes were pooled for the analysis of geographic variation.
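
As a minimal sketch of the size-dependence check, assuming each skull has a presence/absence score (1 or 0) and a condylobasal length, Kendall's tau-b can be computed within each sex as shown here. The function name and the example values are hypothetical and are not data from this study, and the original analysis was run in SPSS rather than with this code.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b for paired samples, corrected for ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: enters neither term
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom

# Hypothetical example: (condylobasal lengths in mm, presence/absence scores).
skulls_by_sex = {
    "male": ([520, 540, 555, 570, 590, 610], [1, 1, 1, 0, 0, 0]),
    "female": ([480, 500, 515, 530, 545, 560], [1, 0, 1, 0, 1, 0]),
}
tau_by_sex = {
    sex: kendall_tau_b(lengths, scores)
    for sex, (lengths, scores) in skulls_by_sex.items()
}
```

A negative tau indicates that the trait occurs less often in larger skulls. The sketch returns only the coefficient; a P value would still be needed before comparing against the corrected significance level.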

Sexes were kept separate in analyses of metric variables because male narwhals are generally larger than females (Garde et al. 2007). As a result, sample sizes from only two areas (Inglefield Bredning and Scoresby Sound) were large enough to be included in the analyses. Based on NAMMCO (2009), we pooled the samples from Uummannaq and Disko Bay so that sample sizes were large enough to compare three regions of Greenland in the analyses of metric variables. However, it must be noted that the relationships between narwhals caught in winter in Disko Bay and the summer aggregations in Canada and Greenland are unresolved. In order to maximize sample sizes of complete skulls, we estimated missing values for specimens with one to three missing values by the Expectation Maximization (EM) technique recommended by Tabachnick and Fidell (2007) for data missing at random. There was no way to age individual skulls, so removal of juveniles (i.e., individuals not yet fully grown) from the analysis was not possible. Rostral fusion has been suggested as a criterion for cranial maturity in dolphins, but this criterion can give misleading results (Perrin and Heyning 1993). We used the distribution of condylobasal length and excluded small individuals that were more than 1.5 quartiles away from the 25th percentile, according to the definition of extreme values in SPSS (Pallant 2001). Condylobasal length has been used in other studies to help assess physical maturity (e.g., Westgate 2007).
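
The exclusion of very small skulls can be sketched as follows, under the assumption that the rule corresponds to the conventional boxplot lower fence: the 25th percentile minus 1.5 times the interquartile range. That reading, the function names, and the example lengths are assumptions made for illustration. Quartile estimates also depend on the method used (SPSS boxplots rely on Tukey's hinges), so the fence may differ slightly from the one used in the study.

```python
from statistics import quantiles

def lower_fence(lengths, k=1.5):
    """Lower boxplot fence: Q1 - k * (Q3 - Q1)."""
    q1, _, q3 = quantiles(lengths, n=4, method="inclusive")
    return q1 - k * (q3 - q1)

def drop_small_skulls(lengths, k=1.5):
    """Keep skulls whose condylobasal length is at or above the lower fence."""
    fence = lower_fence(lengths, k)
    return [value for value in lengths if value >= fence]

# Hypothetical condylobasal lengths in mm (not data from the study).
example_lengths = [400, 560, 570, 580, 590, 600, 610, 620, 630]
fence = lower_fence(example_lengths)
retained = drop_small_skulls(example_lengths)
```

In this hypothetical sample the quartiles are 570 and 610 mm, so the fence is 510 mm and only the 400 mm skull is dropped. Large skulls are deliberately left in, because the aim is to remove animals that are not yet fully grown.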

We pooled the sexes and used a two-way multivariate analysis of variance (MANOVA) to compare sexes and areas based on log-transformed data. In order to reduce the effect of possible age-related size variation, we first applied a multivariate analysis of covariance (MANCOVA) with condylobasal length as a covariate. Condylobasal length was selected because it is a variable that expresses the general size of the skull and because it was also used to assess the maturity of the skulls. Second, we used a multiple-group principal component analysis based on log-transformed data to discriminate between size and size-free axes (Wiig 1989). We then used a MANOVA to test differences between sexes and areas in a size-free space.
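
The idea of separating a size axis from size-free variation can be illustrated with a small sketch. It assumes that the multiple-group principal component analysis is based on the pooled within-group covariance matrix of the log-transformed measurements and that the first component is taken as the size axis. Power iteration is used here only to keep the example dependency-free; it is not the procedure used in SPSS, and all names and values are hypothetical.

```python
from math import log, sqrt

def pooled_within_cov(groups):
    """Pooled within-group covariance of log-transformed measurements.

    groups: list of groups, each a list of rows (one row of measurements per skull).
    """
    p = len(groups[0][0])
    sums = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups:
        logs = [[log(v) for v in row] for row in rows]
        means = [sum(col) / len(logs) for col in zip(*logs)]
        for row in logs:
            dev = [a - m for a, m in zip(row, means)]
            for i in range(p):
                for j in range(p):
                    sums[i][j] += dev[i] * dev[j]
        dof += len(logs) - 1
    return [[s / dof for s in row] for row in sums]

def first_component(cov, iterations=500):
    """Leading eigenvector by power iteration, scaled to unit length."""
    p = len(cov)
    v = [1.0 / sqrt(p)] * p  # assumes this start is not orthogonal to the axis
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v if sum(v) >= 0 else [-x for x in v]

def remove_size(row, size_axis):
    """Log-transform one skull and project out its score on the size axis."""
    logs = [log(v) for v in row]
    score = sum(a * b for a, b in zip(logs, size_axis))
    return [a - score * b for a, b in zip(logs, size_axis)]

# Hypothetical skulls that differ only by isometric scaling within each group.
groups = [
    [[10, 20, 30], [20, 40, 60], [15, 30, 45]],
    [[12, 24, 36], [24, 48, 72]],
]
size_axis = first_component(pooled_within_cov(groups))
small = remove_size([10, 20, 30], size_axis)
large = remove_size([20, 40, 60], size_axis)
```

Because the hypothetical skulls differ only in overall scale, the size axis has equal positive loadings and the two skulls coincide once size is removed. With real measurements the remaining components would carry the shape variation that is then compared between groups.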

All statistical analyses were performed in SPSS (ver. 16.0) (SPSS Inc., Chicago, Illinois). The level of significance was generally set at P ≤ 0.05, except for tests of equality of the covariance matrix across groups by Box’s M, where the level of significance was set at P ≤ 0.001 (see Tabachnick and Fidell 2007). When several tests were performed on the same dataset of non-metric variables, we applied a Bonferroni correction to the level of significance (alpha), which was initially set at 0.05. We used seven or eight variables in these tests, so the level of significance was P ≤ 0.05/7 = 0.007 or P ≤ 0.05/8 = 0.006.
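
The Bonferroni adjustment described above amounts to dividing alpha by the number of tests. The following minimal sketch, with hypothetical function names, reproduces the two thresholds.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

def is_significant(p_value, alpha, n_tests):
    """True if a single test passes the Bonferroni-corrected threshold."""
    return p_value <= bonferroni_alpha(alpha, n_tests)

alpha_seven = bonferroni_alpha(0.05, 7)  # about 0.00714
alpha_eight = bonferroni_alpha(0.05, 8)  # 0.00625
```

The text reports these thresholds to three decimals, as 0.007 and 0.006. A test with P = 0.01 would therefore count as significant at the uncorrected level of 0.05 but not after correction for seven tests.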

Results

Non-metric variables

The occurrence of NMV 2 (Opening into the embedded tooth) was negatively related to the size of the skull (Males: R = −0.40, n = 51, P < 0.006; Females: R = −0.32, n = 54, P < 0.006), and the variable was not used further in the study. The other seven variables (NMV 1, 3–8) were not related to the size of the skull (P > 0.006) and were used to assess geographic variation. The distribution of the variables was not different between the sexes (χ² < 3.84, df 1, P > 0.007). Data from both sexes were accordingly pooled in further analyses.

We first tested whether there was any difference between Eclipse Sound in Canada and the four areas in West Greenland, but no variable had significantly different distribution between these areas (P > 0.007, Table 4). We then compared the four West Greenland samples and found no differences (P > 0.007, Table 5). Two of the variables (NMV 1 and NMV 3) had significantly different distributions between the combined West Greenland sample and the sample from East Greenland (Scoresby Sound) (Table 5).

Table 4 Test of difference in distribution of seven non-metric variables in skulls of narwhals (Monodon monoceros) from Eclipse Sound in eastern Canada and Inglefield Bredning, Uummannaq, and Disko Bay in West Greenland
Table 5 Test of difference in distribution of seven non-metric variables in skulls of narwhals (Monodon monoceros) from Inglefield Bredning, Uummannaq, and Disko Bay combined as a West Greenland sample, and Scoresby Sound in East Greenland

Metric variables

Data from 10 males and 10 females were removed from further analyses either because they had more than three missing values or because they were extremely small. The resulting dataset consisted of 57 males and 45 females. There was no time trend in size of skulls assessed by regressing condylobasal length against year of capture (Males: F = 0.21, df 1, 45, P > 0.05; Females: F = 0.30, df 1, 41, P > 0.05). Data were missing at random in both sexes (Males: χ² = 54.3, df 50, P > 0.05; Females: χ² = 67.7, df 54, P > 0.05). The proportion of estimated data was 6.8% in males and 5.8% in females. The distribution of the eight variables in the three areas (Inglefield Bredning, Uummannaq/Disko Bay, and Scoresby Sound) was assessed for normality for each sex by a Kolmogorov–Smirnov test. None of the 24 tests were significant in males or in females. Therefore, we used all variables in further analyses.

The covariance matrix was equal across groups for the pooled dataset (Box’s M = 302.36, df 180, 8,593.30, P > 0.001). The two-way MANOVA revealed significant differences between the sexes (Wilks’ λ = 0.590, df 8, 91, P < 0.001). There was no significant difference between areas (Wilks’ λ = 0.835, df 16, 182, P = 0.382). The power of the test for areas (the probability of correctly rejecting a hypothesis that is false) was 0.70 using a significance level of 0.05, and the effect size (partial eta squared, the proportion of variance in the dependent variables accounted for by the independent variable) was 0.086. However, by using a significance level of 0.10, the estimated power of the test was 0.81.
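
The reported effect size can be related to Wilks’ λ with a short sketch. It assumes the multivariate partial eta squared is computed as 1 − λ^(1/s), where s is the smaller of the number of dependent variables and the degrees of freedom of the effect. That relation is an assumption about how the values were obtained, and the function name is hypothetical.

```python
def partial_eta_squared_from_wilks(wilks_lambda, n_dependent, df_effect):
    """Multivariate partial eta squared: 1 - lambda ** (1 / s)."""
    if not 0.0 < wilks_lambda <= 1.0:
        raise ValueError("Wilks' lambda must lie in (0, 1]")
    s = min(n_dependent, df_effect)  # s = min(p, df of the effect)
    return 1.0 - wilks_lambda ** (1.0 / s)

# Area effect in the two-way MANOVA: eight variables, three areas (df = 2).
eta_area = partial_eta_squared_from_wilks(0.835, n_dependent=8, df_effect=2)
```

With λ = 0.835, eight variables, and two degrees of freedom for area, this gives about 0.086, which agrees with the effect size reported for the area comparison.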

In the MANCOVA with condylobasal length as a covariate, the covariance matrix was not different across groups (Box’s M = 220.26, df 140, 8,728.38, P > 0.001). There was no difference between the sexes in multivariate space (Wilks’ λ = 0.880, df 7, 91, P = 0.103). There were also no differences between areas (Wilks’ λ = 0.893, df 14, 182, P = 0.71). The power of these tests was 0.69 and 0.47, respectively, based on a significance level of 0.05, with effect sizes of 0.120 and 0.055, respectively. Using a significance level of 0.10, the power of the tests was 0.79 and 0.61, respectively, confirming differences between sexes as in the MANOVA, but power was still low for evaluating differences between areas.

The PCA on log-transformed data revealed positive loadings for all eight variables on the first principal component (PC1) (Table 6), indicating that this is a size component (Wiig 1989). The other components had loadings of variable size and sign and are shape components. PC1 accounted for 60.0% and PC2 for 13.7% of the total variation. We tested the difference between groups on the size component PC1. The error variance across groups was not different (Levene’s test: F = 1.550, df 1, 98, P = 0.181). PC1 showed a difference between sexes (F = 38.51, df 1, 98, P < 0.001); females had lower scores than males. There was no difference between areas (F = 1.507, df 2, 98, P = 0.206). The power of this test was, however, low (0.332), and the effect size was 0.032.

Table 6 Principal component loadings for the first (PC1) and the second (PC2) principal component of log-transformed variables measured on the skulls of male and female narwhals (Monodon monoceros) from Greenland

The MANOVA on the seven size-free components showed that the covariance matrix was not different across groups (Box’s M = 225.42, df 140, 8,728.38, P > 0.001). There was no difference between the sexes in multivariate space (Wilks’ λ = 0.866, df 7, 92, P = 0.060). There were also no differences between areas (Wilks’ λ = 0.873, df 14, 184, P = 0.533). The power of these tests was 0.76 and 0.57, respectively, based on a significance level of 0.05. The effect sizes were 0.134 and 0.066, respectively. Using a significance level of 0.10, the power of the tests increased to 0.849 and 0.700, respectively. Thus, we detected marginal shape differences between sexes but still had too little power to test for differences between areas. This is very similar to the results of the MANCOVA. The distribution of the skulls on PC1 and PC2 is shown in Fig. 4 and indicates a complete overlap of scores between areas in both sexes.

Fig. 4 Scores of males and females on PC1 and PC2 from a principal component analysis of eight metric variables measured on the skulls of narwhals (Monodon monoceros) from Inglefield Bredning (filled circle), Uummannaq/Disko Bay (open circle), and Scoresby Sound (inverted filled triangle) in Greenland

Discussion

In general, our results showed no differences in morphology between the skull samples from Eclipse Sound in Canada and those from West Greenland, while they showed some differences between West and East Greenland.

The three different analyses of metric variables did not detect morphological differences between narwhals from Inglefield Bredning, Uummannaq/Disko Bay, and Scoresby Sound. The power of the tests was, however, low relative to the value of 0.80 recommended by Cohen (1988). Power depends on the effect size, the alpha level set by the researcher, and the sample size. The effect size in our analyses of relationships between areas was generally small (0.034–0.087), indicating that area accounted for very little of the variance in the metric variables. By applying an alpha level of 0.10, as suggested by Stevens (1996) for tests with low power, the power increased but the differences between areas were still not significant. This indicates that the present data do not show any detectable difference in metric variation between areas. This is also apparent from the complete overlap of scores of each sex on PC1 and PC2 in Fig. 4.

No differences in distribution of non-metric variables were detected between the sample from Eclipse Sound and any of the samples from West Greenland. According to NAMMCO (2009), narwhals from Eclipse Sound and Melville Bay overwinter in the same area, meaning some genetic exchange may occur between animals from these two summer aggregations. Narwhals from Admiralty Inlet might spend time in Eclipse Sound as well as in Disko Bay (and thus contribute to the samples from hunts in those two areas). Also, narwhals from the summer aggregation in Melville Bay may occur in Disko Bay in the winter and in Uummannaq in the spring (NAMMCO 2009). These relationships could contribute to the lack of differentiation in distribution of non-metric variables between the Eclipse Sound sample and the combined West Greenland sample as well as the lack of differences among the four samples from West Greenland. Differences among the West Greenland samples were also not found from the metric variables. Two of the seven non-metric variables showed differences between the combined West Greenland sample and the East Greenland sample from Scoresby Sound, whereas such differences could not be detected in the metric variables.

Perrin et al. (1994) compared the resolving power of metric and non-metric cranial variables in a study of geographic variation in common dolphins (Delphinus spp.). They concluded that while metric characters had higher discriminating power than non-metric characters, both were useful and efficient. They recommended that non-metric characters be used only in combination with metric characters. Interestingly, in our study, it was the non-metric variables that differed between samples. A major shortcoming in our study, however, was the small sample sizes from all areas, which resulted in low statistical power. Sample sizes in studies where skull morphology has been used successfully to distinguish populations have generally been larger, and these studies have used a higher number of variables. Westgate (2007) pointed out that the significance reported in his study was likely the result of large sample sizes that increased the power of the tests. Larger samples of complete (intact) skulls would allow us to use more variables and thereby incorporate a greater proportion of the total morphological variation in the skulls and increase the power of the tests.

The findings from our study using morphology are generally consistent with the results from genetic studies (Palsbøll et al. 1997) and pollution distribution studies (Dietz et al. 2004). Population structure within the Baffin Bay region is difficult to resolve, whereas it seems clear that narwhals from west of Greenland (Baffin Bay region) and east of Greenland are geographically and demographically separate.

Because of the complex interdependence between function and anatomy of the cranium, cranial changes are likely to occur only if differential selective pressures and reproductive isolation are strongly in place (Westgate 2007). However, phenotypic differences between groups in the wild may reflect genetic differentiation, environmental differences, or a combination of the two (Thompson 1991). The studies of Palsbøll et al. (1997) and Dietz et al. (2004) suggest that there are genetic as well as environmental differences between narwhals living west and east of Greenland, which could explain the difference we detected in skull morphology. Movement of narwhals between West and East Greenland has never been documented, and narwhals in these areas are managed as separate stocks (NAMMCO 2009).