Chemical abundances and ages of the bulge stars in APOGEE high-velocity peaks

A cold high-velocity (HV, $\sim$ 200 km/s) peak was first reported in several Galactic bulge fields based on the APOGEE commissioning observations. Both the existence and the nature of the high-velocity peak are still under debate. Here we revisit this feature with the latest APOGEE DR13 data. We find that most of the low latitude bulge fields display a skewed Gaussian distribution with a HV shoulder. However, only 3 out of 53 fields show distinct high-velocity peaks around 200 km/s. The velocity distribution can be well described by Gauss-Hermite polynomials, except the three fields showing clear HV peaks. We find that the correlation between the skewness parameter ($h_{3}$) and the mean velocity ($\bar{v}$), instead of a distinctive HV peak, is a strong indicator of the bar. It was recently suggested that the HV peak is composed of preferentially young stars. We choose three fields showing clear HV peaks to test this hypothesis using the metallicity, [$\alpha$/M] and [C/N] as age proxies. We find that both young and old stars show HV features. The similarity between the chemical abundances of stars in the HV peaks and the main component indicates that they are not systematically different in terms of chemical abundance or age. In contrast, there are clear differences in chemical space between stars in the Sagittarius dwarf and the bulge stars. The strong HV peaks off-plane are still to be explained properly, and could be different in nature.


INTRODUCTION
Bulges, disks, and halos are the main components in most spiral galaxies. Bulges are very common; they can be found in more than 80% of Milky Way (MW) size galaxies (Fisher & Drory 2011). There are two main types of bulges: pseudo-bulges (disk-like, rotation dominated) and classical bulges (mini-elliptical, random motion dominated). The near infrared images from the COBE satellite reveal clearly that the MW contains an asymmetric parallelogram-shaped boxy bulge in the center (Weiland et al. 1994). The stellar mass of the MW bulge in the volume of the red clump giant (RCG) measurement (±2.2 × ±1.4 × ±1.2 kpc) is ∼ 1.25 − 1.6 × 10 10 M ⊙ and the bulge composes ∼ 20% of the total Galactic luminosity (Portail et al. 2015). Using red clump giants from the OGLE survey in the bulge region (−10 < l < 10, 2 < |b| < 7), Cao et al. (2013) constructed a Galactic bulge model with total bar mass of ∼ 1.8 ×10 10 M ⊙ . Increasingly more evidence supports that the bulk of our MW bulge is probably a boxy/peanut-shaped bar (see, e.g., the recent review by Shen & Li 2016).
A cold (σ V ∼ 30 km s −1 ) high-velocity (HV; V GSR ≈ +220 km s −1 ) peak in the bulge radial velocity distribution was first reported by Nidever et al. (2012) with the Apache Point Observatory Galaxy Evolution Experiment (APOGEE; Majewski et al. 2017) commissioning data (first released in DR10). Based on the Giraffe Inner Bulge Survey (GIBS), Zoccali et al. (2014) and Zoccali et al. (2017) found no significant HV peaks in the bulge. However, the APOGEE HV detection was predominantly in the lowest-latitude fields and GIBS did not observe in-plane fields (b = 0 • ). The possible HV peaks have raised intense discussions about its nature, which is still unclear. One potential explanation is that these stars are part of the tidal tail of Sagittarius (Sgr), which lies near the bulge fields, but the distances of the HV stars are roughly from 5 to 10 kpc, which makes this explanation unlikely (Nidever et al. 2012). The HV peaks are not likely to be a new substructure in the halo. Nidever et al. (2012) suggested that the HV stars are most likely bulge stars on bar orbits. Analysis of the bulge orbits showed that the HV peak can be due to the motion of stars in the barsupporting 2:1 resonant family and in other higher-order resonances (Aumer & Schönrich 2015; Molloy et al. 2015). However, stars in the bar may not be on exactly resonant orbits. Using the model in Shen et al. (2010), Li et al. (2014) suggested that the full velocity distribution of the stars making up the bar potential can only produce an HV shoulder instead of a distinct HV peak. Li et al. (2014) speculated that the observed cold HV peak might be an artificial small number statistic. Debattista et al. (2015) suggested that the HV peak may be explained by a kiloparsec-scale nuclear stellar disk in the Galactic bulge. They predicted second peaks in the lineof-sight velocity distribution (LOSVD) at Galactic longitude l = 8 • ± 2 • , but not off the midplane.
Aumer & Schönrich (2015) who argued that the selection function of APOGEE is more sensitive to young stars, found that the velocity distribution of young populations represents such an HV peak in their model. The HV peak could be due to young stars formed from gas just outside the growing bar and subsequently captured by it.
Stellar ages are important in understanding the HV peak, but they are hard to measure directly. Astroseismology provides a good approximation of age by measuring the time spent in the core hydrogen burning phase, which is a function of stellar mass, but it is observationally expensive and currently available only for relatively small samples of stars, and it is not feasible to obtain for a large survey. However, stars with different ages can display some different chemical abundances. For example, the C and N elemental abundances in red giants are sensitive to stellar characteristics, including mass and age (Masseron & Gilmore 2015;Martig et al. 2016). APOGEE provides chemical information (including C and N) for stars with an internal precision of 0.05-0.09 dex, and with an external accuracy of 0.1-0.2 dex . Therefore, the elemental abundances derived from APOGEE giant stars can be used to infer ages. Chemical abundances and the inferred ages may provide important clues to the nature of the HV peaks/shoulders.
Our motivations of this work are to (1) revisit the HV peak in the latest APOGEE data release DR13, (2) see whether or not the stars in the HV peak have a different age composition compared to the main component, and (3) determine the age composition of the APOGEE bulge stars and whether the HV peak is more pronounced in young stars.
The paper is organized as follows. Section 2 introduces the selection method of our main sample. In Section 3 we check the velocity distribution of every field for the possible existence of the HV peak (∼200 km s −1 ), fit the velocity distribution with Gauss−Hermite polynomials and compare the results to Shen et al. (2010) model. To study chemical abundance differences between the HV peak stars and main sample, stars are divided into two components depending on their velocities. In Section 4 we analyze the chemical abundance including [M/H], [α/M] and [C/N] of the two components to see whether the HV stars have distinct chemical features and study the velocity distributions of the young and the old populations, using [C/N] as an age proxy. In Section 5 we summarize our main results.

SAMPLE SELECTION
Because of operating in the near-IR (1.51-1.70µm), APOGEE provides a good window to understanding the heav-ily dust-obscured bulge. APOGEE collects high-resolution (R ∼ 22,500) spectra with a multiplexing (300 fibered objects) capability and provides a catalog with radial velocity, stellar parameters, and up to 15 elemental abundances for over 150,000 stars in DR12 (Gunn et al. 2006;Holtzman et al. 2015;Majewski et al. 2017).
DR13 does not have any additional targets compared to DR12, but the data reduction and analysis have been improved in several ways. In this work we consider the bulge region covering the Galactic longitude l from −5 • to 20 • and Galactic latitude b from −20 • to 20 • . The Sgr dwarf spheroidal (dSph) galaxy is also located in the region. In this special field, Sgr member candidates are targeted based on a selection of Two Micron All Sky Survey (2MASS) M giants (Majewski et al. 2003).
There are ∼20,000 stars in the selected region. However, ∼10,000 stars do not have accurate effective temperature, due to either the low signal-to-noise ratio S/N or the temperature limitations in the ASPCAP pipeline (García Pérez et al. 2016a). The STARFLAG indicator is used to avoid bad pixels, and we do not consider stars with very bright neighbors or with low S/N (S/N < 5). To avoid globular cluster, binary and variable stars, we only consider velocity dispersions (vscatter) less than 1 km s −1 over four visits (Nidever et al. 2015;Fernández-Trincado et al. 2016). To exclude foreground dwarfs, we apply a log g < 3.8 limit .
APOGEE provides accurate line-of-sight velocities. The galactocentric standard of rest velocity (V GSR , the line-of-sight velocity that would be observed by a stationary observer at the Sun's position) is derived as follows: (1) where V HC is the heliocentric velocity provided in DR13. We adopt the solar peculiar motion (U ⊙ , V ⊙ , W ⊙ ) as (11.1 +0.69 −0.75 , 12.24 +0.47 −0.47 , 7.25 +0.37 −0.36 ) km s −1 (Schönrich et al. 2010), and the circular speed of the local standard of rest as 239 ± 5 km s −1 (McMillan 2011). We tested the other values of the solar motion, such as those of Tian et al. (2015) and Huang et al. (2015), as well as the circular speed of 220 km s −1 , and found that these different values would not change our conclusions. 1 The T eff -V GSR distribution of stars located in bulge direction is shown in the left panel of Figure 1. The T eff distribution of low-velocity stars (−50 km s −1 < V GSR < 150 km s −1 ) spreads over a wider temperature range (from 3500 K to 5000 K), while the T eff of HV stars (V GSR > 150 km s −1 ) spreads over a narrower range (from 3600 K to 4300 K). Due to our cut in log g, most of the stars are giants, so the higher-T eff stars are located generally on the lower part of the red giant branch (RGB) and thus have relatively lower luminosity, which leads to a selection effect that stars with higher T eff are likely closer to the Sun than those with lower T eff . The smaller velocity dispersion for the stars with higher T eff shown in Figure 1 is another indication that a larger fraction of those stars may be foreground disk stars. The much larger velocity dispersion for the cool stars (T eff < 4000 K) also hints that the majority of this sample belongs to the Galactic bulge.
Distances are given in a value-added catalog of DR13 by Santiago et al. (2016). The middle and right panels in Figure 1 show the stellar distributions in distance−T eff and distance−V GSR , respectively. The right panel is roughly consistent with the predicted distance−velocity diagram shown in Li et al. (2014) based on the Shen et al. (2010) model. This confirms our assertion that the lower-dispersion/higher T eff giants are significantly more contaminated by foreground disk stars. At the temperature limit of 4000 K, the stellar distance ranges from about 4 to 12 kpc, with most of the foreground stars removed. Unfortunately, a significant fraction of the APOGEE stars do not have distances measured by Santiago et al. (2016), so in the end we do not use the distance cut in this study. Instead, we use the temperature cut of 4000 K to exclude the foreground disk contamination. Note that the exact value of the temperature cut does not affect our conclusions. The temperature cut at 4000 K corresponds to log g cut of 1.5 since log g is roughly correlated with T eff for the red giant branch. There are 2065 stars with T eff < 4000 K in Figure 1; 319 of these are disk stars with distance less than 4 kpc. Thus, in this subsample, the fraction of disk contamination is 15%. The observed and derived quantities, especially distances, have uncertainties, and thus the actual fraction of foreground (and also possibly background) disk stars in the cleaned sample might be higher than the estimated 15%. Figure 2 displays the velocity distributions for all −5 • < l < 20 • and −20 • < b < 20 • survey regions. The bin width of 20 km s −1 is the same as in Nidever et al. (2012). Clear HV peaks can be seen in only three bulge fields, i.e., at (l, b)=(6 • , 0 • ) and (10, ±2) • . For this discussion we only consider positive HV peaks centered at 150-220 km s −1 , where the distributions are clearly skewed Gaussians from visual inspection. Peaks at negative velocities such as the field (4 • , −2 • ) are not considered because no current theoretical models can explain such peaks. A simple statistical approach 2 is adopted to estimate the significance of the peaks compared to the Poisson noise. Most negative velocity peaks and some of the positive peaks might be due to the statistical fluctuations because they are within ∼ 2σ Poisson error. However, at (6 • , 0 • ), (10 • , +2 • ) and (10 • , −2 • ) the statistic values of the peaks are 4.90σ, 3.77σ, and 2.88σ, respectively. As a comparison, in Section 3.3, we also present velocity distributions of our sample without removing foreground disk stars, and the majority of fields show only a smooth skewed Gaussian distribution, and the peaks become statistically less significant.

Velocity distributions in the Galactic bulge
To compare directly with Nidever et al. (2012) in the radial velocity distributions for different bulge fields, we use the same bin size (20 km s −1 ) as in their paper. Different bin sizes can affect the shape of the distribution: larger bins reduce the sampling noise, while smaller bins give better resolution of the density estimation but with larger noise. The optimal choice of the bin size for the histogram determined with the Freedman & Diaconis' rule 3 is from 25 km s −1 to 70 2 We define a statistic s = |n − y|/ √ n, where n is the number of counts in a certain bin and y is the theoretical value from single Gaussian fitting. The value √ n is the Poisson error. Our null hypothesis is that the peak is significant. The confidence level could be calculated by erf( s √ 2 ), where erf(x) is the error function. 3 Freedman & Diaconis (1981) specifies an optimal method to minimize the difference between the histogram and the underlying distribution. The optimal bin size is h = 2IQR(X)n −1/3 , where n is the number of observations km s −1 (depending on the number of stars in each field), which is larger than the adopted bin size 20 km s −1 . With the optimal bin sizes, potential HV peaks all become weaker or even disappear. Therefore, statistical fluctuation due to small sample sizes might also be partially responsible for the HV peaks, as suggested by Li et al. (2014).
The underlying distribution of velocity is still unknown. A double Gaussian is the simplest model, as attempted in Nidever et al. (2012) and Zasowski et al. (2016). In this case, the positions of the peaks are very important parameters. However, if the peak position is allowed to vary during the fitting, it is quite difficult to constrain the two Gaussians simultaneously (one for the main component, the other for the HV peak), i.e., two different double Gaussian profiles might give very similar χ 2 . On the other hand, if the peak positions of the double Gaussian are constrained in a small range during the fitting, the final results would depend sensitively on the initial peak positions. Therefore this fitting strategy may not be optimal. There is another way to describe a skewed Gaussian. van der Marel & Franx (1993) proposed a decomposition of line profiles into orthogonal functions of the Gauss−Hermite series. The LOSVD could be written as (van der Marel & Franx 1993): (2) wherev is the mean velocity, σ is the velocity dispersion, In this function, γ,v, ω, h 3 , and h 4 are free parameters. We truncate the Gauss−Hermite series at n = 4 because the higherorder moment contribution is negligible (Binney & Tremaine 2008). The parameters h 3 and h 4 measure asymmetric and symmetric deviations from a Gaussian, respectively. For example, a distribution with a prominent HV tail will have a positive h 3 and a distribution with a sharp central peak will have a positive h 4 .
To avoid the uncertainties and degeneracies mentioned above, and to better describe the shape of the velocity distribution, the Gauss−Hermite series are used in this paper (as in Bureau & Athanassoula 2005; Iannuzzi & Athanassoula 2015; Du et al. 2016). Considering the complicated bar kinematics and our small sample, Gauss−Hermite moments that describe higher order deviations of a Gaussian may be a better choice than fitting double Gaussian profiles.
The Gauss−Hermite fitting results are shown in Figure 2 and the correspondingv, σ, h 3 and h 4 distributions are shown in Figure 3. Velocity distributions in 5 • < l < 15 • and −3 • < b < 3 • show larger h 3 than other fields. Including the three fields with clear HV peaks, the velocity distributions in this region show strong asymmetries. These will be studied in more detail in Section 4 specifically for their chemical properties.
Although the velocity distributions look noisy, the HV shoulders in most of the bulge fields can be well described by the Gauss−Hermite polynomials, except the three fields showing clear HV peaks. We will show that the spatial distribution of h 3 is consistent with that of the model in Shen et al. (2010) (Section 3.2) and a clearer pattern of h 3 is seen when we do not remove the foreground disk giants (Section 3.3).
For stars with high effective temperature (T eff > 4000 K), the velocity dispersion is small (consistent with the disk-like kinematics), while lower-temperature stars have a larger velocity dispersion similar to bulge-like kinematics. The black dashed line shows our temperature cut to separate possible foreground disk stars from bulge stars. (b) Relation between distance and T eff for the same stars. The vertical dashed line marks T eff =4000 K, and the horizontal dashed line corresponds to a distance of 4 kpc. A clear anticorrelation can be seen between distance and T eff . (c) The V GSR − distance distribution for the sample. The dashed line corresponds to 4 kpc. Debattista et al. (2015) suggested that the HV peaks of the LOSVD could be due to a kiloparsec-scale stellar disk that is composed of stars on x 2 orbits. They predict that the LOSVDs in the midplane (l = 8 • ± 2 • ) should exhibit HV peaks, whereas those off-plane (|b| ≥ 1 • ) will not. The clear peak at (6 • , 0 • ) may be explained by their model provided that the nuclear disk/ring is smaller. However, the peaks offplane at (10 • , ±2 • ) cannot be explained by their model and remain puzzling in the context of their model. APOGEE has fields covering the Sgr core and stream. In these fields candidate Sgr members are targeted based on a selection of 2MASS M giants (Majewski et al. 2013;Zasowski et al. 2013). Since the Sgr has experienced quite different evolution history compared to the MW, the comparison between the SGR field and Galactic bulge is useful to distinguish the chemical and kinematic differences. In Sgr fields, especially SGRC3, the HV peaks are much more prominent; thus, the fitting result is not as good as that in the bulge fields. This implies that the underlying velocity distribution is different from that of bulge stars. The HV peaks in Sgr fields are better described by adding another Gaussian. This is reasonable since the stars in the second peak are the core or tidal streams of the Sgr dSph. In this case, a double Gaussian fit may be more appropriate.

h 3 and skewness profiles in the Galactic bulge
Simulations suggest that h 3 provides a good description of the shape of LOSVDs in the bar region of edge-on galaxies.
In an edge-on disk, for a bar seen end-on 4 , h 3 is correlated withv over the entire projected bar length, and then becomes anticorrelated beyond the bar in the disk region, and becomes correlated again at even larger distances from the galaxy center (Bureau & Athanassoula 2005). This feature makes an h 3 -v correlation an indication of the LOSVD with the HV tail created by the bar-supporting orbit (Shen & Debattista 2009;Iannuzzi & Athanassoula 2015). Thus, h 3 can be a very good tracer of bars viewed edge-on, instead of using a second peak. The correlation between h 4 and the bar is not as strong as h 3 but we consider it for completeness. For the strong-bar case, the width of the central h 4 minimum is roughly the same as that of the flat part of the dispersion profile (Bureau & Athanassoula 2005). 4 End-on means that the bar is viewed in-plane from its major axis.
Skewness and kurtosis 5 are also used to described the bar kinematics (Zasowski et al. 2016). Fitting the full LOSVD requires all the moments of the distribution. Higher-order moments depend sensitively on the LOSVD shape, especially the contribution from outliers. The coefficients h 3 and h 4 of the Gauss−Hermite series can give a good description of the skewness and kurtosis, with additional restrictions on the shape of the underlying distribution. Considering that the covariances between the best-fit parameters are minimized, using Gauss−Hermite polynomials is optimal. In other words, there will be almost no correlations between the errors in different parameters (van der Marel & Franx 1993).
To explore how h 3 changes with l (orv), we compare the observational data to an N-body model of the MW bar. The model employed here comes from Shen et al. (2010, hereafter S10) which has successfully reproduced MW bulge kinematics from BRAVA observations. The simulation starts with an exponential disk represented by one million particles in a rigid dark matter halo potential. A bar forms spontaneously owing to disk instability and quickly develops a boxy/peanut-shaped bulge in the vertical direction. We select the snapshot at ∼5 Gyr to study the bulge kinematics here. The half length of the bar in S10 is 4 kpc, and the Sun-Galactic center line is 20 • offset from the major axis of the bar. Figure 4 shows the h 3 and the skewness profiles as a function of the longitude in our sample (−5 We use the bootstrap method to estimate the errors of the sample. The 1σ errors are shown as blue and red shaded regions in each panel. We found that the skewness measurement can show large scatter owing to the velocity outliers dominating the (v −v) 3 term in the skewness computation, while h 3 is much less affected in this case. Therefore, we exclude stars with V GSR deviating fromv by 3σ to avoid large fluctuations in the skewness distributions.
In S10, with all the particles, the bulge (inner) and the disk (outer) regions show opposite behaviors in both h 3 and skewness profiles, as shown in Figure 4(c). Note that the mean velocity profile (v) increases monotonically from negative to 5 The skewness is the third standardized moment, defined as g 1 = m 3 /m 3/2 2 , where m 3 = Σ(x −x) 3 /n and m 2 = Σ(x −x) 2 /n,x is the mean, and n is the sample size. The sample skewness we use is (n(n − 1)) 1/2 /(n − 2) × g 1 (Joanes & Gill 1998). Kurtosis is the fourth standardized moment, defined as g 2 = m 4 /m 2 .   In Figure 4(d), we exclude the particles with distance larger than 4 kpc away from the Galactic center to select only those in the bulge without foreground or background contaminations. The h 3 −v correlation in the bar region in Figure 4(d) becomes much more weakened compared to that in Figure  4(c). At positive longitudes, the foreground and background stars mainly contribute to low-velocity tails. With them removed, the velocity distribution becomes roughly symmetric to yield small h 3 and skewness values. Away from the bulge region, they become negative since disk stars dominate the LOSVD. The behaviors are reversed for negative longitudes. For our APOGEE sample, with all stars considered, the h 3 and the skewness show similar trends, but the h 3 peaks at l ∼ 17 • while the skewness peaks at l ∼ 10 • . After excluding the foreground disk stars as shown in Figure 4(b), both the h 3 and the skewness peak at l ∼ 10 • . The observational results with all the stars considered are roughly consistent with the S10 model prediction, although the exact values are different. Figure 4(b) shows larger h 3 and larger skewness than Figure  4(d) in the bulge region in the range l ∼ 3 • − 12 • . There are two possibilities to explain this difference. First, there might still be foreground disk stars left in our sample; the foreground stars mainly contribute to the low-velocity distribution in the LOSVD, resulting in an HV tail and positive h 3 value. Second, this may reflect that there are clear HV peaks in some fields that cannot be explained by the S10 model. The simulation may lack the HV stars to generate such a feature. Our results here are broadly consistent with the independent skewness analysis in Zasowski et al. (2016).

Results including foreground stars
In our study we first remove the foreground stars in the bulge region using a T eff cut. The results presented here should be less contaminated by disk stars than those in previous APOGEE studies (e.g., Nidever et al. 2012;Zasowski et al. 2016). However, as suggested by Zasowski et al. (2016), with the foreground stars included, the results better mimic integrated light "observations" of the MW. Therefore, we also present the bulge kinematics without foreground stars removed. Figure 5 shows the LOSVDs of the fields in the region defined by −5 • < l < 60 • and −20 • < b < 20 • . The color scheme is the same as in Figure 2. Most of the fields only show smooth HV shoulders. Note that there are almost no stars removed from the HV bins, which means that HV stars live in the bar region, although they need not be bar stars.
The average velocity, σ, h 3 and h 4 maps are shown in the left column of Figure 6. In the bulge region,v shows clear cylindrical rotation. As the longitude increases,v increases monotonically, while h 3 increases and reaches maximum value at l ∼ 15 • and then decreases in the disk-dominated region (l > 20 • ). Apparently,v and h 3 show positive correlation inside the bulge region and negative correlation in the disk part. This is consistent with the simulation predic- For comparison, we also derive the kinematic maps for the S10 model with all the particles considered, as shown in the right column of Figure 6. The general trend inv and h 3 is quite similar with our observational results. However, for the model in the bulge fields, h 4 is generally quite small. This test gives further confirmation that the MW bulge/bar model provides a good match to the data, except for σ and h 4 , which look different from the observation. In the bulge fields, the positive h 4 values in the observation indicate a significant cen-tral narrow peak, which may be contributed mainly by the foreground disk stars. The disk in the S10 model may not be cold enough to generate such a central narrow peak. We also calculated the velocity dispersion directly as the sample standard deviation and found good agreement between the data and model (see also Figure 3 in Zasowski et al. 2016). The lower fitted σ in the data may be due to the shape of LOSVDs which have higher h 4 values. This is also suggested by the relation between the standard deviationσ and the Gauss−Hermite best-fit parameter σ (σ ≈ σ(1 + √ The HV peaks in the three fields are significant at a confidence level of 0.996, which is unlikely as a result of Poisson noise. If the HV peak really exists, focusing on these three fields would be more physically meaningful. Comparing the chemical distributions between HV peaks and the main component in the three fields may reveal the mechanism behind the HV peaks. We also include the five Sgr fields as a control sample for this analysis. The velocity distributions of these fields are shown in the first column of Figure 7 with (6 • , 0 • ) and (10 • , ±2 • ) combined together in the top panels and the SGR fields in the bottom panels. Stars are divided into two components with a velocity cut at 180 km s −1 (in the three bulge fields) and 100 km s −1 (in the five Sgr fields). The chemical abundance distributions including metallicity, [α/M] and [C/N] of these fields are shown in the other three columns of Figure 7. For each histogram the kernel density estimation (KDE) is also shown, where the Silverman's rule 6 is used to determine the kernel size.
The metallicity distributions of the main component and the HV peaks in the three bulge fields are very similar in terms of the range, peak position, and shape of the distribution. In the (6 • , 0 • ) field, Babusiaux et al. (2014) found that the HV stars are ∼ 0.1 dex higher in [Fe/H] than the main component. However, the difference is smaller in our result, where the two components have almost the same mean metallicity (HV stars are ∼ 0.002 dex higher). In contrast, in the Sgr fields the metallicity distribution of the HV peak is quite different from the rest of the stars. The HV stars in the Sgr field are more metal-poor with smaller scatter.
In addition, the [α/M] distributions of the HV peaks are quite similar to that of the main components in the three bulge fields. Moreover, both the main components and HV peaks display bimodal distributions, which is a strong indication of 6 Silverman's rule (Silverman 1986) gives the optimal kernel that would minimize the mean integrated squared error if the data were Gaussian or a Gaussian kernel is used. The optimal bandwidth (kernel) h is: h = 1.06An −1/5 , A = min(s, IQR(X)/1.34), where n is the number of observations on X, s is the standard deviation of the sample, and IQR(X) is the interquartile range (the difference between the upper [top 75 %] and lower [bottom 25 %] quartiles).   multiple stellar populations. In the Sgr field, the HV stars are clearly less α-enhanced, suggesting different stellar popula-tions and star formation history between the stars in the HV peak (the core) and the main component (the foreground stars of MW) in Sgr fields. As shown in Figure 7, the [C/N] distributions of the HV peaks and the main component in the three bulge fields are also similar. On the other hand, in the Sgr field the HV peak and the main component show clear differences: the scatter of HV stars is larger and the mean value of the HV peak is smaller than that of the main component.
For a more quantitative comparison, we also apply the K−S test on the distributions, where the statistic p describes the significance of the difference between the two samples. If p > 0.05, we accept the null hypothesis, which means that there is no difference between the two samples. We test the chemical abundance distribution for all fields in the region 5 • < l < 15 • and −3 • < b < 3 • . As expected, the K−S tests show that [M/H], [α/M] and [C/N] of the two components follow the same distribution in most of these fields.
Although the uncertainties of stellar parameters for cooler stars (T eff < 4000 K) are statistically larger , the difference between the two components is still very significant in the chemical abundance for Sgr fields. For example, in the Sgr field, there are clear differences between the HV and the main components in the metallicity distributions. This is expected because the HV peaks are contributed mainly by the member stars of the Sgr dSph. This demonstrates that the uncertainties are probably not large enough to blur the distribution of different stellar populations. Therefore, the similarities in the chemical distributions between the main component and the HV peaks in bulge fields suggest similar stellar populations.

Age distinctions in bulge fields
Directly measuring the stellar age from observations is nontrivial. Metallicity and α-enhancement can only provide rough estimates of age. Recently, Martig et al. (2016) argued that the relative surface abundances of C and N of RGB stars that have experienced the first dredge-up process can also be a good age indicator. The first dredge-up process brings material produced by the CNO cycle from the bottom of the convection layer to the surface. The amount of N brought to the surface depends on their initial masses. For more massive stars, the fraction of nitrogen in their cores is higher and the convective zone extends deeper. After the dredge-up, the surface abundance of [(C+N)/M] is unchanged since the total amount of CNO is conserved and the abundance of O is only slightly affected by the dredge-up. As a result, more massive stars have higher surface abundance of N and lower surface abundance of C, leading to a lower [C/N] ratio compared to low-mass stars (Martig et al. 2016). Lower-mass stars live longer during the main-sequence phase, and hence are older on average when they proceed to the RGB phase, and vice versa for the high-mass stars. Therefore, [C/N] for RGB stars can be a good indicator of stellar age (Martig et al. 2016); lower (higher) [C/N] values correspond to younger (older) stars. Figure 5 in Martig et al. (2016) shows the empirical relation of [M/H], [C/N] and median age, with a median age scatter of 26 %. Note that the chemical abundances we use are from the same pipeline. The uncertainty of [C/N] is about 0.1 dex (Masseron & Gilmore 2015;Martig et al. 2016).
The three right columns of Figure 7 show different proxies of age. The statistically identical distributions between the main component and the HV peaks for the bulge fields (see Section 4) suggest that the two components seem to have the same stellar populations. Note that [C/N] as an age proxy is only valid for RGB stars, not for asymptotic giant branch (AGB) stars. The contamination from AGB stars may pollute the result. However, given that AGB stars have a much shorter lifetime compared to stars on the RGB, at least statistically there are far fewer AGB stars in the sample than RGB stars.
Inspired by Martig et al. (2016), we use the multivariate distribution of [M/H] and [C/N] to further investigate the potential age differences, since stars with different ages are located at different positions in the [M/H]−[C/N] plane. Based on their velocity, stars in the range 5 • < l < 15 • and −3 • < b < 3 • are divided into two components: the main component (V GSR < 180 km s −1 ) and the HV component (V GSR > 180 km s −1 ). The SGR stars are also defined, but with a different velocity cut (140 km s −1 ). Figure Bensby et al. (2013) studied the chemical abundance of 58 microlensed bulge dwarfs and subgiants and found that metal-rich stars have a wide age distribution with a large fraction of young and intermediate-age stars. Using [C/N] as an age proxy, our result is consistent with that by Bensby et al. (2013). Compared with Martig et al. (2016) the two samples seem to spread over the same range and have similar mean value and standard deviation. The relative fraction of low-α stars in the HV component is slightly higher than that in the main component. We suspect that this is due to the ambiguous separation of the HV peak. In Figure 7, which separates the HV stars much better than Figure 8, the α-enhancement of the HV peak stars shows almost an identical distribution with that of the main peak stars. Of course, α-enhancement is not as good an age indication as [C/N].
The By using simulations in which newly formed stars are considered, Aumer & Schönrich (2015) suggested that the selec-  (2) based on their velocity. The difference between the two components can be clearly seen in Sgr fields, but not in the bulge. tion function of APOGEE is biased to young stars and that the HV feature is preferentially composed of young stars less than 2 Gyr old. Thus, we could speculate that the average age of the HV component is younger than the main component. However, if the age compositions of the two components are different, this should be visible from the chemical abundance distributions. We can infer from Figures 7 (a), 8 (a1), and (a2) that the age compositions of the two velocity components are similar and the fractions of young stars are low.
As shown in Bensby et al. (2017), the age-metallicity relation is quite flat for stars in the bulge region, indicating a wide range of stellar age populations at similar metallicities. To not find a difference in [M/H] distributions does not exclude significant age differences. However, according to Bensby et al. (2017), the metal-poor and alpha-enhanced stars are still generally older than the metal-rich ones. From Figures 8 (c1) and (c2), the [α/M] and [M/H] distributions are very wide, suggesting the existence of multiple stellar populations in the bugle region ranging from young to old.

Chemical abundances in Sgr fields
In contrast to the bulge fields, the chemical distributions for the two components in the Sgr fields are significantly different. The HV peak is due to the core of the Sgr dSph, which lies ∼29 kpc from the Sun (Siegel et al. 2011), while the main component should be primarily composed of thick-disk stars along the line of sight toward the Sgr dSph. Note that the APOGEE selection function is different in the Sgr fields compared to the Galactic bulge region; the K magnitude of the Sgr field stars is systematically ∼1−2 mag fainter than stars in the bulge field. If a giant with the same luminosity is in the bulge (8.5 kpc) and the Sgr dSph (29 kpc), the apparent magnitude difference is 2.5 mag. Toward the Sgr dSph, the extinction of Sgr stars is stronger than for bulge stars. Therefore, luminous giants are more likely included in the Sgr field, which may include more AGBs rather than RGBs. In this case, [C/N] may not be an accurate age proxy for AGBs in the Sgr HV peak. Our metallicity distribution is consistent with that of Cole (2001) who suggested that the metal-licity of stars in the Sgr dSph peaks at [Fe/H]∼ −0.5 ± 0.2 dex. There are two scenarios to explain the observed chemical abundances: the mass-loss scenario and the starburst scenario (McWilliam & Smecker-Hane 2005). If the mass-loss scenario is dominant, a metallicity distribution biased to lowmetallicity stars would be expected, while a discrete starburst might lead to a metallicity function dominated by a narrow peak. Our results seem to support the starburst scenario for the Sgr dSph, since an overdensity clump shows up in The chemical enrichment history of Sgr is different from that of the MW. Separating Sgr from MW stars does not necessarily mean that one can separate stars of different ages from the MW bulge. However, for the MW bulge sample, if the metallicity distributions of the main component and the HV component cover a wide range from metal-poor to metal-rich stars, the stellar populations probably also span a wide age range. This is also supported by the recent results of Schiavon et al. (2017), who used nitrogen and carbon abundances of APOGEE stars to disentangle a population of curious bulge field stars in the inner Galaxy, which are possibly dissolved globular cluster stars. Nidever et al. (2012) suggested that the HV peak stars in the bulge fields are not likely from the Sgr dSph because the velocity distribution is different from their model prediction. We confirm this conclusion with our analysis of chemical abundance.

Velocity distributions of different age populations
In their simulations Aumer & Schönrich (2015) showed that only young bar stars show pronounced HV shoulders. Young stars might also contribute to produce a similar h 3 profile. To compare to these results directly, we study the velocity distribution of the young and the old populations, using [C/N] as an age proxy. We consider the three fields showing clear HV peaks, i.e. (6 • , 0 • ), (10 • ,  ±2 • ), and the region showing high h 3 (5 • < l < 15 • and −3 • < l < 3 • ). The results are shown in Figure 9. Stars with [C/N] < −0.3 belong to the nominally young population (median age <3.5 Gyr), while stars with [C/N] > −0.3 belong to a nominally old population. In this test, the young stars generally have smaller velocity dispersion, as expected. However, the young population shows even weaker HV peak/shoulders than the old stellar population. The shape of the young stars' LOSVD seems roughly compatible with the Aumer & Schönrich (2015) model, but the existence of a separate HV feature for the older stars is inconsistent with that model. We also test two extreme cases by selecting the youngest ([C/N] < −0.4, median age <3 Gyr) and the oldest ([C/N] > −0.1, median age >7 Gyr) populations in the sample (see Figure 10). From these figures we can see that the youngest population does not seem to make a large contribution to the HV peak, which is inconsistent with the suggestion of Aumer & Schönrich (2015).

CONCLUSION
Cold, HV peaks in the bulge region were first reported by Nidever et al. (2012) using the commissioning data of APOGEE . We use the newly released APOGEE DR13 to revisit this result. To understand better the bulge kinematics, for the first time foreground disk stars are excluded by applying a T eff cut at 4000 K. In the LOSVDs, we find that most of the fields display a skewed Gaussian with an HV shoulder. However, only 3 out of 53 fields show a distinct HV peak.
Most of the LOSVDs in the bulge fields can be well described by Gauss−Hermite series (except the three HV fields), and the spatial distribution of coefficient h 3 without removing foreground disk stars is largely consistent with the S10 model. We argue that the indicator of the bar is the positive correlation betweenv and h 3 , which is consistent with predictions from bar models. This confirms that the Galactic bulge kinematics seen by APOGEE are dominated by a bar structure. Given the complicated bar kinematics, Gauss−Hemite moments can better describe the LOSVDs than fitting double Gaussian profiles. We also estimate the skewness of each LOSVD, and these show correlated profiles with the h 3 parameter. However, the skewness parameter can be significantly affected by the velocity outliers compared to h 3 . Again, the two components show similar chemical properties. Using [C/N] to trace age, we find that the age compositions of the two velocity components are similar and neither of them is dominated by young stars. This may be inconsistent with Aumer & Schönrich (2015), who suggested that the APOGEE selection function is more sensitive to young stars and that the HV stars are dominated by a young population ( 2 Gyr).
We separate the sample into a young and an old stellar population and find that the young population shows weaker HV peak/shoulders than the old stellar population.
We also find that the chemical abundance of Sgr core stars is different from that of bulge HV stars, so it is not likely that the HV peak stars are members of Sgr dSph. None of the models mentioned in the introduction (e.g., new halo structure, the tidal tail of Sgr, bar-supporting orbits, a kiloparsecscale nuclear stellar disk and young stars captured by the bar) could explain the identified HV peaks well. The spatial distribution and chemical properties of the HV peak stars in the bulge region are not likely the tidal tail of Sgr or a new substructure in the halo. As shown in Li et al. (2014), the full velocity distribution of the stars making up the bar potential can only produce a HV shoulder rather than a HV peak; the bar supporting orbits may not be able to produce a HV peak. A kiloparsec-scale nuclear disk proposed by Debattista et al. (2015) can explain the peak at l = 8 • ± 2 • , but not off the midplane, which is inconsistent with observations. In this study, we show that the bulge stellar population is complicated with a wide range of age and metallicity. Moreover, both the young and old stellar populations show HV peak features, which are inconsistent with predictions in Aumer & Schönrich (2015). Other mechanisms are needed to explain the HV peaks, and the three observed peaks could be different in nature.
Forthcoming proper motions from Gaia observations of these bright bulge stars will help probe the orbits of the HV stars compared to "normal" bulge stars.