A New Method to Reduce Near-Infrared Data of Palomar Observatory

Near-infrared (NIR) spectra are vastly outnumbered by the optical spectra from large surveys. The Sloan Digital Sky Survey (SDSS), the largest optical spectroscopic survey, has published about 5,789,200 optical spectra in its latest data release DR16, while no large-scale NIR spectroscopic survey has been performed. Typical NIR studies are based on limited samples of a few tens to hundreds of spectra. TripleSpec, mounted on the Palomar 200-inch telescope (P200), is one of the workhorse NIR spectrographs. Over the past years we have acquired more than 200 new quasar NIR spectra, comparable to the largest sample ever compiled. During the reduction of our latest TripleSpec data, we found that the well-accepted data reduction pipeline Spextool (citation: 835) completely ignores one of the five spectral orders (Order 7) of the raw data, probably because of its low throughput. Because astronomical observations are always photon starved, we investigate the possibility of incorporating the photons on Order 7 by developing a new data reduction pipeline. We test the new pipeline on our new quasar NIR data. Compared to the results from Spextool: 1. the pipeline is fully automatic; 2. the wavelength coverage is extended from 1-2.5 μm to 0.9-2.5 μm, revealing new features such as rest-frame Pa ε, [S III] 9531, [C I] 9824, [C I] 9850, [S VIII] 9910, as well as redshifted spectral features for higher redshift targets; 3. the median signal-to-noise ratio at 0.9-1.1 μm is raised by a factor of 2. The pipeline could easily be generalized to accommodate other popular NIR instruments such as IRTF/SpeX and Keck/NIRSPEC. We plan to publish the pipeline and the 200 new quasar NIR spectra to the community.


Introduction
The sky looks totally different in the infrared than in the more familiar optical band, and the physics behind the light is distinct as well (c.f. Orion in Figure 1). In 1800, William Herschel discovered an invisible form of radiation just beyond the red end of the visible spectrum. He named this radiation infrared ("below" red). (Herschel was by then already world famous for having discovered the planet Uranus.) Herschel's discovery was the first step in establishing the existence of what we now call the electromagnetic spectrum. Visible light and infrared radiation are only two of the many kinds of electromagnetic energy emitted by objects on Earth and throughout the universe. Only by studying all of these kinds of radiation can we fully characterize celestial objects and gain a complete picture of the universe, its history, and its evolution. In this work, we focus on a specific aspect of infrared astronomy: data reduction techniques for near-infrared spectrographs. We improve a highly cited data reduction pipeline and test our new technique on a large quasar sample observed with the classic, famous, and still state-of-the-art Palomar 200-inch telescope [1].

Ground based infrared instruments and NIR spectroscopic surveys of quasars
Only a few NIR spectral samples of AGNs (Active Galactic Nuclei) and quasars have been published, and most of them are limited to a few tens of spectra. Glikman et al. (2006) published an NIR template for AGNs, made from observations of 27 quasars in the redshift range 0.118 < z < 0.418 [2]. Riffel et al. (2006) present a near-infrared spectral atlas of 47 active galactic nuclei (AGN) of all degrees of activity, covering wavelengths up to 2.4 μm and including the observed emission lines [3]. These sample sizes are extremely small compared to optical spectral surveys such as SDSS, which comprises 0.75 million quasar optical spectra. A larger sample of NIR spectra of quasars and AGNs is still needed. Studies based on 1-2.5 μm NIR quasar spectra fall generally into two categories. 1) For low redshift targets, rest-frame NIR features are investigated. Continuum shapes, emission features, and absorption features are assessed through NIR composite spectra or individual spectral atlases from small samples with a few tens of targets [3]. The optical-NIR continuum shape can be described by a broken power law, or by a single power law plus a blackbody representing hot dust at T~1200 K. Prominent emission features include the H I lines Paα λ18756, Paβ λ12822, Paγ λ10941, and Paδ λ10049, as well as He I λ10830, [S III] λ9531, [S III] λ9069, and weaker lines from Fe II and O I. Absorption features of He I* λ10830 are seen in some quasars and are proposed to be prevalent in at least some subclasses of quasars such as LoBALs [4,5]. 2) For higher redshift quasars, NIR spectra provide coverage of the rest-frame Hα and Hβ line regions, allowing robust black hole mass measurements from the broad Balmer lines. The combination of NIR and optical spectra covers the principal quasar diagnostic features, chiefly the C IV λ1549, Mg II λλ2798, 2803, Hβ λ4861, and [O III] λλ4959, 5007 emission lines.
These samples have been used to correct black hole mass measurements based on UV broad emission lines such as C IV and Mg II [4,6]. Robust black hole mass measurements also enable studies probing for any difference in masses, Eddington ratios, or rest-frame optical spectroscopic properties of quasar subclasses such as LoBALs as compared to normal quasars, or their evolution over cosmic time. Obtaining near-infrared spectra of adequate resolution and signal-to-noise ratio (S/N) for even moderately bright quasars remains resource intensive. Generally, these studies use NIR samples of a few tens to a few hundreds of objects. Such samples are extremely small compared to optical surveys such as the Sloan Digital Sky Survey, with 750,414 optical quasar spectra in its latest data release DR16. Glikman et al. (2006) construct an NIR composite from 27 quasars observed at the NASA IRTF telescope [2]. Riffel et al. (2006) present a rest-frame near-infrared spectral atlas of 47 active galactic nuclei (AGN) with 0.0038 < z < 0.549, including QSOs, Seyfert 1s, Seyfert 2s, and NLS1s [3]. Zuo et al. (2016) use NIR spectra of 32 luminous quasars with 3.2 < z < 3.9 to calibrate black hole mass measurements based on Mg II. Coatman et al. (2017) compiled a sample from heterogeneous sources that span wide ranges of source-selection criteria, instrument properties, spectral band and resolution, and signal-to-noise ratio (S/N) [4]. That sample comprises 230 high-luminosity quasars at redshift 1.5 < z < 4.0 with both C IV and Balmer line spectra, and is used to correct C IV-based virial black hole masses. Matthews et al. (2020) present NIR spectra for a flux-limited sample of 226 quasars with 1.5 < z < 3.5 from the Gemini Near Infrared Spectrograph - Distant Quasar Survey (GNIRS-DQS), the largest uniform, homogeneous survey of its kind.

Aim of this work
We have initiated a near-infrared spectroscopic campaign on quasar candidates with He I* absorption line systems. The quasars are selected from SDSS by the presence of the optical He I* λ3889 absorption line. The NIR spectroscopic follow-up provides coverage of one important line of the He I* multiplets, i.e., the He I* λ10830 NIR absorption line [5]. We used the TripleSpec spectrograph mounted on the Palomar 200-inch telescope. We were awarded 12 nights in the past years and have acquired more than 200 new quasar NIR spectra, roughly the same size as the largest NIR sample ever published. During the reduction of our latest TripleSpec data, acquired in Oct. 2020, we found that the well-accepted data reduction pipeline Spextool completely ignores one of the five spectral orders (Order 7) of TripleSpec, probably because of its low throughput. However, since astronomical observations are always photon starved, there is no reason to throw away any photon a priori. In this paper, we investigate the possibility of incorporating the photons on Order 7 by developing a new pipeline. The paper is organized as follows. In Section 2, we outline the basic procedures of our data reduction pipeline. In Section 3, the pipeline is applied to Palomar/TripleSpec and compared with the results from Spextool. In Section 4, a TripleSpec NIR spectral atlas is compiled and the generalization of our pipeline to other NIR instruments is discussed.

TripleSpec NIR spectrograph
TripleSpec is one of the workhorse instruments mounted at the Cassegrain focus of the 200-inch Hale Telescope (P200) at Palomar Observatory. P200/TripleSpec is one of three identical NIR spectrographs built for the P200 5.1 m telescope, the Apache Point Observatory 3.5 m telescope, and the Keck 10 m telescope. The entrance slit of P200/TripleSpec is 1" × 30". The light is cross-dispersed into 5 orders (orders 7, 6, 5, 4, 3) with a claimed wavelength range from 1 to 2.45 μm and a spectral resolution of 2500-2700, using a 1024 × 2048 Hawaii-II HgCdTe array. The pixel scale is 0.37"/pixel. The spectrograph design is detailed in Wilson et al. [7]. The spectrograph is also equipped with a K band guider with a 4′ × 4′ field. The throughputs in the J, H, and K bands are measured to be 10%, 20%, and 30%, respectively; the performance is described in Herter et al. [8]. A typical TripleSpec exposure, with an exposure time of 100 s, is shown in Figure 2. The target is the famous local Active Galactic Nucleus (AGN) NGC 4151, and the order numbers are marked. Even with this limited exposure time, we can see the high background in the K band (order 3) and the vertical OH sky lines, which require dithering the target during observation (see section 3 for details). The spectra of the target are the nearly horizontal bright stripes.

Remote observation
We have carried out a large observational campaign that began in 2013, ran through 2021, and is still ongoing. The project investigates special absorption line systems in quasars. Our historic observations have accumulated more than 10 nights of TripleSpec data. During the development of the pipeline, we were awarded one additional night on Oct. 27, 2020. The observation log, including targets (with exposure times), seeing, weather, telescope focus, etc., is recorded on the standard TripleSpec log sheet as shown in Figure 3. The raw data are downloaded from user1@observer1.palomar.caltech.edu with scp or sftp.

What's the problem?
The raw data were previously reduced with Spextool [9,10], an Interactive Data Language (IDL)-based data reduction package. Spextool was originally designed for SpeX on the NASA Infrared Telescope Facility (IRTF) and subsequently modified to accommodate other NIR spectrographs such as TripleSpec, with the limitation (Spextool User's Manual, V4.1) that optimal extraction is not available [11]. During our recent use of P200/TripleSpec, we noticed that Spextool ignores order 7 of P200/TripleSpec during processing.
In the wavelength map generated by Spextool, there is no coverage of Order 7 (see Figure 4). We estimate the throughput from telluric standard star observations. In the calculation of the observed spectrum I(λ), R = λ/Δλ = 2700 is the spectral resolution, A_T = 17.13 m² is the effective telescope area of P200, t is the exposure time, and λ is in Å. The throughput is estimated as I(λ)/I_Vega(λ), where I_Vega(λ) is a model spectrum of Vega [9], scaled to the V magnitude of the telluric star. The FWHM of the telluric star trace is measured and used to correct for slit loss as (FWHM/1″)², with typical values of 2-3. One estimate is shown in Figure 5. The throughput is estimated from HD 77577, observed on 2013-02-22, and includes both the telluric absorption and the instrumental response. Depending on observational conditions such as airmass, seeing, and weather, the peak throughput in the K band varied from 10% to 30% during our runs.

Key algorithms of our new pipeline
Our pipeline is similar in spirit to the data reduction pipeline designed for the Magellan Inamori Kyocera Echelle (MIKE) spectrometer and later generalized to the HIRES, UVES, and ESI spectrometers. The design, implementation, trade-offs, and limitations of that pipeline are detailed in Bernstein et al. [12]. Our TripleSpec pipeline follows their design and adds Order 7, which is ignored by Spextool. The pipeline automatically performs the standard steps of astronomical spectroscopic data reduction, i.e., bias subtraction, flat correction, wavelength calibration, telluric correction, and flux calibration. Here we highlight only the treatments that are specific to TripleSpec. We define the X axis as running along the 2048 columns, which is the approximate spectral direction of the curved orders. The Y axis runs along the 1024 rows, i.e., the spatial direction.

Image processing
First, a super bias frame is generated by averaging the dark frames with 3-sigma clipping. The super bias is subtracted from each frame in all further processing. One flat is selected to trace the order edges and is referred to as the trace flat. A sawtooth filter is applied to the trace flat in the spatial (Y) direction to highlight the slit edges. The edges are then traced and fitted with fourth-order Legendre polynomials. The tracing technique is similar to object tracing and is detailed in Section 4.3. The averages of the left and right edges are later used as the initial guess for object tracing. An example of the traced flat edges is shown in Figure 7. The bias-subtracted flats are averaged with 2-sigma clipping into a super flat, which is then decomposed into an illumination flat and a pixel flat. The average illumination function across each order is fitted, and the fitted profiles are divided out of the super flat to generate the pixel flat.
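The sigma-clipped averaging used for the super bias and the super flat can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the pipeline's own code; the function name `sigma_clipped_mean`, the use of the median as the clipping center, and the iteration limit are assumptions made for the example.

```python
import numpy as np

def sigma_clipped_mean(frames, nsigma=3.0, max_iter=5):
    """Average a stack of frames pixel by pixel, rejecting outliers.

    frames : array-like of shape (n_frames, ny, nx)
    nsigma : rejection threshold in units of the per-pixel scatter
    (Hypothetical helper; clipping about the median is an assumption.)
    """
    stack = np.array(frames, dtype=float)  # copy so the input is untouched
    for _ in range(max_iter):
        center = np.nanmedian(stack, axis=0)
        spread = np.nanstd(stack, axis=0)
        outliers = np.abs(stack - center) > nsigma * spread
        if not outliers.any():
            break
        stack[outliers] = np.nan  # rejected samples are ignored from now on
    return np.nanmean(stack, axis=0)

# Hypothetical stack of 11 dark frames with one cosmic-ray hit.
darks = np.full((11, 4, 4), 100.0)
darks[3, 1, 2] = 5000.0
super_bias = sigma_clipped_mean(darks, nsigma=3.0)
```

The same helper with `nsigma=2.0` would correspond to the flat combination described above. With very few frames a single outlier inflates the scatter enough to survive clipping, which is one reason to clip about the median rather than the mean.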

Wavelength calibration
The sky lines are not strictly aligned with the detector Y axis. The tilt is corrected using a fitted pixel map. The sky lines are traced for each order of a science frame. Normally a long exposure of 5 minutes is used to guarantee enough strong sky lines on Order 7. Examples of traced lines are shown in Figure 8. A 1D sky spectrum is extracted along the center of each order and cross-correlated with an archived reference sky spectrum that has a previously fitted wavelength solution. The shift is calculated, and a new wavelength solution is obtained along the order center. With this 1D wavelength solution and the pixel map, a 2D wavelength solution is calculated and shown in Figure 9. The flux below 0.89 μm is normally too low and is trimmed in further analysis.
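The cross-correlation step can be illustrated with a short Python sketch. It assumes an integer pixel shift and a polynomial reference solution; the names `pixel_shift` and `shifted_solution`, the lag range, and the synthetic line positions are hypothetical, and a production pipeline would normally refine the shift to sub-pixel precision.

```python
import numpy as np

def pixel_shift(sky, reference, max_lag=20):
    """Integer shift s such that sky[i] is approximately reference[i - s]."""
    sky = np.asarray(sky, dtype=float) - np.mean(sky)
    reference = np.asarray(reference, dtype=float) - np.mean(reference)
    lags = np.arange(-max_lag, max_lag + 1)
    # np.roll wraps around the array ends; acceptable when lines sit far from the edges.
    cc = [np.sum(sky * np.roll(reference, lag)) for lag in lags]
    return int(lags[int(np.argmax(cc))])

def shifted_solution(ref_coeffs, shift):
    """New solution lambda_new(x) = lambda_ref(x - shift).

    ref_coeffs are polynomial coefficients, lowest order first (an assumed form).
    """
    def solution(x):
        return np.polynomial.polynomial.polyval(np.asarray(x, dtype=float) - shift, ref_coeffs)
    return solution

# Synthetic example: three sky lines, observed 3 pixels to the right of the reference.
x = np.arange(512)
def lines(offset):
    return sum(np.exp(-0.5 * ((x - c - offset) / 2.0) ** 2) for c in (100, 250, 400))

shift = pixel_shift(lines(3), lines(0))
new_solution = shifted_solution([0.9, 1e-4], shift)  # hypothetical linear solution in microns
```

Here a line that sits at reference pixel 100 is found at pixel 103 in the new frame, and `new_solution(103)` returns the wavelength the reference solution assigns to pixel 100.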

Traces
An initial trace of the science target is obtained using the IDL code trace_crude; the traces are then iteratively refined by weighting the trace positions by the target's spatial flux distribution. The traces of objects are examined using the data from 2013-02-22, when 23 objects were observed. The traces at Position A of Order 3 are plotted in the left panel of Figure 10. The dispersion about the median trace is shown in the right panel of Figure 6, with σ = 1.5 pixels. The maximum drift is about ±4 pixels, equivalent to 1.5" at the plate scale of 0.37"/pixel. The traces show that the P200 pointing is extremely stable through the night, so any one trace can be used as the initial guess. With the traces in hand, the object spatial profiles are evaluated for each order, and optimal extraction is performed for each order. The tracing and extraction are performed for all the science targets and standard stars.
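The iterative, flux-weighted refinement of a trace can be sketched as below. This is an illustrative Python version and not the IDL implementation; the function name `refine_trace`, the window half-width, and the number of iterations are assumptions. The image is indexed as (row, column), so rows run along the spatial Y direction and columns along the spectral X direction, following the axis convention defined earlier.

```python
import numpy as np

def refine_trace(image, trace_guess, half_width=5, n_iter=3):
    """Refine a trace by flux-weighted centroiding in each column.

    image       : 2D array, shape (ny, nx); rows are spatial, columns spectral
    trace_guess : initial row position of the object for each column
    """
    image = np.asarray(image, dtype=float)
    ny, nx = image.shape
    rows = np.arange(ny)
    trace = np.array(trace_guess, dtype=float)
    for _ in range(n_iter):
        for x in range(nx):
            center = int(round(trace[x]))
            lo = max(center - half_width, 0)
            hi = min(center + half_width + 1, ny)
            weights = np.clip(image[lo:hi, x], 0.0, None)  # ignore negative pixels
            total = weights.sum()
            if total > 0:
                trace[x] = np.sum(rows[lo:hi] * weights) / total
    return trace

# Synthetic check: a slowly drifting Gaussian profile, started from a flat guess.
ny, nx = 40, 30
true_trace = 20.0 + 0.05 * np.arange(nx)
profile = np.exp(-0.5 * ((np.arange(ny)[:, None] - true_trace[None, :]) / 1.5) ** 2)
refined = refine_trace(profile, np.full(nx, 18.0))

# Drift in arcseconds for a 4-pixel excursion at the quoted plate scale.
drift_arcsec = 4 * 0.37
```

Because the centroid window is re-centered on each pass, a guess that is a few pixels off, such as a trace borrowed from another target on the same night, converges within a few iterations.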

Telluric correction and flux calibration
The pipeline then automatically locates the standard star nearest to the target in position and observation time and chooses it for telluric correction and flux calibration. Spectra of the telluric standards are processed in the same way as the targets, followed by careful removal of the stars' intrinsic hydrogen absorption lines. This is done by fitting Lorentzian profiles to the hydrogen absorption lines and interpolating across these features to connect the continuum on each side of the line (Matthews 2020). After the lines are removed, the sensitivity function of the spectrograph is estimated by dividing the extracted telluric star spectrum by a model stellar spectrum of the same spectral type as the selected telluric standard. Finally, we divide the extracted spectra of the target by the derived sensitivity function. There are two important aspects to producing a final 1D spectrum from a series of science exposures: the coaddition of multiple exposures and the combination of overlapping echelle orders. The specific ordering of our algorithms is: (1) coadd multiple exposures of each echelle order; (2) flux the individual echelle orders; and (3) combine the echelle orders to produce a final, continuous 1D spectrum. In parallel, we perform the data reduction using Spextool and compare the final 1D spectra with the Spextool results in the next section.
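The three-step ordering (coadd, flux, combine) can be sketched in Python as follows. The inverse-variance weighting, the linear interpolation onto a common grid, and the function names `coadd`, `flux_calibrate`, and `merge_orders` are assumptions chosen for illustration; the text does not specify the weighting scheme actually used.

```python
import numpy as np

def coadd(fluxes, variances):
    """Step 1: inverse-variance weighted mean of exposures on a shared pixel grid."""
    fluxes = np.asarray(fluxes, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    total = np.sum(weights, axis=0)
    return np.sum(weights * fluxes, axis=0) / total, 1.0 / total

def flux_calibrate(flux, var, sens):
    """Step 2: divide one order by its sensitivity function."""
    sens = np.asarray(sens, dtype=float)
    return flux / sens, var / sens ** 2

def merge_orders(grid, orders):
    """Step 3: combine overlapping orders on a common wavelength grid.

    orders : list of (wave, flux, var) tuples, each with increasing wave
    Grid points covered by no order are returned as NaN.
    """
    grid = np.asarray(grid, dtype=float)
    num = np.zeros_like(grid)
    den = np.zeros_like(grid)
    for wave, flux, var in orders:
        inside = (grid >= wave[0]) & (grid <= wave[-1])
        f = np.interp(grid[inside], wave, flux)
        v = np.interp(grid[inside], wave, var)
        num[inside] += f / v
        den[inside] += 1.0 / v
    merged = np.full(grid.shape, np.nan)
    covered = den > 0
    merged[covered] = num[covered] / den[covered]
    return merged

# Hypothetical example: two overlapping orders with constant fluxes 2 and 4.
wave_a = np.linspace(0.90, 1.00, 50)
wave_b = np.linspace(0.95, 1.10, 50)
order_a = (wave_a, np.full(50, 2.0), np.ones(50))
order_b = (wave_b, np.full(50, 4.0), np.ones(50))
merged = merge_orders(np.array([0.92, 0.97, 1.05, 1.20]), [order_a, order_b])
```

In the overlap region the two orders contribute according to their variances, so a low-throughput order such as Order 7 adds signal where it is useful without degrading regions that a neighbouring order already covers well.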

Extended wavelength coverage
A detailed comparison is performed for NGC 4151 in Figure 11 (c.f. Figure 19 for raw data). For higher redshift targets, other important emission features may be redshifted into the additional wavelength coverage between 0.9 and 1 μm. For example, we see full coverage of the [O III] doublet and a tentative Hβ detection in J0352-0711 in Figure 11. Another benefit of the extended short-wavelength coverage is that it overlaps with most optical spectrographs and surveys, such as SDSS and BOSS. For example, the wavelength coverage of SDSS is 3800-9200 Å and that of BOSS is 3600-9800 Å, which connects with the blue end of our reduction. Continuous wavelength coverage from the optical to the infrared is important for cross-checking the flux calibrations of the different instruments involved. It is also crucial for science involving continuum shape and dust properties.

Elevated S/N ratio
In this section, we randomly select quasars from our historic observations to evaluate the signal-to-noise ratio in the overlapping part of Order 7 and Order 6 for our pipeline and for Spextool. The results are shown in Figure 12, with an expanded view of 0.9-1.05 μm for five randomly selected quasars and AGNs. With the incorporation of Order 7, the photons missing from the Spextool reduction are essentially all recovered. Even a visual evaluation clearly reveals the improved signal-to-noise ratio near 1 μm for our pipeline, in both the emission features and the continuum regions. Statistics based on all 200+ quasars we observed in the past few years show that the S/N ratio from our pipeline is on average higher than that from Spextool by a factor of two in this spectral regime. These randomly selected quasars also show the benefits of the extended wavelength coverage, e.g., the full coverage of interesting emission and absorption features that fall in this regime because of redshift.

Figure 12. Expanded view of 0.90-1.05 μm for five randomly selected quasars and AGNs in our sample, with our results in black and those from Spextool in red. Even visual inspection reveals the clear elevation in signal-to-noise ratio in this spectral regime. Also, additional features are recovered by our reduction.
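A comparison of this kind can be quantified with a simple median S/N measurement over a wavelength window. The Python sketch below is illustrative only; the function name `median_snr`, the default window, and the synthetic spectra are assumptions and do not represent our measured data.

```python
import numpy as np

def median_snr(wave, flux, sigma, lo=0.9, hi=1.1):
    """Median per-pixel S/N for lo <= wave <= hi (wavelengths in microns)."""
    wave, flux, sigma = (np.asarray(a, dtype=float) for a in (wave, flux, sigma))
    use = (wave >= lo) & (wave <= hi) & (sigma > 0)
    return float(np.median(flux[use] / sigma[use]))

# Synthetic spectra on a shared grid: same flux, noise halved in one reduction.
wave = np.linspace(0.85, 1.15, 301)
flux = np.full(wave.size, 10.0)
snr_new = median_snr(wave, flux, np.full(wave.size, 1.0))
snr_old = median_snr(wave, flux, np.full(wave.size, 2.0))
gain = snr_new / snr_old
```

For real spectra the two reductions would first be restricted to the wavelength range that both cover, since pixels with no data in one reduction would otherwise bias the ratio.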

Conclusion and future work
We have implemented a new pipeline to reduce Palomar/TripleSpec NIR spectra. Our pipeline recovers Order 7, which is ignored by the well-accepted NIR spectral reduction pipeline Spextool. In addition: 1. the pipeline is fully automatic; 2. the wavelength coverage is extended from 1-2.5 μm to 0.9-2.5 μm; 3. the median signal-to-noise ratio at 0.9-1.1 μm is raised by a factor of two. These new features enable a series of future studies, such as the optical-infrared continuum shape and spectral features falling in the recovered band (e.g., specific absorption line systems and black hole mass measurements for specific redshift ranges). In this paper we address only the importance of incorporating Order 7. We have applied the new pipeline to our historic TripleSpec data, with 200+ new quasar spectra, and plan to release the data set along with the pipeline to the community. The details of the sample, the catalog, and the sample statistics will be investigated in future work.