Marine Group II Euryarchaeota Contribute to the Archaeal Lipid Pool in Northwestern Pacific Ocean Surface Waters

Planktonic archaea include predominantly Marine Group I Thaumarchaeota (MG I) and Marine Group II Euryarchaeota (MG II), which play important roles in the oceanic carbon cycle. MG I produce specific lipids called isoprenoid glycerol dibiphytanyl glycerol tetraethers (GDGTs), which are being used in the sea surface temperature proxy named TEX86. Although MG II may be the most abundant planktonic archaeal group in surface water, their lipid composition remains poorly characterized because of the lack of cultured representatives. Circumstantial evidence from previous studies of marine suspended particulate matter suggests that MG II may produce both GDGTs and archaeol-based lipids. In this study, integration of the 16S rRNA gene quantification and sequencing and lipid analysis demonstrated that MG II contributed significantly to the pool of archaeal tetraether lipids in samples collected from MG II-dominated surface waters of the Northwestern Pacific Ocean (NWPO). The archaeal lipid composition in MG II-dominated NWPO waters differed significantly from that of known MG I cultures, containing relatively more 2G-OH-, 2G- and 1G- GDGTs, especially in their acyclic form. Lipid composition in NWPO waters was also markedly different from MG I-dominated surface water samples collected in the East China Sea. GDGTs from MG II-dominated samples seemed to respond to temperature similarly to GDGTs from the MG I-dominated samples, which calls for further study using pure cultures to determine the exact impact of MG II on GDGT-based proxies.

Members of MG II have not been cultured; however, information on their lifestyle has been obtained through metagenomic studies. It is suggested that MG II live heterotrophically and occur mostly in the photic zone (Iverson et al., 2012; Zhang et al., 2015; Xie et al., 2018; Rinke et al., 2019; Santoro et al., 2019; Tully, 2019). However, MG II ecotypes have also been found in the deeper parts of the ocean (Li et al., 2015; Liu et al., 2017). Deep ocean MG II clades do not contain genes for proteorhodopsin, a light-driven proton pump present in MG II from the photic zone (Li et al., 2015). Other euryarchaeotal planktonic archaea include Marine Group III (MG III), which occur throughout the water column (Fuhrman and Davis, 1997; Haro-Moreno et al., 2017), and Marine Group IV (MG IV), which occur predominantly in the deep sea (López-García et al., 2001). MG III and MG IV are often present in low abundance, and little is known about their ecological distribution and physiology (Santoro et al., 2019).
GDGTs contain information that can be used to evaluate paleo sea surface temperature (SST) (e.g., TEX86 index; Schouten et al., 2002), terrestrial organic matter input to the ocean (e.g., BIT index; Hopmans et al., 2004), or biogeochemical redox state in the ocean (e.g., Methane Index; Zhang et al., 2011). Although GDGTs have applications in paleoceanography and microbial ecology, their specific taxonomic sources remain ambiguous. Lipidomic studies on Nitrosopumilus maritimus, the first representative of MG I isolated from marine environments, reported lipids including GDGTs with zero to four cyclopentyl moieties and crenarchaeol, a GDGT containing one cyclohexyl and four cyclopentyl moieties (Könneke et al., 2005; Schouten et al., 2008). Crenarchaeol has so far only been observed in Thaumarchaeota (Sinninghe Damsté et al., 2002b; Elling et al., 2017) and has been postulated as a biomarker for archaeal nitrification (de la Torre et al., 2008; Pearson et al., 2008). Similarly, methoxy archaeol (MeO-AR) is reported to be present in all thaumarchaeal strains studied to date but does not appear to occur in crenarchaeal or euryarchaeal species. Thus, MeO-AR may also be used as a tentative biomarker for Thaumarchaeota (Elling et al., 2014, 2017). Unsaturated acyclic archaeols (uns-ARs) with four double bonds were recently suggested as potential biomarkers for MG II based on analyses of uns-AR 0:4 in suspended particulate matter (SPM) of epipelagic waters from the eastern tropical North Pacific, equatorial Pacific and off the coast of Cape Blanc (Zhu et al., 2016). Using a combination of GDGT analysis, metagenomics, and pyrosequencing of the SSU rRNA gene on samples from the North Pacific Subtropical Gyre water column, it has been suggested that MG II also produce GDGTs, including crenarchaeol (Lincoln et al., 2014). This could potentially affect the use of TEX86, an SST proxy expressed as the ratio of GDGTs with different degrees of cyclization (Schouten et al., 2002).
TEX86 was proposed based on the premise that the large majority of GDGTs in the water column were produced by MG I (Schouten et al., 2002). Thus, the significant contribution of a second archaeal clade to the oceanic GDGT pool, as inferred by Lincoln et al. (2014), may complicate the relationship between TEX86 and SST. The findings of Lincoln et al. (2014) were debated by Schouten et al. (2014), who raised concerns about the low abundance of extracted DNA and the use of core GDGTs (C-GDGTs) instead of intact polar GDGTs (IP-GDGTs), which are considered to better represent living biomass. Additional results were published by Besseling et al. (2020), who reported an absence of MG II-derived GDGTs from surface waters in the Atlantic Ocean and the North Sea. In addition, Zeng et al. (2019) identified two enzymes responsible for GDGT cyclization (i.e., GDGT ring synthases) and could only detect the related genes in metagenomes from MG I species and not in the MG II metagenomes. Together, these previous reports suggest that MG I Thaumarchaeota may be the dominant source of cyclized GDGTs in open ocean settings, although GDGT-producing MG II have been reported elsewhere (Wang et al., 2015). Therefore, the potential contribution of MG II to the GDGT pool in the ocean remains controversial.
In this study, we characterized and quantified archaeal membrane lipids in surface water samples from the Northwestern Pacific Ocean (NWPO) and East China Sea and supported these measurements with DNA sequencing and determination of cell density in order to determine the sources of the archaeal lipids. The two sample sets differed markedly in their archaeal community members, with MG II being dominant in NWPO samples and MG I in East China Sea samples. Accordingly, the combined sample set was well suited to constrain the contribution of MG II to the marine archaeal lipid pool, to evaluate its effect on archaeal lipid-based proxies, and to test previous hypotheses regarding candidate lipids of MG II (Lincoln et al., 2014; Zhu et al., 2016). In addition, this study presents the full range of intact and core archaeal lipids that were detected in surface waters, thus contributing to a better understanding of the archaeal lipid distribution and related processes in the oceanic water column.

Shipboard Sampling and Environmental Parameters
Samples were collected on board the R/V Dongfanghong II during the East China Sea 2014 cruise (October; E-samples; the sampling plan and region were filed before the cruise started and approved by the Chinese Ministry of Foreign Affairs) and the NWPO 2015 spring cruise (April; N-samples). In situ temperature and salinity were measured by a conductivity-temperature-depth (CTD) unit (Sea-Bird 911 CTD). Samples for inorganic nutrients (nitrate, nitrite, phosphate, and silicate) were filtered through 0.45 µm cellulose acetate filters and stored at −20 °C until analysis with an AutoAnalyzer 3 HR (SEAL Analytical).
During each cruise, samples of SPM were collected from surface water (2 to 10 m; Figure 1 and Supplementary Table S1). For each sample, 80-200 L of seawater were filtered using a submersible pump through a GF/F filter (Whatman, 142 mm) of 0.7 µm pore diameter. Filters were then stored at −20 °C until analysis. Previous studies (Ingalls et al., 2012) suggested that 0.7 µm GF/F filters may under-collect GDGTs in general and IP-GDGTs specifically because archaeal cells can be less than 1 µm (Könneke et al., 2005; Engelhardt, 2007). To ensure comparability, lipids and DNA were extracted from the same filters.

Lipid Extraction
For each sample, lipids were extracted from 88% (7/8) of a freeze-dried GF/F filter. The filter was cut into slices and extracted using a modified Bligh and Dyer method (Zhang et al., 2013). In brief, the extraction was performed four times using methanol (MeOH), dichloromethane (DCM) and phosphate buffer at pH 7.4 (2:1:0.8 v/v). After sonication (10-15 min each time), additional DCM and buffer were added to achieve a final MeOH/DCM/buffer ratio of 1:1:0.9. The bottom organic phase was collected with a glass pipette (repeated at least three times). The total lipid extract was dried under N2, redissolved in MeOH and filtered using a 0.45 µm PTFE filter before analysis.
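The change in solvent ratio described above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that no further MeOH is added between the two steps; the function name and the 20 mL MeOH volume are hypothetical and serve only as an illustration.

```python
def solvent_additions(meoh_ml, start=(2.0, 1.0, 0.8), target=(1.0, 1.0, 0.9)):
    """Return (dcm_ml, buffer_ml) to add so that a MeOH/DCM/buffer mixture
    moves from the `start` ratio to the `target` ratio without adding MeOH."""
    m0, d0, b0 = start
    m1, d1, b1 = target
    # Volumes currently present, implied by the starting ratio.
    dcm_now = meoh_ml * d0 / m0
    buffer_now = meoh_ml * b0 / m0
    # Volumes required by the target ratio for the same MeOH volume.
    dcm_needed = meoh_ml * d1 / m1
    buffer_needed = meoh_ml * b1 / m1
    return dcm_needed - dcm_now, buffer_needed - buffer_now


# Hypothetical example: 20 mL MeOH in the initial 2:1:0.8 mixture.
dcm_add, buffer_add = solvent_additions(meoh_ml=20.0)
```

Under this assumption, moving from 2:1:0.8 to 1:1:0.9 requires adding DCM and buffer in equal volumes, each equal to half the MeOH volume.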

Intact Polar and Core Lipid Analyses
Lipid separation was achieved on an ultra-high performance liquid chromatography (UHPLC) system (Dionex Ultimate 3000RS) under reversed-phase conditions with an ACE3 C18 column (2.1 × 150 mm, 3 µm; Advanced Chromatography Technologies) maintained at 45 °C. Target compounds were detected by scheduled multiple reaction monitoring (sMRM) of diagnostic MS/MS transitions (Supplementary Table S2) on a triple quadrupole/ion trap mass spectrometer (ABSciEX QTRAP4500) equipped with a TurboIonSpray ion source operating in positive electrospray ionization (ESI) mode.
Quantification of lipids was achieved by adding an internal glycerol trialkyl glycerol tetraether standard (C46-GTGT; Huguet et al., 2006). Structures of the different lipids detected can be found in Supplementary Figure S1. Lipid abundance was corrected for response factors of commercially available as well as purified standards as described by Elling et al. (2014). In brief, abundances of IPLs were corrected by determining the response factors of purified 2G-GDGT-0 (for 2G-OH- and 2G-GDGTs), 1G-GDGT-0 (for HPH-, 1G-OH-, and 1G-GDGTs), 2G-AR (for 2G-AR) and 1G-AR (for 1G-AR) standards versus the C46-GTGT standard. Similarly, the abundance of C-GDGTs was corrected by the response factor of a purified GDGT-0 standard versus the C46-GTGT standard. The abundances of C-AR, C-uns-ARs and MeO-AR were corrected by the response factor of a core archaeol standard (Avanti Polar Lipids, Inc., Alabaster, AL, United States) versus the C46-GTGT standard. In this study, C-uns-ARs are presented as C-uns-AR (u), where u = the number of double bond equivalents (DBE), representing either double bonds or rings and thus including both unsaturated and macrocyclic archaeols.
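The internal standard quantification with response factor correction can be sketched as follows. This is an illustration only, not the authors' processing script: it assumes that the relative response factor (RRF) is defined as the detector response per unit mass of the calibration compound divided by that of the C46-GTGT standard, and the function name, peak areas and masses are hypothetical. The mapping of lipid classes to calibration standards follows the text above.

```python
# Calibration standard used for each lipid class, as listed in the text.
CALIBRATION_STANDARD = {
    "2G-OH-GDGTs": "2G-GDGT-0", "2G-GDGTs": "2G-GDGT-0",
    "HPH-GDGTs": "1G-GDGT-0", "1G-OH-GDGTs": "1G-GDGT-0", "1G-GDGTs": "1G-GDGT-0",
    "2G-AR": "2G-AR", "1G-AR": "1G-AR",
    "C-GDGTs": "GDGT-0",
    "C-AR": "core archaeol", "C-uns-ARs": "core archaeol", "MeO-AR": "core archaeol",
}


def corrected_abundance_ng_per_l(area_analyte, area_standard, standard_ng, rrf, volume_l):
    """Lipid abundance (ng/L) from peak areas, the mass of internal standard
    added (ng), the relative response factor and the seawater volume (L)."""
    if area_standard <= 0 or rrf <= 0 or volume_l <= 0:
        raise ValueError("area_standard, rrf and volume_l must be positive")
    # Mass estimate assuming the analyte responds exactly like the standard.
    uncorrected_ng = area_analyte / area_standard * standard_ng
    # Correct for the different response of the analyte class, then
    # normalize to the volume of seawater represented by the extract.
    return uncorrected_ng / rrf / volume_l


# Hypothetical numbers for illustration only.
example = corrected_abundance_ng_per_l(
    area_analyte=5000.0, area_standard=10000.0,
    standard_ng=100.0, rrf=0.5, volume_l=100.0,
)
```

With an RRF below 1 (the analyte class responds more weakly than the standard), the corrected abundance is higher than the uncorrected estimate.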

Nucleic Acid Extraction, Quantitative Polymerase Chain Reaction, and Sequencing
DNA was extracted from 12% (1/8) of a GF/F filter (142 mm, ca. 4∼12 L filtered seawater) using the FastDNA SPIN Kit for Soil (MP Biomedical, Solon, OH, United States) with a final elution in 100 µL deionized water.
The archaeal 16S rRNA gene was quantified in all samples by quantitative polymerase chain reaction (qPCR; PIKO REAL 96, Thermo Fisher Scientific). Abundance (cells per liter) was normalized according to the dilution folds of DNA template and the volume of filtered seawater. The qPCR primers were Arch_787F (5′-ATTAGATACCCSBGTAGTCC-3′; Yu et al., 2005) and Arch_915R (5′-GTGCTCCCCCGCCAATTCCT-3′; Stahl, 1991).
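The normalization from gene copies per qPCR reaction to cells per liter can be sketched as below. This is a minimal illustration under stated assumptions: one 16S rRNA gene copy per archaeal cell, a known template volume per reaction, and a known seawater volume represented by the extracted filter fraction. The function name and all numerical values are hypothetical; only the 100 µL elution volume is taken from the extraction protocol.

```python
def archaeal_cells_per_liter(copies_per_reaction, template_ul, dilution_fold,
                             elution_ul, seawater_l):
    """Convert qPCR gene copies per reaction into cells per liter of seawater,
    assuming one 16S rRNA gene copy per cell."""
    # Copies per microliter of undiluted DNA extract.
    copies_per_ul = copies_per_reaction / template_ul * dilution_fold
    # Total copies recovered from the extracted filter fraction.
    total_copies = copies_per_ul * elution_ul
    # Normalize to the seawater volume represented by that filter fraction.
    return total_copies / seawater_l


# Hypothetical reaction: 1000 copies from 1 µL of a 10-fold diluted extract,
# 100 µL elution volume, 10 L of seawater represented by the filter fraction.
cells_per_l = archaeal_cells_per_liter(1000.0, 1.0, 10.0, 100.0, 10.0)
```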
Pyrosequencing was conducted on all samples, targeting the archaeal 16S rRNA gene. Primers were Arch_524F (5′-TGYCAGCCGCCGCGGTAA-3′) and Arch_958R (5′-YCCGGCGTTGAVTCCAATT-3′), which showed higher coverage for the archaeal 16S rRNA gene as described recently (Cerqueira et al., 2017). Sequencing was performed using the Illumina MiSeq platform at Majorbio Bio-Pharm Technology, Co., Ltd., Shanghai, China. Sequence analysis was performed on the free online Majorbio I-Sanger Cloud Platform¹. Before analysis, sequences were demultiplexed and quality-filtered using QIIME (version 1.9.1). Sets of sequences with at least 97% identity were defined as an OTU (operational taxonomic unit), and chimeric sequences were identified and removed using UCHIME (Edgar et al., 2011). The taxonomy of each 16S rRNA gene sequence was analyzed using RDP Classifier² against the SILVA ribosomal RNA gene database using a confidence threshold of 70% (Cole et al., 2009; Quast et al., 2013).

Calculations and Statistical Analysis
Cell densities of MG I, MG II, and MG III were inferred from the total archaeal community composition derived from sequencing (Table 1) and the total archaeal cell density obtained by qPCR (Table 2) according to Equation 1 (MG I is taken as an example), where n = cell density (cells/L) and f = relative abundance (%):

$$n_{\text{MG I}} = n_{\text{total archaea}} \times f_{\text{MG I}} \tag{1}$$

The cellular lipid content (fg/cell) was calculated using the equations below for the MG I community only (Eq. 2) and for the whole archaeal community (Eq. 3), where n = cell density (cells/L) and a = lipid abundance (ng/L). Intact polar GDGTs (IP-GDGTs) were defined as the sum of HPH-, 2G-OH-, 2G-, 1G-OH-, and 1G-GDGTs. Total GDGTs are the sum of IP- and C-GDGTs.

$$\text{Cellular lipid content of MG I} = \frac{a_{\text{total or IP GDGTs}}}{n_{\text{MG I}}} \times 10^{6} \tag{2}$$

$$\text{Cellular lipid content of total archaea} = \frac{a_{\text{total or IP GDGTs}}}{n_{\text{total archaea}}} \times 10^{6} \tag{3}$$

Ring Index (RI) was calculated using Equation 4 according to Zhang et al. (2016).

Cluster analysis in this study was performed by PAST software using the unweighted pair-group average algorithm. Correlation coefficients and p-values were obtained by analysis using R software. The neighbor-joining trees were constructed using MEGA software.

¹ https://www.i-sanger.com
² http://rdp.cme.msu.edu/
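Equations 1 to 3 amount to simple unit conversions and can be sketched as follows. The function names and the input values are hypothetical and do not correspond to any sample of this study; the relative abundance is passed in percent, and the factor 10^6 converts ng to fg.

```python
def group_cell_density(n_total_cells_per_l, f_percent):
    """Eq. 1: cell density of one archaeal group (cells/L) from the total
    archaeal cell density and the group's relative abundance in percent."""
    return n_total_cells_per_l * f_percent / 100.0


def cellular_lipid_content_fg(a_ng_per_l, n_cells_per_l):
    """Eqs. 2 and 3: cellular lipid content (fg/cell) from lipid abundance
    (ng/L) and cell density (cells/L); 1 ng = 1e6 fg."""
    return a_ng_per_l / n_cells_per_l * 1e6


# Hypothetical sample: 2.0e7 archaeal cells/L, 5% MG I, 10 ng/L total GDGTs.
n_total = 2.0e7
n_mg1 = group_cell_density(n_total, 5.0)                # cells/L
per_mg1_cell = cellular_lipid_content_fg(10.0, n_mg1)   # Eq. 2
per_archaeal_cell = cellular_lipid_content_fg(10.0, n_total)  # Eq. 3
```

In this hypothetical case, attributing all GDGTs to MG I alone gives a cellular content 20 times higher than attributing them to the whole archaeal community, which mirrors the reasoning applied to the MG II-dominated samples.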

Oceanographic Settings
The N-sample set was collected in the NWPO in April 2015 and the E-sample set was collected in the East China Sea in October 2014. In situ temperature, salinity and nutrient content were determined for every sample. Salinity varied little between the two sample sets (Supplementary Table S1), but temperature and nutrient contents differed substantially. The average in situ temperature of the N-samples was 18.1 °C, with a range of 17.5-18.7 °C. The E-samples had an average of 25.4 °C, with a range of 24.3-26.4 °C (Supplementary Table S1).
The fixed nitrogen contents in the N-samples were two to four times higher than those in the E-samples. The average nitrate content of the N-samples was 4.19 µmol/L with a range of 0.77 to 8.84 µmol/L (n = 5); the average nitrate content of the E-samples was 1.1 µmol/L with a range of 0.53 to 2.05 µmol/L (n = 10). The average nitrite content of the N-samples was 0.16 µmol/L with a range of 0.02 to 0.27 µmol/L, while that of the E-samples was 0.04 µmol/L with a range of 0.01 to 0.14 µmol/L.

Archaeal Community Structure
Results of the 16S rRNA gene sequencing showed that archaeal communities differed substantially between the N- and E-sample sets. In the N-samples collected in April 2015 (Figure 1), the relative abundance of MG I ranged from 0.2 to 5.2%. MG II represented the vast majority of 16S rRNA gene reads with a range from 94.1 to 99.7% (Table 1 and Figure 2). In contrast, in most E-samples collected in October 2014 (Figure 1), MG I accounted for more than 88.4% while MG II accounted for less than 9.0% (except E-7 with MG I accounting for 54.9% and MG II accounting for 43.1%; Table 1 and Figure 2). For both sets of samples, MG III only accounted for a small proportion of archaea, from 0.01 to 3.0%, and other unclassified archaea accounted for less than 3.0% of the total archaeal sequences at all stations. Hence, we normalized the sequencing results by setting the sum of MG I, II and III to 100% (Table 1).
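The renormalization of the sequencing results to the sum of MG I, II and III can be sketched as below. The function name and the read percentages are hypothetical; they only illustrate how the unclassified fraction is dropped and the remaining groups are rescaled to 100%.

```python
def renormalize_to_100(percentages):
    """Rescale relative abundances so that the listed groups sum to 100%."""
    total = sum(percentages.values())
    if total <= 0:
        raise ValueError("the sum of relative abundances must be positive")
    return {group: value / total * 100.0 for group, value in percentages.items()}


# Hypothetical sample in which 2% of the reads are unclassified archaea,
# so the three marine groups sum to 98% before renormalization.
raw = {"MG I": 4.0, "MG II": 92.0, "MG III": 2.0}
normalized = renormalize_to_100(raw)
```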

Archaeal Cell Density and Total Lipid Content
Archaeal cell density in each sample was quantified by qPCR targeting the archaeal 16S rRNA gene. On average, 1.5 × 10⁷ archaeal cells per liter of seawater (cells/L; Table 2) were observed in N-samples (ranging from 3.2 × 10⁶ to 4.4 × 10⁷ cells/L). E-samples showed a slightly higher average archaeal cell density of 3.3 × 10⁷ cells/L (Table 2).
MG I and MG II specific cell densities were estimated by multiplying their relative abundance within total archaeal sequences by the qPCR-derived archaeal cell density (Eq. 1). As a result, MG I ranged between 1.7 × 10⁴ and 2.3 × 10⁶ cells/L in N-samples and between 7.1 × 10⁶ and 5.6 × 10⁷ cells/L in E-samples. MG II cell densities in N-samples were 1-2 orders of magnitude higher (3 × 10⁶ to 4.2 × 10⁷ cells/L) than in E-samples (2 × 10⁵ to 5.6 × 10⁶ cells/L; Table 2 and Supplementary Figure S2).
The total archaeal lipid content varied greatly within each sample set. Total archaeal lipids ranged from 2.9 to 12 ng/L (average was 6.6 ± 3.5 ng/L) in N-samples and from 1.4 to 22.3 ng/L (average was 8.4 ± 6.1 ng/L) in E-samples (Table 1).

Archaeal Lipid Distribution
Except for sample N-B3 (characterized by a predominance of C-GDGTs), all N-samples were dominated by 1G-GDGTs followed by 2G-GDGTs and C-GDGTs (Figure 3 and Table 1). These three components were also the major lipids in E-samples. In both N- and E-samples, 2G-OH-GDGTs with 0-4 cyclopentyl rings were detected, of which 2G-OH-GDGTs -3 and -4 have not been reported in previous studies. These two compounds were identified based on exact mass and retention pattern with the sMRM method and also with a parallel quadrupole time-of-flight tandem mass spectrometer (qTOF-MS) analysis (Supplementary Table S2; cf. Zhu et al., 2013). The relative abundance of 2G-OH-GDGTs ranged between 6.6 and 13.3% in the N-samples and between 2.1 and 8.9% in the E-samples. HPH-GDGTs accounted for less than 2% of total archaeal lipids in every sample and 1G-OH-GDGTs (including 1G-OH-GDGTs -0, -1, and -2) for less than 1% (Table 1). GDGTs with 0-3 cyclopentyl rings and crenarchaeol were detected (see chemical structures in Supplementary Figure S1 and relative abundances in Supplementary Table S3) but the crenarchaeol isomer, unsaturated GDGTs and OH-GDGTs were not detected. The ring distribution of 2G-GDGTs was different from those of HPH-, 1G-, and C-GDGTs. The former group was dominated by GDGTs -1, -2, and -3 while GDGT-0 and crenarchaeol predominated in the latter (Supplementary Table S3).

TABLE 2 | Cell density of Archaea based on qPCR and cell densities of Marine Groups I, II, and III inferred by archaeal community composition from sequencing (the cell numbers are equivalent to gene copies assuming one cell contains one 16S rRNA gene of studied archaea); abundances of GDGTs, IP-GDGTs and intact polar crenarchaeol (IP-Cren, including HPH-, 2G- and 1G-crenarchaeol); cellular lipid contents in GDGTs and IP-GDGTs estimated for MG I and the whole archaeal cells.
ARs accounted for 1.4 to 37.3% of total archaeal lipids (Table 1) with core unsaturated ARs (C-uns-ARs, with 1 to 7 DBE) being the most abundant AR types (Table 1 and Supplementary Figure S1). IP-ARs (including 1G- and 2G-AR), C-AR, and methoxy archaeol (MeO-AR) all represented less than 1% of total archaeal lipids in every sample (Table 1 and Figure 3).

Potential Contribution of MG II Euryarchaeota to the Archaeal Tetraether Lipid Pool
Previous studies showed that MG II usually inhabit the surface photic zone while MG I are found in deeper layers of the water column (e.g., Zhang et al., 2015; Ingalls, 2016), which is consistent with the observed microbial community in the N-samples but not in the E-samples. Seasonality may explain the contrasting community structure between the two sample sets, as it was previously observed to impact MG I (Massana et al., 1997; Murray et al., 1998; Galand et al., 2010; Hollibaugh et al., 2011; Pitcher et al., 2011b) and MG II (Pernthaler et al., 2002; Galand et al., 2010) communities. Previous studies reported MG I blooms during the season of low phytoplankton productivity (Pitcher et al., 2011a), which is consistent with the predominance of MG I in the E-samples collected in October (He et al., 2013). In contrast, the MG II-dominated samples (N-samples) were collected in spring. This is in agreement with the detection of a spring MG II bloom in surface waters of the German Bight in the North Sea (Pernthaler et al., 2002). In addition, the predominant MG II OTUs (OTU 160 and OTU 269; Figure 2 and Supplementary Figure S3) in the N-samples collected in April are phylogenetically affiliated with MG II A, which was previously observed to be predominant in summer when nutrients become depleted. In contrast, MG II B seems to be more abundant during winter when nutrients are replenished (Galand et al., 2010).
The unambiguous difference in archaeal community structure between the two sample sets offers the opportunity to analyze in detail the lipid contribution by the uncultivated MG II archaea in the samples they dominate. For this purpose, IPLs were analyzed from the same filters used for archaeal community analysis. IPLs, particularly those with phosphate head groups, are assumed to be rapidly degraded upon cell death (White et al., 1979; Harvey et al., 1986) and hence are representative of the active living archaeal community at the moment of sampling. We thus consider the overprint from dead cell biomass to be minimal.
MG I are considered the dominant source of IP-GDGTs in the ocean (Schouten et al., 2013). In this study, we determined how much lipid an MG I cell would contain if all detected GDGTs were exclusively produced by MG I cells (Eq. 2). In the E-samples, cellular GDGT content varies from 0.05 to 0.89 fg/cell (0.25 fg/cell on average; Table 2 and Figure 4A) and cellular IP-GDGT content from 0.03 to 0.61 fg/cell (0.19 fg/cell on average; Table 2 and Figure 4A) for MG I. These values are close to previous estimates for MG I cells based on both environmental and pure culture samples (1 fg/cell, Sinninghe Damsté et al., 2002b; 0.25 fg/cell, Schouten et al., 2012; 0.9 to 1.9 fg/cell, Elling et al., 2014).
In the MG II-enriched N-samples, assuming all GDGTs derive from MG I cells, the calculated cellular archaeal lipid quota would be 5.16 to 165 fg/cell (51 fg/cell on average) for total GDGTs and 1.99 to 132 fg/cell (43 fg/cell on average) for IP-GDGTs (Table 2 and Figure 4A). These values are two orders of magnitude higher than the results found in the E-samples as well as in former studies. We hypothesize that the overestimation of cellular lipid content in the N-samples may be due to neglecting GDGT production by the MG II communities, as previously suggested by Lincoln et al. (2014) for shallow and intermediate depths of the North Pacific Subtropical Gyre. Consequently, cellular lipid contents in total GDGTs and IP-GDGTs were calculated based on the total archaeal cell density (Eq. 3; Table 2). Total cellular GDGT content in the N-samples then ranges from 0.27 to 1.89 fg/cell (0.74 fg/cell on average) and cellular IP-GDGT content from 0.1 to 1.76 fg/cell (0.65 fg/cell on average; Table 2 and Figure 4B). These values are in the same order of magnitude as the estimates from the E-samples as well as from previous studies. Therefore, the observed GDGT abundances can be most plausibly explained by the production of GDGTs by MG II community members.
Our observations are inconsistent with those made in a recent study (Besseling et al., 2020), which estimated the potential contribution of MG II to the IPL pool by analysis of (sub)surface waters of the North Atlantic Ocean and the coastal North Sea. These authors did not detect IP-GDGTs and other archaeal IPLs in samples dominated by MG II and concluded that MG II contributed neither to GDGTs nor to any other known archaeal IPLs (Besseling et al., 2020). Nonetheless, potential alternative lipids belonging to the abundant MG II were not identified. Currently, we can only speculate that the inconsistency between our study and theirs may be due to geographical differences or different quantification approaches. Ultimately, the analysis of an MG II isolate, when available, will shed more light on the lipid composition of this ubiquitous planktonic archaeal group.

Differences of Lipid Distribution Between MG I and MG II Enriched Sample Sets
Cluster analysis of the lipidomic distributions suggests significant differences between the N- and E-samples, which further supports a potential contribution from MG II to the lipid pool (Figure 3). MG I-enriched E-samples show high abundances of 1G-GDGTs, 2G-GDGTs and C-GDGTs (Figure 3), which is consistent with former studies in MG I-enriched marine environments (Schouten et al., 2008; Pitcher et al., 2011b; Sinninghe Damsté et al., 2012). In addition, Elling et al. (2017) comprehensively described the lipid inventory of 10 thaumarchaeal cultures in which members of Group 1.1a, inhabiting marine environments, were characterized by high abundances of 1G-GDGTs, 2G-GDGTs and 2G-OH-GDGTs (in decreasing order of abundance). MG II-dominated N-samples exhibit lower amounts of C-GDGTs and increased amounts of 1G-GDGTs, 2G-GDGTs and 2G-OH-GDGTs. The higher relative abundance of 2G-GDGTs in MG II-dominated N-samples is consistent with observations in the Mediterranean Sea water column, where MG II were positively correlated with 2G-GDGTs (Besseling et al., 2019).
In addition, we observed high abundances of C-uns-ARs in MG I-enriched E-samples (between 7.7 and 36.4%; 23.1 ± 11.7% on average) and low values in the N-samples (between 2.7 and 10.1%; 5.6 ± 3.2% on average; Table 1). Hence, C-uns-ARs may potentially be biomarkers for MG I, as also suggested by significant correlations between MG I cell density and abundances of C-uns-ARs (except outlying sample E-7; Supplementary Figure S4a). However, C-uns-ARs have never been observed in pure cultures of MG I or in other Thaumarchaeota (Elling et al., 2017). Instead, other studies attributed C-uns-ARs to MG II or methanogen sources (Zhu et al., 2016; Baumann et al., 2018). Specifically, Zhu et al. (2016) observed that C-uns-AR 0:4 was particularly abundant in the euphotic zone of the Equatorial Pacific. Baumann et al. (2018) reported core macrocyclic archaeols (C-MARs) in two strains of (hyper)thermophilic methanogens. Thus, these compounds may be produced by a large range of Archaea, including both MG I and MG II. Furthermore, in both the N- and E-sample sets, C-uns-AR (4) dominated, followed by C-uns-AR (6), C-uns-AR (5), and C-uns-AR (3) (Supplementary Table S5). Accordingly, the unsaturation degree of C-uns-ARs showed little variation between the N- and E-sample sets (Supplementary Table S5). This suggests that the contrasting temperature and archaeal community structure between the two sample sets have little effect on the unsaturation degree of C-uns-ARs in this study.
Previous studies of SPM from surface water identified HPH-GDGTs, especially HPH-crenarchaeol, as produced by MG I (Besseling et al., 2019; Sollai et al., 2019). In this study, HPH-GDGTs are detected in every sample as a minor constituent (less than 1.5%); however, HPH-crenarchaeol systematically shows higher relative abundance in MG I-dominated E-samples than in MG II-dominated N-samples (Supplementary Table S3), further supporting potential chemotaxonomic specificity of HPH-crenarchaeol for MG I. Among core lipids, crenarchaeol (Sinninghe Damsté et al., 2002a,b; Schouten et al., 2013) and MeO-AR (Elling et al., 2014, 2017) were both postulated as biomarkers for MG I. We do not observe any correlation between MG I and crenarchaeol in the present data set, suggesting that core crenarchaeol does not appear to be a universal biomarker for MG I (Supplementary Table S4). Instead, core crenarchaeol correlates with MG II cell density (Supplementary Table S4). This is consistent with the positive correlations between the abundance of specific MG II subgroups and HPH-crenarchaeol and 2G-crenarchaeol observed in the Mediterranean Sea water column (Besseling et al., 2019). However, it is inconsistent with the genome-mining results suggesting that MG II do not contain the recently discovered genes encoding the enzymes responsible for ring insertions in GDGTs (Zeng et al., 2019). Zeng et al. (2019) also noted that MG II from natural environments may use other pathways for GDGT synthesis (including ring structures), which have yet to be characterized. MeO-AR in both the N- and E-samples shows low abundance (relative abundances less than 0.5%; Figure 3 and Table 1). The relative abundance of MeO-AR slightly increases in MG I-enriched E-samples (Figure 3 and Table 1) but there is no significant correlation between MG I cell density and MeO-AR abundance (Supplementary Figure S4b). Accordingly, its reliability as a diagnostic MG I biomarker is also questionable.
Based on the discussion above, no specific biomarkers for MG II could be identified. Previous findings based on genome analysis suggested that MG II may have the potential to synthesize mixed membranes consisting of archaeal-type ether lipids with bacterial/eukaryotic glycerol-3-phosphate (G3P) backbones (Villanueva et al., 2016; Rinke et al., 2019). However, our results demonstrate a circumstantial link between the existence of the commonly found GDGTs and the dominance of MG II in the NWPO where MG I were present at less than 5.2% of the total archaeal community.

Temperature Is the Main Driver for Archaeal Lipid Distribution in Samples With Different Archaeal Communities
The TEX86 proxy was developed to reconstruct past SSTs based on GDGT distribution and is now being regularly used in paleoceanography studies (Schouten et al., 2002, 2013; Tierney and Tingley, 2015). The prerequisite for this proxy is that temperature should be the main driver of GDGT distribution in the environment. It was indeed demonstrated in culture experiments on archaeal strains that higher temperatures led to a higher cyclization degree (Uda et al., 2001, 2004; Lai et al., 2008; Boyd et al., 2011). However, several studies pointed to additional environmental factors which could also influence GDGT cyclization degree. For instance, TEX86 values increase in late growth phases (Elling et al., 2014), at lower O2 concentrations (Qin et al., 2015) and with lower ammonia oxidation rate (Hurley et al., 2016; Evans et al., 2018). In addition, archaeal community structure is known to have an effect on GDGT distribution (Wuchter, 2006; Herfort et al., 2007; Turich et al., 2007; Elling et al., 2015). In this study, no crenarchaeol isomer was detected, precluding the calculation of TEX86. The ring index (RI), which behaves similarly to TEX86 (Schouten et al., 2002), was thus computed in order to estimate the potential impact of MG II-produced GDGTs on the TEX86 proxy (Figure 5; Eq. 4; Zhang et al., 2016).
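As an illustration of how a ring index summarizes the cyclization degree of a GDGT pool, the sketch below computes an abundance-weighted mean number of rings. The exact form of Equation 4 is not reproduced here, so the weights are an assumption: they follow the commonly cited definition in which GDGT-0 to GDGT-3 are weighted by their number of cyclopentyl rings and crenarchaeol is weighted by 4. The weights are exposed as a parameter so that a different convention can be substituted, and the abundances are hypothetical.

```python
# Assumed ring weights (not taken from this text): number of cyclopentyl
# rings per compound, with crenarchaeol counted as 4.
DEFAULT_RING_WEIGHTS = {"GDGT-0": 0, "GDGT-1": 1, "GDGT-2": 2, "GDGT-3": 3, "Cren": 4}


def ring_index(abundances, weights=DEFAULT_RING_WEIGHTS):
    """Abundance-weighted mean ring number of a GDGT pool.

    `abundances` maps compound names to abundances in any consistent unit;
    every compound must have an entry in `weights`.
    """
    unknown = set(abundances) - set(weights)
    if unknown:
        raise KeyError(f"no ring weight defined for: {sorted(unknown)}")
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return sum(weights[name] * value for name, value in abundances.items()) / total


# Hypothetical relative abundances (%) of one lipid pool.
pool = {"GDGT-0": 50.0, "GDGT-1": 10.0, "GDGT-2": 10.0, "GDGT-3": 5.0, "Cren": 25.0}
ri = ring_index(pool)
```

Because the index is normalized by the total abundance, it can be computed separately for each pool (for example 1G-, 2G-, or C-GDGTs) regardless of their absolute concentrations.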
The Ring Index of 1G- and C-GDGTs shows strong correlations with in situ temperature, with lower RI corresponding to lower in situ temperatures in MG II-dominated N-samples and higher RI to higher in situ temperatures in MG I-dominated E-samples (Figure 5 and Supplementary Table S6). This implies that in situ temperatures apparently influenced the RI of these lipid pools in both sample sets with different archaeal communities.
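The type of correlation analysis referred to above can be sketched with a plain Pearson correlation coefficient. The correlations in this study were obtained with R; the Python function below is only an illustrative equivalent of the coefficient itself (without p-values), and the temperature and RI values are hypothetical, not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only: a cooler and a warmer sample set.
temperature_c = [17.5, 18.0, 18.5, 24.5, 25.5, 26.5]
ring_index_values = [1.2, 1.3, 1.3, 1.9, 2.0, 2.1]
r = pearson_r(temperature_c, ring_index_values)
```

With only two temperature clusters, a high coefficient mainly reflects the difference between the two sample sets, so the within-set trends should be inspected as well.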
The low sensitivity of 2G-RI to temperature indicates that the cyclization degree of 2G-GDGTs may be less impacted by temperature. 2G-GDGTs are dominated by GDGTs -2 and -3 and crenarchaeol, while the other GDGT pools show higher abundances of crenarchaeol and GDGT-0 (Supplementary Table S3). In addition, the MG II-dominated N-samples show higher abundances of 2G-GDGTs (Table 1). The lack of correlation between 2G-RI and temperature may suggest a potentially high impact of the planktonic archaeal community structure on archaeal temperature reconstruction proxies. Indeed, our results suggest that (i) MG II may be significant contributors of 2G-GDGTs (Figure 3 and Table 1) and (ii) 2G-GDGT cyclization degree is only weakly impacted by temperature. These data call for further investigation aimed at determining (i) the global contribution of MG II to the archaeal lipid pool produced in the water column and (ii) the export mechanisms of IPLs, and particularly 2G-GDGTs, to the seafloor. Understanding these two key points is of prime importance for evaluating the potential impact of MG II communities on TEX86-based paleotemperature proxies.

CONCLUSION
16S rRNA gene sequencing results of two sample sets collected from surface waters of NWPO and East China Sea highlighted substantial differences in archaeal community structures between the two sample sets, with MG II dominating the former and MG I the latter. By examining the absolute lipid abundance and archaeal cell densities estimated from qPCR and sequencing analysis, we revealed a potentially high contribution of MG II to archaeal tetraether lipids in MG II-dominated samples collected from surface waters of the NWPO. This is consistent with an early observation of MG II contribution to the GDGT pool in the North Pacific Subtropical Gyre. Archaeal lipid distribution in these samples differed from the MG I-dominated samples collected from East China Sea surface waters. Notably, higher abundances of unsaturated archaeols were observed in the MG I-dominated samples than in the MG II-dominated ones. The widespread occurrence of these unsaturated compounds implies that they may be synthesized by a large range of Archaea. However, the lipid distribution differences seemed to only marginally impact the cyclization degree of the whole GDGT pool produced in the surface waters. Overall, this study provides new clues on the distribution of MG II archaeal lipids and their biological sources in oceanic surface water, while cautioning the use of archaeal lipid-based proxies for paleoclimate reconstruction.

DATA AVAILABILITY STATEMENT
The 16S rRNA gene amplicon reads (raw data) in this study have been deposited in the NCBI Sequence Read Archive (SRA) under BioProject number PRJNA603442, https://www.ncbi.nlm.nih.gov/sra/PRJNA603442.