Two complete 1918 influenza A/H1N1 pandemic virus genomes characterized by next-generation sequencing using RNA isolated from formalin-fixed, paraffin-embedded autopsy lung tissue samples along with evidence of secondary bacterial co-infection

ABSTRACT The 1918 influenza pandemic was the most devastating respiratory pandemic in modern human history, with 50–100 million deaths worldwide. Here, we characterized the complete genomes of influenza A virus (IAV) from two fatal cases during the fall wave of 1918 influenza A (H1N1) pandemic in the United States, one from Walter Reed Army Hospital in Washington, DC, and the other from Camp Jackson, SC. The two complete IAV genomes were obtained by combining Illumina deep sequencing data from both total RNA and influenza viral genome-enriched libraries along with Sanger sequencing data from PCR across the sequencing gaps. This study confirms the previously reported 1918 IAV genomes and increases the total number of available complete or near-complete influenza viral genomes of the 1918 pandemic from four to six. Sequence comparisons among them confirm that the genomes of the 1918 pandemic virus were highly conserved during the main wave of the pandemic with geographic separation in North America and Europe. Metagenomic analyses revealed bacterial co-infections in both cases. Interestingly, in the Washington, DC, case, evidence is presented of the first reported Rhodococcus-influenza virus co-infection. IMPORTANCE This study applied modern molecular biotechnology and high-throughput sequencing to formalin-fixed, paraffin-embedded autopsy lung samples from two fatal cases during the fall wave of the 1918 influenza A (H1N1) pandemic in the United States. Complete influenza genomes were obtained from both cases, which increases the total number of available complete or near-complete influenza genomes of the 1918 pandemic virus from four to six. Sequence analysis confirms that the 1918 pandemic virus was highly conserved during the main wave of the pandemic with geographic separation in North America and Europe. Metagenomic analyses revealed bacterial co-infections in both cases, including the first reported evidence of Rhodococcus-influenza co-infection. Overall, this study offers a detailed view at the molecular level of the very limited samples from the most devastating influenza pandemic in modern human history.

those 15-34 years of age in 1918-1919 were >20 times higher than in previous years (2), this decreased life expectancy in the United States by about 12 years (3).Although evidence was obtained that a filterable virus might be the causative agent of the 1918 influenza pandemic (4)(5)(6), human influenza A viruses (IAVs) were first isolated from patients in 1933, 15 years after the 1918 pandemic (7).
Genetic characterization of the 1918 virus itself began in the mid-1990s when RT-PCR methods were used to amplify tiny viral RNA fragments from formalin-fixed, paraffinembedded (FFPE) autopsy tissue blocks and a lung sample from a permafrost-preserved body of a 1918 pandemic victim (8,9).Unfortunately, the FFPE procedure results in severe RNA degradation with isolated RNA peak size under 100 nucleotides (10).The initial effort to sequence the complete genome of the 1918 IAV took approximately a decade (11) and led to the initial reconstruction of infectious virus by reverse genet ics in 2005 (12).More recently, next-generation sequencing (NGS) technology allowed more efficient genetic sequencing of several additional complete and partial 1918 viral genomes (10,(13)(14)(15).We report here the complete genomes of two 1918 fall wave cases, both initially described in a histopathological case series in 2011 (14).Collectively, these new sequences help confirm that the 1918 pandemic virus was remarkably conserved during the peak fall wave of the pandemic.
Bacterial co-infections are very common during influenza viral infections, especially in severe cases.The majority of deaths in the 1918-1919 influenza pandemic were likely a direct result of secondary bacterial pneumonia caused predominantly by common upper respiratory tract bacteria (16).The pathophysiologic mechanisms proposed to account for increased severity of disease in IAV-bacterial co-infections are numerous (16)(17)(18).NGS technologies allow metagenomic analyses of viral RNA, host RNA, and bacterial RNA from co-infections in these archived samples (10,15).In one of the 1918 samples of the current study, we identified the first known case of an IAV co-infection with a Rhodococcus species.

Biosafety and dual-use research of concern oversight
The work described in this report was evaluated in accordance with the U.S. Department of Health and Human Services (HHS) policies for dual-use research of concern (DURC) and potential pandemic pathogens (P3CO).The NIH Dual Research of Concern review entity reviewed and approved the project, and the Institutional Contact for Dual Use Research reviewed this manuscript prior to publication.The committee determined that this research did not meet the definition of DURC because the results, information, and technology are not reasonably anticipated to be directly misapplied to have a significant impact on public health.

RNA isolation and real-time RT-PCR
RNA was extracted from the postmortem FFPE lung tissue using two to three 6-µm sections (14), and the average size of the sample used is less than 0.5 cm 2 .The final volume of the isolated total RNA solution was 20 µL per sample.RNA isolated from the formalin-fixed, paraffin-embedded 1918 tissues was highly degraded, with peak RNA size under 100 nucleotides (10).cDNA was synthesized using 7 µL of isolated RNA using SuperScript III Reverse Transcriptase (Thermo Fisher Scientific, Waltham, MA) and generated 20-µL cDNA.Real-time PCR (Taqman assay) for detection of positive samples of influenza virus used Matrix primers (JT1283-Matrix Forward: 5′-ARA TGA GTC TTC TRA CCG AGG TCG-3′, JT1284-Matrix Reverse: 5′-TGC AAA GAC ATC YTC AAG YYT CTG-3′, Taqman Probe: 6 FAM-TCA GGC CCC CTC AAA GCC GA-TAMRA) as described (10), and 5 µL of cDNA was used in Taqman assay using TaqMan Universal PCR Master Mix (Thermo Fisher Scientific, Waltham, MA).

Library production and next-generation sequencing
First, isolated total RNA from FFPE samples was subjected to rRNA depletion using the NEBNext rRNA Depletion Kit (New England Biolabs, Ipswich, MA).rRNA-depleted RNA (7 µL) was treated with DNase and used to construct each total RNA library following the TruSeq RNA Library Prep Kit v2 protocol (Illumina, San Diego, CA, USA) but eliminating the poly-A selection and RNA fragmentation steps.
To produce influenza virus-enriched libraries, isolated total RNA from the FFPE sample was amplified using the Ovation RNA-Seq system V2 from NuGEN (NuGEN, San Carlos, CA).For each sample, 5 µL of DNase-treated total RNA was used as input.The ampli fied total cDNAs were analyzed by an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA).Approximately 300 ng of amplified cDNA was used to generate the Illumina sequencing library using the Agilent SureSelect Target Enrichment Kit (Agilent, Santa Clara, CA) enriched with SureSelect universal influenza virus enrichment panel (19) for Illumina Multiplex Sequencing.Final Illumina sequencing libraries were analyzed with the Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA Kit.Libraries were diluted and sequenced on an Illumina NextSeq sequencer (Illumina, San Diego, CA).From start to finish, a total of 12 sections (six section for the non-enriched library and six for the enriched library) were used for each sample.

Case reports
Both 1918 pandemic influenza autopsy cases were included in a histopathological case series described previously (14).The first 1918 case sequenced for this study is 19180929a, derived from an FFPE autopsy lung tissue block from a 28-year-old US soldier who died 2 days following admission to the Walter Reed Army Hospital, Washington, DC, on 29 September 1918.This case was positive for influenza viral antigen by immunohis tochemistry in bronchial and bronchiolar respiratory epithelial cells (Fig. 1A through D).The IAV genome from this sample was designated A/Washington DC/1/1918 (H1N1).The autopsy reported a positive culture for Streptococcus pneumoniae type 2 (14); however, the tissue Gram stain performed noted the presence of a Gram-positive pleomorphic bacteria compatible histologically with Rhodococcus spp.(Fig. 1E).
The second case sequenced in this study, 19180926c, is derived from an FFPE autopsy lung tissue block from a 21-year-old US soldier who died on 26 September 1918 after a 7-day hospitalization in Camp Jackson, SC.No post-mortem culture results were reported, and immunohistochemistry for influenza A viral antigen and tissue Gram stain were also not previously reported.This is the initial 1918 case for which a partial viral genome sequence was reported (8,9).The IAV genome from this sample was previously designated A/South Carolina/1/1918 (H1N1) (8,9).Due to the extremely limited amount of the fixed material in this case, no tissues could be re-embedded for secondary histopathological analysis.

Next-generation sequencing results
Initial real-time PCR (Taqman assay) screening showed positive detection of a conserved IAV matrix M1 amplicon (30) for samples 19180929a and 19180926c with corresponding quantification cycle (Cq) (31) (threshold cycle (Ct) from the instrument) values of 38.861 and 38.841, respectively.Subsequently, Illumina sequence cDNA libraries were prepared from both samples.
In the case of sample 19180929a, the constructed library was diluted and sequenced on a NextSeq 500 sequencer with an 80-bp single-read run.At completion, 26.5 Gb raw sequence data (239.63 million raw reads) were generated, with 89.8% of total bases having Phred score of Q30 or higher.Alignment of all the reads to the 1918 pandemic reference genome revealed that only 0.01% of reads aligned with IAV gene segment coverage ranging from 17.94% of neuraminidase (NA) to 97.13% of nucleoprotein (NP).
For sample 19180926c, the constructed library was diluted and sequenced on a NextSeq 500 sequencer with a 150-bp single-read run.At completion, 41.5 Gb raw sequence data (252.70 million raw reads) were obtained, with 79.9% of total bases having Phred score of Q30 or higher.Alignment of all the reads to the 1918 pandemic reference genome revealed that 3.72% of reads aligned with segment coverage ranging from 73.62% of NA to 100% of HA, NP, PB1, and PB2.
To improve the total number of IAV reads in both samples, influenza virus-enriched libraries were constructed using a universal influenza virus enrichment probe set as previously described (19) and re-sequenced.For sample 19180929a, the universal influenza virus-enriched cDNA library was diluted and sequenced on a NextSeq 500 sequencer with an 80-bp single-read run.At completion, 31.3Gb raw sequence data (324.25 million raw reads) were obtained, with 75.0% of total bases having Phred score of Q30 or higher.Notably, after aligning all the reads to the 1918 pandemic reference genome, a significantly improved alignment rate of 40.18% was achieved following enrichment.
For sample 19180926c, the constructed universal influenza virus-enriched library was diluted and sequenced on a NextSeq 500 sequencer with a 160-bp single-read run.At completion, 34.9 Gb raw sequence data (210.60 million raw reads) were generated, with 88.9% of the total bases showing a Phred score of Q30 or higher.Aligning all the reads to the 1918 pandemic reference genome revealed a similarly improved alignment rate, with 48.94% aligning to the reference genome.
Hence, utilizing the universal influenza virus probe set (19) showed substantial enrichment effects on IAV reads from both 1918 influenza pandemic samples.By combining enriched and unenriched sequence data, the alignment results to the reference 1918 influenza viral genome for both samples are summarized in Table 1.

Consensus sequences
Despite the significant improvement in coverage achieved through enrichment, 100% coverage was still needed for specific viral gene segments of both samples; for example, the lowest segment coverage (91.96%) with the PA segment of 19180929a.Sample 19180929a had eight noncovered regions, and sample 19180926c had six noncovered regions (Table S1).PCR primers were designed specifically for the noncovered regions to close these gaps (Table S1).PCR amplifications were performed on cDNAs derived from both samples, and the PCR products were directly sequenced using Sanger sequencing by Psomagen (Rockville, MD).All the noncovered regions from Illumina sequencing were successfully closed by subsequent PCR amplifications.All the obtained sequen ces matched the 1918 reference genomes except for a single synonymous nucleotide change (C->T) at nucleotide position 462 of the NA segment in both samples.
Consensus sequences of the 1918 pandemic influenza viruses from samples 19180929a and 19180926c were generated by combining Illumina sequence data with super majority rule at variant sites (>90% variant frequency) and PCR amplicon sequencing data.Both consensus sequences have been submitted to GenBank with the following accession numbers: from OR382000 to OR382007 for sample 19180929a and OR391521 to OR391527 for sample 19180926c, respectively.
To further analyze the obtained consensus sequences, a comparison was made with the four published complete/near-complete 1918 IAV genomes (10,11,15).The nucleotide and amino acid differences between these sequences are shown in Table 2. Pairwise distances and a maximum likelihood unrooted phylogenetic tree were also generated among these six genomes from a concatenated nucleotide sequence of all eight respective IAV gene segments, as shown in Table 3 and Fig. 2A.The analysis reveals that the two newly reported German-origin 1918 IAV genomes (15) form one cluster, while the six US genomes cluster separately.

SNP analyses
In comparison to the 1918 pandemic reference genome, A/Brevig Mission/1/1918 (H1N1), a total of 76 SNPs were identified in sample 19180929a, with 46 of them encoding nonsynonymous (NS) changes.Similarly, sample 19180926c exhibited 69 SNPs, with 42 nonsynonymous changes (Table 4).Fifteen SNP sites are shared between samples 19180929a and 19180926c, of which 7 of them were nonsynonymous (yellow highlighted rows in Table 4).Among these common seven nonsynonymous SNPs, only one at nucleotide position NA185 (corresponding to amino acid position NA62) differs between sample 19180929a (nucleotide change T185A, resulting in amino acid change V62E) and sample 19180926c (nucleotide change T185G, resulting in amino acid change V62G).
There is another nonsynonymous change at PB2 amino acid position 333 in samples 19180929a and 19180926c.In sample 19180929a, a C/T base variation with a frequency of 46.29% (T) at nucleotide position 998 in PB2 was observed, which could result in the change from ACA(T) to ATA(I).However, in sample 19180926c, a variation of A/G with a frequency of 42.30% (G) at nucleotide position 997 in PB2 is observed, potentially resulting in the change from ACA(T) to GCA(A) at the same amino acid position 333.All 88 nonsynonymous SNPs observed in samples 19180929a and 19180926c at 81 unique sites are shown in Table 4.
In a recent study by Patrono et al. (15), the 1918 pandemic IAV genome was successfully recovered from lung samples from formalin-fixed 1918 specimens in Germany.One small uncertain sequence region within the original Brevig Mission 1918 IAV PA sequence (GenBank accession: DQ208311.1)(11) was identified.To re-evaluate the sequence of this PA region, RNA was isolated from ten 6 µM sections from the original Brevig Mission FFPE lung block.PCR primers (Table S1) were designed to amplify the specific PA region of interest.The RT-PCR was positive from only 1 of 10 sections, and the amplified product was subsequently sequenced.The sequencing results confirmed the presence of the seven base differences previously identified by Patrono et al. (15) as compared to the original Brevig Mission PA sequence.Only one of these seven base differences, specifically the A to G change at nucleotide position PA473, resulted in an NS amino acid change (PA K158R), a residue located in the endonuclease domain of the IAV RNA-dependent RNA-polymerase PA protein (32).Consequently, the influenza virus A/Brevig Mission/1/1918 PA sequence was updated in GenBank with the new accession number DQ208311.2.

Nonsynonymous changes in consensus sequences
The consensus sequences of the two new 1918 IAV genomes were next compared to the other four available complete 1918 IAV genomic sequences (Fig. 2B; Table S2).A total of 29 NS changes were called between the consensus genomes across all eight gene segments.The A/Brevig Mission/1/1918 reference sequence differed by three NS SNPs a RNA polymerase proteins (PB2, PB1, and PA), hemagglutinin (HA), nucleoproteins (NP), neuraminidase (NA), matrix protein (MP), nonstructural protein (NS); missing base (-).classified reads (35%) were human reads under Vertebrata, followed by reads classified as bacteria and IAV (Fig. 3A).Among the bacteria, the highest percentage of classified reads belonged to Streptococcus (22%) (Fig. 3C), which matched the original histopathology of the 1918 H&E-stained lung sections showing acute bacterial pneumonia, even though there were no available postmortem lung bacteriological results reported for this patient (14).Surprisingly, for case 19180929a, a significant proportion (82%, 36,367,771 reads) of the classified reads were identified as Rhodococcus, while human reads accounted for only 1%, and virus reads represented a mere 0.01% (Fig. 3B).Among the bacteria, a striking 96% of the reads were identified as Rhodococcus (Fig. 3D).Within the Terrabacte ria group, 1% of the reads were classified as Firmicutes (Fig. S1A), and among the Firmicutes, 59% of the reads were identified as Streptococcus (Fig. S1B).Thus, in the case of sample 19180929a, the dominant reads originated from Rhodococcus, while only 0.8% of the reads were from Streptococcus.It is unclear whether the postmortem culture results of Streptococcus pneumoniae type 2 reflected culture from the blood or lung.Due to the significant proportion of Rhodococcus reads obtained in sample 19180929a, all the reads obtained from the unenriched library of sample 19180929a were aligned to various Rhodococcus genomes, including species in groups A, B, C, D, E, and F (33).The alignment results are summarized in Table 5.As shown in Table 5, the highest rates of total aligned reads and reference chromosome coverage were observed for group D strains of Rhodococcus, such as Rhodococcus qingshengii, with 53.09% of total reads aligned and 95.68% of the chromosome covered.Similarly, Rhodococcus sp.008 demonstrated 57.16% of total reads aligned and 92.96% of the chromosome covered.The annotated 16S rRNA regions from both genomes (R. qingshengii and Rhodococcus sp.008) were matched 100% by reads from 19180929a.These results reveal that, in addition to the 1918 IAV infection, Rhodococcus was a prominent bacterial co-infection in the lung tissue of this patient (Fig. 1) and likely contributed to the fatal outcome.The consensus chromosomal sequence of Rhodococcus was obtained based on the alignment on R. qingshengii (GenBank accession number: NZ_AP023172) reference genome and subsequently submitted to GenBank with accession number CP138905.The annotation of the reference genome (NZ_AP023172) revealed the presence of several antibiotic resistance genes, for example, erythromycin esterase family protein (RQCS_RS21590: 4717066 to 4718289), rifampin monooxygenase Iri (RQCS_RS09890: 2156973 to 2158400), aminoglycoside phosphotransferase family protein (RQCS_RS07315: 1570880 to 1571767), beta-lactamase family protein (RQCS_RS03560: 780011 to 781150; RQCS_RS04665: 1027476 to 1028891; RQCS_RS09600: 2090911 to 2092107; RQCS_RS09610: 2093620 to 2094705), and ribosome methyltransferase (RQCS_RS03425: 750266 to 751237; RQCS_RS11555: 2507605 to 2508171; RQCS_RS11835: 2561743 to 2562852; RQCS_RS16580: 3655841 to 3656854; RQCS_RS17455: 3832907 to 3833668; RQCS_RS20910: 4585140 to 4586039; RQCS_RS20955: 4595206 to 4596036; RQCS_RS28875: 6233984 to 6234664) (34)(35)(36)(37)(38)(39).Our sequences have completely covered all the antibiotic resistance genome regions, with only a few SNPs.Therefore, the presence of antibiotic resistance genes in this Rhodococ cus strain, obtained from the year 1918 when commercial antibiotics were not available, further supports the idea that antibiotic resistance is a natural phenomenon predating the modern selective pressure of clinical antibiotic use (40).

genome comparison
To date, six complete/near-complete IAV genomes have been obtained from the 1918 influenza pandemic.Among them, two BE572 and MU162 (A/Berlin/572/1918 and A/ Munich/162/1918) are from Germany and four are from the United States.The pair wise genome differences observed among the four US genomes are from 0.087% to 0.166%, and that between the two German genomes is 0.199% (Table 3).However, the genome differences between US and German genomes are from 0.228% to 0.463% (Table 3).Therefore, viral genomes from the same continent showed lower overall divergence than genomes from the other continent, which is consistent with results from a recent study (15).In addition, the genome differences between summer 1918 pre-peak BE572 (A/Berlin/572/1918) and fall 1918 peak wave cases MU162, 19180929a, 19180926c, 19180924d, and Brevig (A/Munich/162/1918, A/Washington DC/1/1918, A/ South Carolina/1/1918, A/New York/3/1918, and A/Brevig Mission/1/1918) are from 0.228% to 0.279%, which is greater than those among genomes from the peak period (0.087%-0.166%).This is also consistent with the result from the recent study (15).Even so, six genomes still reflect a very small sample size to evaluate the evolution and spread of the 1918 pandemic IAV.For comparison, there are over 15 million SARS-CoV-2 genome sequences in public databases (41).Obtaining additional genomes from archival samples around the pandemic will provide more comprehensive information about IAV genomes and their evolution before, during, and after the 1918 pandemic.

Quasispecies
RNA viruses are well known for their high mutation rates due to the error-prone characteristic of their RNA-dependent RNA polymerase, with the mutation rates from 10 −6 to 10 −4 substitutions per nucleotide per cell infection (s/n/c) (42).This high mutation rate of RNA viruses, including IAV, leads to significant genomic variations that allow the pathogen to overcome host defenses to prevent infection and allow for rapid selection.Specifically for IAV, it has also been reported that more than 90% of influenza A virions may be "noninfectious" (43).Therefore, when performing high-throughput sequencing on the pool of nucleotide molecules from our samples, the nucleotides from all these mutated viral molecules can be recovered.In our study, for example, among the identified 88 nonsynonymous SNPs, there are 5 SNPs that led to premature stop codons (Table 4), which should result in defective IAV particles.Another possible explanation for observing these nonsense stop codon SNPs is because of the deamination reaction from the FFPE process that increases cytosine (C) to thymine (T) and guanine (G) to adenine (A) mutations (known as (C:G > T:A) mutations) (44).Among these five stop codon SNPs in our study, four of them involved C to T transitions and one involved a G to A transition.The reference genome [A/Brevig Mission/1/1918(H1N1)] was obtained predominantly from ethanol-fixed samples (8,9).In addition, it is important to acknowledge that potential errors can arise during the reverse transcription, PCRs, and sequencing process.Therefore, we applied conservative supermajority rules, requiring a minimum variant frequency of at least 90%, when generating consensus sequences at the mutation sites, which is the same method as in our previous 1918 IAV next-generation sequencing study (10).By adopting this strategy, we sought to minimize the impact of sequencing errors and enhance the reliability of the identified mutations.
As shown in Table S2, 29 NS SNPs were identified in the consensus alignments of the six available complete/near-complete 1918 viral genomes.Where these changes are associated with known functional domains, this has been noted in last column of Table S2.Nineteen of these changes were noted in the viral RNP complex-encod ing gene segments.Some of these mutations have been recently characterized in the field (references in Table S2).The PB2 631M/L mutation was found in the A/Munich/162/1918 H1N1 isolate (15), which was recently found in correlation to breakthrough infections of an H5N1 IAV strain in genetically modified chickens for ANP32A (45).Two mutations to the IAV NP, 16D/G and 283P/L, have been connected to the evolution of NP-driven resistance to human antiviral protein MxA (46)(47)(48).To assess the physiologic significance of many of these other changes in domains that are significant to IAV vRNP or RdRp function will require functional studies in concert with cell-based in vitro and in vivo studies with infectious viruses.

Bacterial co-infection and Rhodococcus infection in case 19180929a
In severe cases of IAV infection, bacterial co-infections are frequent and contribute to severe disease and fatal outcomes.For example, during the 1918-1919 influenza pandemic, most deaths were associated with influenza viral pneumonia and secondary bacterial pneumonia co-infections (16,49).Despite the availability of advanced medical services and numerous antibiotics, bacterial co-infection remained prevalent during the latest global influenza pandemic in 2009, caused by a novel influenza A (H1N1) virus and resulted in an estimated 151,700-575,400 deaths worldwide just during the first year of its circulation (50).For example, one report showed that 30.3% of patients infected with the 2009 pandemic influenza A (H1N1) admitted to the intensive care unit in the United States had bacterial co-infections (51).In Australia and New Zealand, during the 2009 H1N1 pandemic, 20.3% of patients in critical care services were clinically diagnosed with bacterial pneumonia (52).In fatal cases of the 2009 influenza pandemic, bacterial co-infection can be as high as 55% (49).The common bacteria found in co-infection with influenza virus include Haemophilus influenzae, Pseudomonas species, Staphylococcus aureus, Streptococcus pneumoniae, and Streptococcus pyogenes (53), reflecting bacteria often carried in the human nasopharynx.In the case of 19180926c, Streptococcus, including Streptococcus pneumoniae was the most prevalent bacterial species identified.However, in the case of 19180929a, although 0.7% of total reads (0.8% of bacterial reads) were classified as Streptococcaceae, which matches the post-mortem culture results from 1918 (14), the top classified reads (82% of total reads and 96% of bacterial reads) were Rhodococcus (Fig. 3).
The genus Rhodococcus, together with the genus Nocardia, is a member of fam ily Nocardiaceae and is classified in the suborder Corynebacterineae (54).Pulmonary nocardiosis is an invasive infection caused by Nocardia, a partial acid-fast microorganism, and the common route of pulmonary infection is inhalation of contaminated aerosols from environmental sources (55).Rhodococcus is a genus of aerobic, nonsporulating, nonmotile Gram-positive bacteria, mostly found in a broad range of environments, including soils, groundwater, marine sediments, animal dung, and the guts of insects as well as from healthy and diseased animals and plants (56,57).Rhodococcus includes the following known human pathogens: R. equi, R. erythropolis, R. gordoniae, R. fascians (R. luteus), and R. rhodochrous (complex) (58).The most commonly reported species associated with human infections is Rhodococcus equi (59) which was first isolated from Swedish foals (60).The first report of R. equi human infection was in 1967 by Golub et al. from lung abscess of a patient with hepatitis who had received immunosuppressive drugs (61).Rhodococcus equi infections in humans are mostly observed in immunocom promised patients (62,63).
In the 19180929a sample, it is likely that the bacteria identified in the Gram stain were Rhodococcus since the bacteria had pleomorphic morphology and were partially acid-fast positive after staining by the Ziel-Nielson procedure (Fig. 1E).Reads obtained from sample 19180929a were aligned to reference genomes of R. equi (group A), R. erythropolis (group D), R. gordoniae (group B), R. fascians (R. luteus) (group E), and R. rhodochrous (group D), respectively (Table 5).Significant alignment was observed only with group D Rhodococcus, including R. erythropolis and R. rhodochrous (33).However, just based on the analysis of obtained sequences, including the alignment to different Rhodococcus species, it is challenging to determine the exact Rhodococcus species detected in this sample because the top alignment rate is from Rhodococcus sp.008, while the top chromosome coverage is from Rhodococcus qingshengii (Table 5).Clearly, the Rhodococcus species sequenced belongs to group D Rhodococcus, not from group A which includes R. equi.We are not aware of any previous reports documenting IAV and Rhodococcus co-infection, and the overwhelming presence of group D Rhodococcus reads in sample 19180929a strongly suggests that this might be the first reported case of IAV and Rhodococcus co-infection, which likely played a direct role in the patient's fatal outcome in 1918.

Limitations
When working on old, archived samples, particularly when identifying low levels of viral or bacterial nucleic acids within them, preventing DNA contamination throughout the entire experimental process is of utmost importance.The handling and processing of the samples described here were performed exclusively in one laboratory pre-PCR room specifically designated for 1918 samples.Moreover, the laboratory where the extraction and sequencing occurred and other laboratories within the same building have never conducted prior research on Rhodococcus bacteria.In addition, the number of reads classified as Rhodococcus in sample 19180929a exceeded 36 million, coupled with the histologic analyses by Gram and acid-fast staining, making it highly improbable for this material to have been introduced during our experimental procedures.Of course, environmental contamination and postmortem overgrowth at the time of autopsy in 1918 cannot be ruled out.However, the well-preserved nature of the tissue sections from the material (Fig. 1) suggests that the postmortem interval prior to autopsy and sample fixation was brief.

Conclusions
Our current study identified two IAV RNA-positive FFPE postmortem lung samples from September 1918, with deaths occurring during the peak wave of the pandemic.Subsequently, we employed advanced high-throughput sequencing technology with PCR amplification targeting specific regions to obtain two complete IAV genomes from this critical period.This increases the total number of available near-complete IAV genomes of the 1918 pandemic from four to six.Our study provides valuable insights into the genomic diversity of the 1918 influenza A pandemic virus, which caused the most devastating influenza pandemic in documented human history (64).The availa bility of these additional genomes will aid in unraveling the origin of the pandemic virus and its subsequent evolutionary path.It will also contribute to understanding the relationship between the 1918 pandemic strain and subsequent seasonal influenza A viruses.Identification of additional IAV RNA-positive 1918 pandemic postmortem case samples is needed, especially from the period before the pandemic peak in September-November 1918 (2).Additionally, an important finding from our study is the identification of the first reported Rhodococcus-influenza A virus co-infection to our knowledge.This discovery emphasizes the potential impact of co-infections in severe influenza cases and underscores the necessity for further research and comprehension of these interactions.

a
The shaded rows are the same between 19180929a and 19180926c.

TABLE 1
Alignment results of 1918 pandemic influenza genome from samples 19180929a and 19180926c

TABLE 2
Differences among obtained 1918 influenza genomes a

TABLE 2
Differences among obtained 1918 influenza genomes a (Continued)

TABLE 3
Pairwise distances among 1918 influenza genomes

TABLE 4
Nonsynonymous SNPs a

TABLE 5
Alignment results for Rhodococcus species from sample 19180929a