A 6-year retrospective study of clinical features, microbiology, and genomics of Shiga toxin-producing Escherichia coli in children who presented to a tertiary referral hospital in Costa Rica

ABSTRACT Shiga-toxin-producing Escherichia coli (STEC) is associated with diarrhea and hemolytic uremic syndrome (HUS). STEC infections in Costa Rica are rarely reported in children. We gathered all the records of STEC infections in children documented at the National Children’s Hospital, a tertiary referral hospital, from 2015 to 2020. Clinical, microbiological, and genomic information was analyzed and summarized. A total of 3,768 diarrheal episodes were reviewed. Among them, 31 STEC were characterized (29 fecal, 1 urine, and 1 bloodstream infection). The prevalence of diarrheal disease due to STEC was estimated at 0.8% (n = 29/3,768), and HUS development was 6.4% (n = 2/31). The stx1 gene was found in 77% (n = 24/31) of STEC strains. In silico genomic predictions revealed a predominance of serotype O118/O152:H2, accompanied by a cluster exhibiting allele differences ranging from 8 to 33, using a core-genome multilocus sequence typing (cgMLST) approach. This is the first study using a genomic approach for STEC infections in Costa Rica. IMPORTANCE This study provides a comprehensive description of clinical, microbiological, genomic, and demographic data from patients who attended the only pediatric hospital in Costa Rica with Shiga-toxin-producing Escherichia coli (STEC) infections. Despite the low prevalence of STEC infections, we found a predominant serotype, O118/O152:H2, highlighting the pivotal role of genomics in understanding the epidemiology of public health threats such as STEC. Employing a genomic approach for this pathogen for the first time in Costa Rica, we identified a higher prevalence of STEC in children under 2 years old, especially those with gastrointestinal comorbidities, residing in densely populated regions. 
Limitations such as potential geographic bias and lack of strains due to direct molecular diagnostics are acknowledged, emphasizing the need for continued surveillance to uncover the true extent of circulating serotypes and potential outbreaks in Costa Rica.

anemia and nephropathy, which is triggered by the action of Shiga toxins when absorbed into the systemic circulation and which can be exacerbated by the use of antibiotics (4,5). It is estimated that up to 10% of patients with STEC infection may develop HUS, with a case-fatality rate ranging from 3% to 5% (6). Cattle are considered a major reservoir of STEC affecting humans (7). STEC is transmitted to humans primarily through the consumption of contaminated foods, such as raw or undercooked ground meat products, raw milk, and contaminated raw vegetables and sprouts (1).
Classic bacterial pathogen detection methods such as culture, microscopy, and biochemical tests are laborious and time-consuming (4,8). These assays are not useful for detecting STEC, since phenotypic assays based on virulence properties or molecular methods are required for the specific identification of virulence factors such as the Shiga-toxin genes stx1 and stx2 (9).
The continuous expansion in massive parallel sequencing (whole-genome sequencing, WGS) marks a groundbreaking leap forward by enhancing the capabilities to explore in more detail bacterial characteristics, including sequence type, serotype, genes linked to antimicrobial resistance, and phylogenetic relationships. It can also potentiate clinical practice and public health interventions, e.g., when archived data or bacterial isolates are used to determine phenotypic traits such as resistance mechanisms to antimicrobials, virulence factors, and association with outbreaks and mortality (10).
The National Children's Hospital of Costa Rica (HNN-CCSS) is a tertiary referral hospital within the socialized medical care system and the only pediatric hospital in the country. STEC strains isolated by the National Laboratories require confirmation by the National Reference Center (NRC) for Bacteriology or the NRC for Microbiological Food Safety (Instituto Costarricense de Investigación y Enseñanza en Nutrición y Salud, Inciensa). Between 2005 and 2016, 52 cases of STEC infections were documented in children in Costa Rica, all from HNN-CCSS (11,12).
Here, we provide the first genomic study of STEC in Costa Rica, based on a nationwide surveillance program using whole-genome sequencing. We also analyze the clinical profiles of the children from whom these STEC were recovered and describe possible associations between genomic findings, epidemiology, and illness.

Demographic, clinical, microbiological, molecular, and epidemiological data
All data from the microbiological analysis of stool cultures and STEC identification between January 2015 and July 2020 in patients under 13 years of age were extracted from HNN-CCSS laboratory information systems: Labcore (Rochem Biocare, Bogotá, Colombia) and Copernico (Biomérieux, Marcy L'étoile, France) and collated with Microsoft Excel 365. The study included patients ranging from 0 to 13 years of age, reflecting the age range covered by the HNN-CCSS. All patients with STEC detection in at least one fecal or extraintestinal sample were selected. Health records were reviewed for each patient using a standardized form. Demographic (age, sex, and location/province), clinical (underlying diseases, hospitalization, peripheral blood tests, HUS development, and outcome), microbiologic (sample type and characteristics and susceptibility test), molecular and genomic (PCR results and whole-genome sequencing data) variables were analyzed.

Bacterial culture and polymerase chain reaction
Pathogen screening was performed by culturing the stools in several media including Campylobacter agar, Salmonella-Shigella agar, blood agar supplemented with ampicillin (5%), and Tergitol-7 agar. The predominant growth of yellow lactose-fermenting colonies on Tergitol-7 agar (Thermo Scientific Oxoid) was selected for end-point PCR. DNA extraction was performed using the Maxwell Cell-DNA extraction protocol (Promega Corp, Madison, WI, USA) according to the manufacturer's instructions. PCR assays targeting STEC genes eaeA, stx1, stx2 (STEC), and rfbE (O157) were conducted using primers and conditions previously described (11,13). When available, samples with STEC detection through the FilmArray GI Panel (BioFire Diagnostics, LLC, Salt Lake City, UT, USA) were also cultured on Tergitol-7 agar and analyzed with the PCR assay mentioned above.
STEC from other sources were identified by PCR upon medical request due to clinical manifestations of the patients (impaired kidney function) together with the isolation of Escherichia coli. Identified STEC isolates were referred to Inciensa for further confirmatory analyses. These analyses included verification of the genes by multiplex end-point PCR targeting the eaeA, stx1, stx2, and rfbO157 genes, as described by Leotta et al. (14).

Whole-genome sequencing
Whole-genome sequencing was performed in both facilities (HNN-CCSS and Inciensa) using DNA extracted with the same protocol described for the PCR assays. The Illumina Nextera Flex kit (Illumina Inc., San Diego, CA, USA) and Illumina MiSeq platforms were used for library preparation and sequencing. At the HNN-CCSS, a 300-bp paired-end fragment protocol recommended by Illumina (Illumina Inc., San Diego, CA, USA) was used. The PulseNet International SOP (https://pulsenetinternational.org/protocols/wgs/) was executed at Inciensa for bioinformatic analyses. First, the quality of the 250- or 300-bp sequence reads was checked with FastQC (16). Bioinformatic analyses were performed using BioNumerics v7.6.3 software (Applied Maths Inc., Biomérieux, Austin, TX, USA). Genome assembly was performed with the Applied Maths cloud-based calculation engine, and the resulting genomes were uploaded to NCBI (see the supplemental material). A wgMLST scheme for Escherichia coli was used to determine the allelic profile (i.e., unique sequences and their variation due to mutations produced by evolutionary events). Two algorithms were used for the identification of alleles: the assembly-free algorithm identified alleles from the raw sequence reads using a k-mer-based approach, and the assembly-based algorithm identified alleles from the de novo SPAdes-assembled genomes (17) using BLAST (18). Only alleles identified by both calling algorithms were considered present in the genome. The E. coli functional genotyping plugin v1.2 was used for predicting phenotypic traits such as virulence factors, antimicrobial resistance, O/H predictions, prophages, and plasmid detection. For the cluster identification approach, a dendrogram was constructed using the categorical values similarity coefficient and the unweighted pair group method with arithmetic mean (UPGMA) for hierarchical clustering.
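The consensus allele calling and the clustering step described above can be illustrated with a minimal sketch. It is not the BioNumerics implementation: the function names and the toy allelic profiles are hypothetical, and reading "identified by both calling algorithms" as "both callers report the same allele number for a locus" is an assumption made for illustration. The distance used here is the fraction of shared loci with different alleles (one minus a categorical similarity), and clusters are merged by average linkage over all isolate pairs, which is the UPGMA criterion.

```python
from itertools import combinations


def consensus_calls(assembly_free, assembly_based):
    """Keep only loci for which both callers report the same allele (assumed rule)."""
    return {locus: allele for locus, allele in assembly_free.items()
            if assembly_based.get(locus) == allele}


def allele_differences(p, q):
    """Count loci called in both profiles whose allele numbers differ."""
    shared = p.keys() & q.keys()
    return sum(1 for locus in shared if p[locus] != q[locus])


def categorical_distance(p, q):
    """Fraction of shared loci with different alleles (1 - categorical similarity)."""
    shared = p.keys() & q.keys()
    if not shared:
        return 1.0
    return allele_differences(p, q) / len(shared)


def upgma(profiles):
    """Return the merge history as (cluster_a, cluster_b, mean_distance) tuples."""
    leaf = {frozenset(pair): categorical_distance(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def mean_distance(c1, c2):
        # UPGMA: arithmetic mean over all isolate pairs between the two clusters
        total = sum(leaf[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in profiles]
    merges = []
    while len(active) > 1:
        c1, c2 = min(combinations(active, 2), key=lambda pair: mean_distance(*pair))
        merges.append((c1, c2, mean_distance(c1, c2)))
        active = [c for c in active if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical four-locus profiles for three isolates
toy_profiles = {
    "A": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": 1},
    "B": {"locus1": 1, "locus2": 1, "locus3": 1, "locus4": 2},
    "C": {"locus1": 2, "locus2": 2, "locus3": 2, "locus4": 2},
}
for left, right, dist in upgma(toy_profiles):
    print(left, right, dist)
```

In a real scheme the profiles would hold thousands of loci, and the raw count from `allele_differences` is the quantity reported when a cluster is described by its maximum number of differing alleles.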

Statistical analyses
All data were recovered and collated, and descriptive statistics were calculated, using Microsoft Excel. Statistics Kingdom online software was used for statistical analysis. Continuous variables were expressed as median and H-spread (interquartile range, IQR).
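As an illustration of how a continuous variable can be summarized as median and IQR, the following minimal sketch uses only the Python standard library. The ages are hypothetical values, not study data, and the "inclusive" quartile method is an assumption: spreadsheet and online tools may use a different quartile convention, and hinges in the strict sense can differ slightly from quartiles.

```python
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) using the inclusive quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q1, q3


# Hypothetical ages in months, for illustration only
ages_months = [3, 8, 11, 14, 20, 26, 60]
med, q1, q3 = median_iqr(ages_months)
print(f"median {med} months (IQR {q1}-{q3})")
```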
Diarrhea was the main manifestation in which STEC was identified as the causative agent (94%; n = 29/31). The two remaining cases were identified by direct isolation of STEC from blood and from urine, respectively. Blood and mucus in diarrhea were documented in 79% (n = 23/29) of diarrheal cases.

Detection of eae, stx1, stx2, and rfbE genes
The Stx1-encoding gene was found in 24/31 (77%) of the cases, while 7/31 (23%) were positive for stx2. Only one case (3%) was found to harbor both stx1 and stx2 genes. The intimin-encoding gene eae was detected in 15/31 (48%). The rfbE gene was not detected in any of the 31 patients included in the study.
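The percentages above follow directly from the reported counts over the 31 cases. A minimal sketch of the arithmetic, in which the helper name `percent` and the dictionary layout are illustrative choices:

```python
def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


TOTAL_CASES = 31
# Counts as reported in the text
detections = {"stx1": 24, "stx2": 7, "stx1 and stx2": 1, "eae": 15, "rfbE": 0}

for gene, count in detections.items():
    print(f"{gene}: {count}/{TOTAL_CASES} ({percent(count, TOTAL_CASES)}%)")
```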

Genomic similarity analysis and serotyping
Two clusters were identified according to the core genome analysis. The main cluster comprised seven strains that spanned 4 years (from 2015 to 2018) and belonged to the O118/O152:H2 serotype, exhibiting a maximum difference of 33 alleles. Within this cluster, three strains differed by fewer than eight alleles. All O118/O152:H2 strains harbored the mdfA, stx1, and eaeA genes, encoding a membrane protein belonging to the major facilitator superfamily of transport proteins, Shiga-like toxin, and intimin protein, respectively. A predominance in children under 1 year of age (5/7, 71%) was observed, although statistical significance was not established (Chi square 1.52, P = 0.21). The two remaining cases were detected in 3- and 7-year-old children. A second cluster with two E. coli O103:H2 strains was also identified; these differed from each other by a single allele in the core genome analysis, even though they were recovered 1 year apart (Fig. 3). All clustered isolates were recovered from children living in different areas of the central valley.

DISCUSSION
This is the first WGS-based study of STEC in Costa Rica. The genomic approach has been demonstrated to be a powerful tool to better investigate and understand the epidemiology of public health threats. Our findings provide further evidence of a modest circulation of STEC within the pediatric population of Costa Rica, mainly in children under 2 years of age who exhibited other gastrointestinal comorbidities and within the most densely populated regions (11,12). The main clinical presentation we found was bloody diarrhea with mild leukocytosis. We could not recover 10 strains that had been detected by molecular methods, even when traditional culture methods were applied in parallel. This might bias the prevalence estimation, due to the exclusion of cases that did not meet the predefined inclusion criteria. Rapid molecular strategies provide significant help in establishing the etiology of gastroenteritis, but it may be difficult to recover the agent for downstream analyses such as genomic-based epidemiology and outbreak investigation. Using a combination of conventional PCR and a validated genomic approach (19), we inferred in silico serotypes, virulence factors, and similarities among the strains under study. We found a predominant and clustered circulation of the O118/O152:H2 serotype, as well as a small cluster formed by O103:H2 strains. These clusters did not exhibit spatial-temporal association, limiting the probability of being related to an outbreak. While no statistical association was demonstrated, five out of seven cases were detected in children under 1 year of age. This finding highlights the importance of continuing these studies, as infants can be affected in large proportions, as seen in the most recent outbreak in Alberta, Canada, with more than 400 cases described (20). Notably, we did not find cases of O157:H7 in this study, in contrast with a study among 77 pediatric patients conducted in Argentina that found O157:H7 and O145:NM as the most prevalent (21).
Of note, a limitation of this study is the potential geographic bias inherent to the hospital location. The HNN-CCSS is situated in the capital, and as a result, our findings may be skewed toward cases originating from this central region. Children residing far from the capital, particularly those with STEC diarrheal disease characterized by milder symptoms, may be less likely to seek medical attention at this specific facility.
The low prevalence of STEC infections found could suggest that the incidence of these infections may be lower in Costa Rica than in other countries. However, the high rate of underlying diseases among STEC-infected patients and the occurrence of severe complications such as HUS and death highlight the importance of continued surveillance and prevention measures. Whole-genome sequencing proved to be a valuable tool for identifying the serotypes, genotypes, and phylogenetic relationships of STEC strains in Costa Rica. In conclusion, this study provides insights into the epidemiology, clinical characteristics, and genetic diversity of STEC in Costa Rica. More studies are necessary to establish whether our findings are consistent or represent just the tip of the iceberg. Strengthening of the nationwide surveillance strategy could unveil the presence of other circulating serotypes, clusters, and outbreaks that could not be accurately determined due to the limitations of this study.