The Río Hortega University Hospital Glioblastoma dataset: A comprehensive collection of preoperative, early postoperative and recurrence MRI scans (RHUH-GBM)

Glioblastoma, a highly aggressive primary brain tumor, is associated with poor patient outcomes. Although magnetic resonance imaging (MRI) plays a critical role in diagnosing, characterizing, and forecasting glioblastoma progression, public MRI repositories present significant drawbacks, including insufficient postoperative and follow-up studies as well as expert tumor segmentations. To address these issues, we present the “Río Hortega University Hospital Glioblastoma Dataset (RHUH-GBM),” a collection of multiparametric MRI images, volumetric assessments, molecular data, and survival details for glioblastoma patients who underwent total or near-total enhancing tumor resection. The dataset features expert-corrected segmentations of tumor subregions, offering valuable ground truth data for developing algorithms for postoperative and follow-up MRI scans.

The dataset contains MRI data comprising the following sequences: T1-weighted (T1w), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), T1w contrast-enhanced (T1ce), and diffusion-weighted imaging-derived apparent diffusion coefficient (ADC) maps. Data are available in both raw DICOM and processed NIfTI formats, along with expertly refined tumor segmentations. Clinical information is also available in CSV format.
Data collection: MRI data were collected retrospectively from the hospital's picture archiving and communication system (PACS). Imaging data were acquired on a 1.5 Tesla MRI scanner and consist of multiparametric structural and diffusion MRI images acquired at three time points: preoperatively, early postoperatively, and at follow-up when tumor recurrence was diagnosed. The dataset includes only patients who underwent total or near-total resection of the enhancing tumor. Clinical data were collected from electronic medical records.

Value of the Data
• The value of the Río Hortega University Hospital Glioblastoma Dataset (RHUH-GBM) [1] lies in its longitudinal MRI scans obtained at critical points in the disease course: before surgery, early after surgery, and at the time of recurrence. All patients in this cohort underwent either gross total or near-total tumor resection, which further enhances the dataset's significance. The dataset also features meticulously refined expert segmentations.
• Researchers and medical professionals can use this dataset for a wide range of research purposes, including improving automatic segmentation algorithms tailored to postoperative brain tumor scans, developing models for predicting survival, and investigating recurrence patterns in patients who have undergone complete tumor resection.
• The dataset is accessible in two formats, NIfTI and DICOM, and is available on the TCIA website. The accompanying clinical data are comprehensive, encompassing demographic, pathological, radiological, volumetric, and survival information.

Shared files
Table 1 shows the details of the files available through TCIA. DICOM and NIfTI files can be visualized in dedicated, publicly available software such as 3D Slicer (www.slicer.org) and ITK-SNAP (www.itksnap.org). Clinical data are also available in CSV format.
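As a minimal sketch of how the clinical CSV could be read programmatically, the snippet below parses a small in-memory table with Python's standard `csv` module. The column names (`patient_id`, `age`, `sex`, `idh_status`, `os_days`) and the two rows are made up for illustration; the actual headers in the file distributed through TCIA may differ and should be checked first.

```python
import csv
import io

# Made-up excerpt for illustration only: column names and values are
# assumptions, not the real contents of the RHUH-GBM clinical CSV.
SAMPLE_CSV = """patient_id,age,sex,idh_status,os_days
P001,61,M,wild-type,300
P002,58,F,mutated,510
"""


def load_clinical_rows(handle):
    """Parse clinical rows into dicts, converting numeric fields to int."""
    rows = []
    for row in csv.DictReader(handle):
        row["age"] = int(row["age"])
        row["os_days"] = int(row["os_days"])
        rows.append(row)
    return rows


rows = load_clinical_rows(io.StringIO(SAMPLE_CSV))

# Example query: identifiers of IDH wild-type patients.
wild_type = [r["patient_id"] for r in rows if r["idh_status"] == "wild-type"]
print(wild_type)
```

To read a downloaded file instead, the same function can be passed a handle from `open(path, newline="")`.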

Patient population
The dataset comprises consecutive patients who underwent surgery between January 2018 and December 2022, with a confirmed histopathological diagnosis of WHO grade 4 astrocytoma. Forty patients were selected based on the following inclusion criteria: 1) Gross total resection (GTR) or near-total resection (NTR), defined as having no residual tumor enhancement and an extent of resection exceeding 95% of the initial enhancing volume, respectively [2,3]. 2) Availability of MRI studies at three time points: preoperative, early postoperative (within 72 h), and the follow-up scan where tumor progression was diagnosed. 3) Availability of structural T1-weighted (T1w), T2-weighted (T2w), T1 contrast-enhanced (T1ce), fluid-attenuated inversion recovery (FLAIR), and diffusion-weighted imaging-derived apparent diffusion coefficient (ADC) maps for each study. 4) Receipt of adjuvant treatment with chemotherapy and radiotherapy following the Stupp protocol [4]. Patients with severe image acquisition artifacts or missing MRI series were excluded. The modified Response Assessment in Neuro-Oncology (RANO) criteria were used to determine tumor progression [5].
A summary of the demographic data is presented in Table 2. The patients had an average age of 63 ± 9 years, consisting of 28 men (70%) and 12 women (30%). The median preoperative Karnofsky Performance Scale (KPS) score was 80. Out of the 40 patients, 38 (95%) were diagnosed with de novo glioblastomas, while two patients (5%) had recurrent glioblastomas previously treated with standard chemoradiotherapy. Four cases (10%) were IDH-mutated, and 36 cases (90%) were IDH wild-type. The mean preoperative contrast-enhancing tumor volume was 34.99 ± 26.59 cm³, and the mean postoperative contrast-enhancing residual tumor volume was 0.23 ± 0.47 cm³. A graphical representation of tumor location is displayed as a heatmap in Fig. 1. Among the patients, 27 (67.5%) underwent gross total resection, and 13 (32.5%) underwent near-total resection. The median overall survival was 364 days, and the median progression-free survival was 198 days.
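The GTR/NTR definitions in criterion 1 can be sketched as a simple rule on the enhancing tumor volumes. This is an illustration, not the authors' code: it assumes the extent of resection is computed in the usual way as the percentage of preoperative enhancing volume removed, and the function names and the "other" category are hypothetical.

```python
def extent_of_resection(pre_cm3, post_cm3):
    """Percentage of the preoperative enhancing volume that was removed.

    Assumes the common definition EOR = 100 * (pre - post) / pre.
    """
    if pre_cm3 <= 0:
        raise ValueError("preoperative enhancing volume must be positive")
    return 100.0 * (pre_cm3 - post_cm3) / pre_cm3


def classify_resection(pre_cm3, post_cm3):
    """Classify a resection using the thresholds stated in the text.

    GTR: no residual enhancing tumor.
    NTR: residual present, but extent of resection above 95%.
    Anything else is labeled "other" (hypothetical label; such cases
    would not meet the inclusion criteria).
    """
    if post_cm3 == 0:
        return "GTR"
    if extent_of_resection(pre_cm3, post_cm3) > 95.0:
        return "NTR"
    return "other"


# Illustrative volumes in cm^3 (not taken from the dataset).
print(classify_resection(30.0, 0.0))
print(classify_resection(30.0, 1.0))
print(classify_resection(30.0, 3.0))
```

In practice a small tolerance on the "no residual" check may be preferable, since segmentation-derived volumes are rarely exactly zero; that choice is left open here.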

Clinical, pathological, and imaging data
Clinical and pathological information was obtained from electronic medical records, including age, sex, histopathological diagnosis, pre- and postoperative Karnofsky Performance Scale (KPS) score, isocitrate dehydrogenase (IDH) status, use of operative adjuncts, volumetric assessment of the extent of resection of the contrast-enhancing and non-enhancing tumor, presence of postoperative neurological deficits, details of chemotherapy and radiotherapy received, and overall survival (OS) and progression-free survival (PFS) times. OS was measured from diagnosis to death, or to the last follow-up if the patient was alive, while PFS was measured from diagnosis to tumor progression, or to the last follow-up if no progression was noted. Of the total sample, 11 patients had initially undergone preoperative and subsequent follow-up MRI scans at a secondary healthcare facility before being referred to the primary center. Details of the MR imaging acquisition parameters are described in Table 3.
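The OS and PFS definitions above amount to a time-to-event calculation with censoring at the last follow-up. The following sketch shows one way to express that rule; the function name, argument names, and dates are hypothetical and chosen only for illustration.

```python
from datetime import date


def survival_days(diagnosis, event, last_follow_up):
    """Return (days, event_observed), measured from the diagnosis date.

    `event` is the date of death (for OS) or of progression (for PFS),
    or None when the event has not been observed, in which case the
    time is censored at `last_follow_up`.
    """
    end = event if event is not None else last_follow_up
    days = (end - diagnosis).days
    if days < 0:
        raise ValueError("end date precedes diagnosis date")
    return days, event is not None


# Illustrative dates only (not taken from the dataset).
observed = survival_days(date(2020, 1, 1), date(2020, 1, 31), date(2020, 3, 1))
censored = survival_days(date(2020, 1, 1), None, date(2020, 3, 1))
print(observed, censored)
```

Returning the event flag alongside the duration keeps censored and observed cases distinguishable, which survival models generally require.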

Tumor Subregions Segmentations
The preprocessed images from each time point were used as input for generating computer-aided segmentations using DeepMedic [12]. Three labels were subsequently obtained: 1 = necrosis; 2 = peritumoral signal alteration, including edema and non-enhancing tumor; and 3 = enhancing tumor. All segmentations were carefully reviewed and manually corrected by two expert neurosurgeons specializing in neuroimaging (S.C. and S.G.). A summary of the data workflow is depicted in Fig. 2.
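Given the three-label convention above, subregion volumes can be derived by counting voxels per label and multiplying by the voxel volume. The sketch below assumes the label map has already been loaded and flattened into a sequence of integers and that the voxel spacing in millimetres is known (for example, from the NIfTI header); the function name and the toy input are hypothetical.

```python
from collections import Counter

# Label convention as described in the text.
LABELS = {
    1: "necrosis",
    2: "peritumoral signal alteration",
    3: "enhancing tumor",
}


def subregion_volumes_cm3(label_voxels, voxel_size_mm):
    """Compute the volume of each tumor subregion in cm^3.

    label_voxels: flat iterable of integer labels (0 = background).
    voxel_size_mm: (dx, dy, dz) voxel spacing in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    voxel_cm3 = dx * dy * dz / 1000.0  # 1 cm^3 = 1000 mm^3
    counts = Counter(label_voxels)
    return {name: counts.get(label, 0) * voxel_cm3
            for label, name in LABELS.items()}


# Toy label map with 1 mm isotropic voxels (illustrative only).
volumes = subregion_volumes_cm3([0, 0, 1, 2, 2, 3, 3, 3], (1.0, 1.0, 1.0))
print(volumes)
```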

Limitations
Not applicable.

Fig. 1. A graphical representation of tumor location is presented in the form of a heatmap, showcasing the distribution of tumors in a normalized SRI24 atlas template space. Areas of interest are depicted as percentages.
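One plausible way to obtain such a percentage heatmap, assuming each patient's tumor mask has already been registered to the common template, is to count at every voxel how many patients have tumor there and divide by the cohort size. The sketch below uses small 2D masks for brevity; the function name is hypothetical, and this is not necessarily the procedure used to produce the figure.

```python
def tumor_frequency_percent(masks):
    """Per-voxel percentage of patients with tumor at that location.

    masks: list of equally shaped 2D binary masks (lists of lists of
    0/1), one per patient, assumed to be aligned to the same template.
    """
    n = len(masks)
    if n == 0:
        raise ValueError("at least one mask is required")
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [100.0 * sum(m[r][c] for m in masks) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two toy 2x2 masks (illustrative only).
toy_masks = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
]
heatmap = tumor_frequency_percent(toy_masks)
print(heatmap)
```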

Fig. 2. Schematic representation of the workflow. It displays the main clinical variables collected, the image preprocessing steps, and the results of the tumor subregion segmentations: red = necrosis, green = peritumoral region (T2/FLAIR) signal alteration, yellow = enhancing tumor.
© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Table 1
Detailed dataset description.

Table 3
MRI acquisition parameters.