Unpaired MR-CT brain dataset for unsupervised image translation

The data presented in this article address the problem of brain tumor image translation across imaging modalities. The dataset comprises unpaired brain magnetic resonance (MR) and computed tomography (CT) image volumes of 20 patients, totaling 179 two-dimensional (2D) axial MR and CT images. The MR images were acquired using a Siemens Verio scanner, and the CT images using a Siemens Somatom scanner. The MR and CT tumor volumes were collected, diagnosed and annotated by experienced radiologists specialized in oncology and radiotherapy. The image volumes can be useful for researchers working on artificial intelligence (AI) applications for brain tumor detection, classification and segmentation in MR and CT modalities. The tumor masks provided for each tumor volume can assist data scientists with limited background in cancer imaging. Moreover, a clinical interpretation is given for each tumor volume, which can support training deep learning models with multiple data sources (non-imaging or textual data). The dataset can facilitate annotation-efficient lesion segmentation using bidirectional MR-CT cross-modality image translation.



Value of the Data
• The provided annotated MR and CT image volumes can be useful for categorizing and labeling human brain lesions for AI clinical applications, particularly for supporting automated medical image analysis for early brain tumor detection.
• Labeling medical images is generally challenging, time-consuming, and tedious, and requires labor-intensive participation of radiologists; the dataset can help address the scarcity of medical data for training deep learning models, specifically for MR and CT images.
• The dataset comes with a detailed clinical description for each of the MR and CT lesion cases (see Table 1), which can provide a better understanding of the physical characteristics of the lesions.
• The collected MR and CT volumes, along with the provided tumor masks (see Table 2), can be used by data scientists to generate realistic MR images from CT images and vice versa for learning reconstruction models [1].

Data Description
The MR-CT brain image volumes were acquired by the Diagnostic Radiology Department of the Jordan University Hospital (JUH) between April 2016 and December 2019. The dataset consists of brain CT and MR image volumes scanned for radiotherapy treatment planning for brain tumors. It contains T2-weighted MR and CT images for 20 patients aged 26-71 years (mean ± standard deviation: 47 ± 14.07 years). Lesion masks were manually delineated by two expert radiologists using a software tool developed in Python. The tool allows scrolling through all image slices and adjusting the window level settings before selecting the segmentation area. More information can be found in [1]. The MR and CT volume scans are described in Table 1.

Table 1
Clinical description of the MR and CT lesion cases. Example entry: heterogeneous putamen lesion with high signal intensity corresponding to foci of hemorrhage.

Table 2
Segmentation tumor mask for MR and CT images.

MR and CT scanner configuration
The MR images of each patient were acquired on a Siemens Verio 3T scanner with a slice thickness of 5.00 mm, using a T2-weighted sequence without contrast agent, 3 fat-saturation (FS) pulses, a repetition time (TR) of 2500-4000, an echo time (TE) of 20-30, and 90/180 flip angles. The CT images were acquired with a Siemens Somatom scanner with a dose length of 2.46 mGy·cm, a tube voltage of 130 kV, a tube current of 113-327 mAs, a topogram acquisition protocol, 64 dual source, one projection, and a slice thickness of 7.0 mm. Smooth and sharp filters were applied to the CT images. The MR scans have a resolution of 0.7 × 0.6 × 5 mm³, while the CT scans have a resolution of 0.6 × 0.6 × 7 mm³.
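The reported resolutions imply a per-voxel volume, which is the factor one would multiply a mask's foreground voxel count by to approximate a lesion volume. The following is a minimal sketch under the assumption that the stated resolutions apply to the data being measured; the exported 256 × 256 slices may have been resampled, so the numbers are illustrative only. The function and constant names are hypothetical and are not part of the dataset.

```python
# Voxel spacings in millimetres as reported in the text: (x, y, slice thickness).
MR_SPACING_MM = (0.7, 0.6, 5.0)
CT_SPACING_MM = (0.6, 0.6, 7.0)


def voxel_volume_mm3(spacing):
    """Volume of a single voxel in cubic millimetres."""
    dx, dy, dz = spacing
    return dx * dy * dz


def lesion_volume_mm3(mask_voxel_count, spacing):
    """Approximate lesion volume from the number of foreground mask voxels."""
    return mask_voxel_count * voxel_volume_mm3(spacing)


if __name__ == "__main__":
    # Roughly 2.1 mm^3 per MR voxel and 2.52 mm^3 per CT voxel.
    print(round(voxel_volume_mm3(MR_SPACING_MM), 2))
    print(round(voxel_volume_mm3(CT_SPACING_MM), 2))
```

Because the CT slices are thicker, a CT mask with the same number of foreground voxels as an MR mask corresponds to a larger physical volume under these assumptions.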

Data collection
The dataset consists of 2D image slices extracted using the RadiAnt DICOM viewer software. The extracted images were converted to the DICOM image data format with a resolution of 256 × 256 pixels. In total, there are 179 2D axial image slices from 20 patient volumes (90 MR and 89 CT slices). The dataset contains MR and CT brain tumor images with corresponding segmentation masks [2].
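After downloading the data, a quick sanity check is to compare the number of slices per modality against the counts reported above (90 MR, 89 CT, 179 in total). The sketch below is one possible way to do this; it assumes the modality label of each slice has already been obtained (for example from the folder name), and the helper names are hypothetical.

```python
from collections import Counter

# Slice counts reported in the article.
REPORTED_COUNTS = {"MR": 90, "CT": 89}


def tally_by_modality(modalities):
    """Count slices per modality from an iterable of 'MR'/'CT' labels."""
    return Counter(label.upper() for label in modalities)


def differences_from_reported(counts):
    """Return {modality: (found, reported)} for every mismatching modality."""
    return {
        modality: (counts.get(modality, 0), reported)
        for modality, reported in REPORTED_COUNTS.items()
        if counts.get(modality, 0) != reported
    }


if __name__ == "__main__":
    labels = ["MR"] * 90 + ["CT"] * 89  # stand-in for labels read from disk
    counts = tally_by_modality(labels)
    print(sum(counts.values()), differences_from_reported(counts))
```

An empty dictionary from `differences_from_reported` means the local copy matches the reported slice counts.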

Shared files and directory structure
The MR-CT dataset is organized in two folders (MR and CT). Each folder consists of two subfolders (images and corresponding masks). The DICOM images in each subfolder are named according to the following convention: CASE NUMBER | IMAGE TYPE (CT OR MR) | VOLUME SLICE OR SEGMENTATION MASK NUMBER | FILE EXTENSION, where S indicates an image slice and M the associated segmentation mask.
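The naming convention above can be parsed programmatically to pair each slice with its mask. The text does not give the exact separators or digit widths, so the sketch below assumes a hypothetical layout of the form `<case>_<MR|CT>_<S|M><number>.<ext>` (for example `07_CT_S3.dcm`) and further assumes that a mask carries the same number as the slice it annotates. The pattern, class, and function names are illustrative and should be adjusted to the actual file names.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical layout: <case>_<MR|CT>_<S|M><number>.<ext>, e.g. "07_CT_S3.dcm".
# Separators and digit widths are assumptions, not taken from the dataset.
NAME_PATTERN = re.compile(
    r"^(?P<case>\d+)[_\- ]?(?P<modality>MR|CT)[_\- ]?"
    r"(?P<kind>[SM])(?P<index>\d+)\.(?P<ext>\w+)$",
    re.IGNORECASE,
)


class SliceName(NamedTuple):
    case: int
    modality: str  # "MR" or "CT"
    is_mask: bool  # True for "M" (segmentation mask), False for "S" (slice)
    index: int
    ext: str


def parse_name(filename: str) -> Optional[SliceName]:
    """Parse a file name; return None if it does not fit the assumed layout."""
    m = NAME_PATTERN.match(filename)
    if m is None:
        return None
    return SliceName(
        case=int(m["case"]),
        modality=m["modality"].upper(),
        is_mask=m["kind"].upper() == "M",
        index=int(m["index"]),
        ext=m["ext"].lower(),
    )


def pair_slices_with_masks(filenames):
    """Map (case, modality, index) -> {"slice": name, "mask": name}."""
    pairs = {}
    for name in filenames:
        parsed = parse_name(name)
        if parsed is None:
            continue  # skip files that do not follow the assumed layout
        key = (parsed.case, parsed.modality, parsed.index)
        role = "mask" if parsed.is_mask else "slice"
        pairs.setdefault(key, {})[role] = name
    return pairs
```

Entries in the returned dictionary that lack either a `"slice"` or a `"mask"` key point to files whose counterpart is missing or named differently from the assumed layout.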

Ethics Statement
The Jordan University Hospital MR-CT Brain Dataset was collected after receiving Institutional Review Board approval (IRB no. 16/161/2020) and the consent of the patients. All procedures were carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Data Availability
Unpaired MR-CT Brain Dataset for Unsupervised Image Translation (Original data) (Mendeley Data).