Published April 3, 2024 | Version 1.0.0
Dataset Open

PENGWIN Task 2: Pelvic Fragment Segmentation on Synthetic X-ray Images

Description

The PENGWIN segmentation challenge is designed to advance automated pelvic fracture segmentation in both 3D CT scans (Task 1) and 2D X-ray images (Task 2), with the aim of improving accuracy and robustness. The full 3D dataset comprises CT scans from 150 patients scheduled for pelvic reduction surgery, collected from multiple institutions using a variety of scanning devices, and represents a diverse range of patient cohorts and fracture types. Ground-truth segmentations of sacrum and hipbone fragments were annotated semi-automatically, subsequently validated by medical experts, and are available here. From this 3D data, we generated high-quality, realistic X-ray images and corresponding 2D labels using DeepDRR, incorporating a range of virtual C-arm camera positions and surgical tools. This dataset contains the training set for fragment segmentation in synthetic X-ray images (Task 2).

The training set is derived from 100 CTs, with 500 images each, for a total of 50,000 training images and segmentations. For each CT, the C-arm geometry is randomly sampled within reasonable parameters for a full-size C-arm. The virtual patient is assumed to be in a head-first supine position. Imaging centers are randomly sampled within 50 mm of a fragment, ensuring good visibility, and viewing directions are sampled uniformly on the sphere within 45 degrees of vertical. Half of the images (IDs XXX_0250 - XXX_0500) contain up to 10 simulated K-wires and/or orthopaedic screws oriented randomly in the field of view.

The input images are raw intensity images with no windowing or normalization applied. Standard practice is to first apply the negative log transform and then window each image appropriately before feeding it into a model; see the augmentation pipeline in `pengwin_utils.py` for one approach. Raw images can be viewed with the FIJI image viewer, but it is recommended to use the visualization functions in `pengwin_utils.py`, which first apply CLAHE normalization and then save to a universally readable PNG (see example usage below).
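The negative log transform and windowing described above can be sketched as follows. This is an illustrative preprocessing function, not the official pipeline from `pengwin_utils.py`; the percentile-based window bounds and the epsilon guard are assumptions of this sketch.

```python
import numpy as np

def neg_log_window(raw: np.ndarray, lower_q: float = 0.01, upper_q: float = 0.99) -> np.ndarray:
    """Negative log transform of a raw X-ray intensity image, then a
    percentile window rescaled to [0, 1]. Illustrative only; the
    quantile choices are not part of the official augmentation pipeline."""
    eps = np.finfo(np.float32).eps
    # raw intensities are attenuated counts; -log converts to line integrals
    img = -np.log(np.maximum(raw.astype(np.float32), eps))
    lo, hi = np.quantile(img, [lower_q, upper_q])
    img = np.clip(img, lo, hi)
    return (img - lo) / max(hi - lo, eps)
```

A trained model would consume the windowed output in place of the raw intensities.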

Because X-ray images feature overlapping segmentation masks, the segmentations are encoded as multi-label uint32 images, where each pixel is treated as a binary vector: bits 1 - 10 encode SA fragments, 11 - 20 LI, and 21 - 30 RI. The raw segmentation files are therefore not viewable with standard image viewing software. `pengwin_utils.py` includes functions for converting to and from this format and for visualizing masks overlaid on the original image (see below).
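For intuition, decoding this bit layout might look like the sketch below. It assumes bit 1 is the least significant bit; use the official `pengwin_utils.load_masks` / `pengwin_utils.masks_to_seg` functions in practice rather than this sketch.

```python
import numpy as np

# bits 1-10: SA, bits 11-20: LI, bits 21-30: RI (category IDs 1, 2, 3)
CATEGORY_BIT_OFFSETS = {"SA": 0, "LI": 10, "RI": 20}

def decode_seg(seg: np.ndarray):
    """Split a multi-label uint32 segmentation into per-fragment boolean
    masks. Assumes bit 1 is the least significant bit of each pixel."""
    masks, category_ids, fragment_ids = [], [], []
    for cat_id, (name, offset) in enumerate(CATEGORY_BIT_OFFSETS.items(), start=1):
        for frag in range(1, 11):
            bit = offset + frag - 1  # 0-indexed bit position
            mask = (seg >> np.uint32(bit)) & np.uint32(1)
            if mask.any():
                masks.append(mask.astype(bool))
                category_ids.append(cat_id)
                fragment_ids.append(frag)
    return masks, category_ids, fragment_ids
```

Because each pixel is a bit vector, a pixel covered by two overlapping fragments simply has both bits set.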

To use the utilities, first install dependencies with `pip install -r requirements.txt`. Then, to visualize an image with its segmentation, you can do the following (assuming the training set has been downloaded and unzipped in the same folder):

import pengwin_utils
from PIL import Image

image_path = "train/input/images/x-ray/001_0000.tif"
seg_path = "train/output/images/x-ray/001_0000.tif"

# load image and masks
image = pengwin_utils.load_image(image_path) # raw intensity image
masks, category_ids, fragment_ids = pengwin_utils.load_masks(seg_path)

# save visualization of image and masks
# applies CLAHE normalization to the raw intensity image before overlaying segmentations.
vis_image = pengwin_utils.visualize_sample(image, masks, category_ids, fragment_ids)
vis_path = "vis_image.png"
Image.fromarray(vis_image).save(vis_path)
print(f"Wrote visualization to {vis_path}")

# Obtain predicted masks, category_ids, and fragment_ids
# Category IDs are {"SA": 1, "LI": 2, "RI": 3}
# Fragment IDs are the integer labels from label_{category}.nii.gz, with 1 corresponding to the main fragment.
pred_masks, pred_category_ids, pred_fragment_ids = masks, category_ids, fragment_ids # replace with your model

# save the predicted masks for upload to the challenge
# Note: cv2 does not work with uint32 images. It is recommended to use PIL or imageio.v3
pred_seg = pengwin_utils.masks_to_seg(pred_masks, pred_category_ids, pred_fragment_ids)
pred_seg_path = "pred/train/output/images/x-ray/001_0000.tif" # ensure dir exists!
Image.fromarray(pred_seg).save(pred_seg_path)
print(f"Wrote segmentation to {pred_seg_path}")

The `pengwin_utils.Dataset` class is provided as an example PyTorch dataset, with strong domain randomization included to facilitate sim-to-real transfer, but you are encouraged to write your own as needed.

Files

requirements.txt

Files (35.0 GB)

19.4 kB (md5:1064af816c0aadcb81af46808c7df6a4)
50 Bytes (md5:23937d78ebc0d0d9fb95a2a985843e96)
35.0 GB (md5:9c90215dae54d8f494a85cfc7b19bc96)

Additional details

Related works

Is derived from
Publication: 10.1007/978-3-031-43996-4_30 (DOI)

Software

Development Status
Active