Predicting intraventricular hemorrhage growth with a machine learning-based, radiomics-clinical model

We constructed a radiomics-clinical model to predict intraventricular hemorrhage (IVH) growth after spontaneous intracerebral hematoma. The model was developed using a training cohort (N=626) and validated with an independent testing cohort (N=270). Radiomics features and clinical predictors were selected using the least absolute shrinkage and selection operator (LASSO) method and multivariate analysis. The radiomics score (Rad-score) was calculated through linear combination of selected features multiplied by their respective LASSO coefficients. The support vector machine (SVM) method was used to construct the model. IVH growth was experienced by 13.4% and 13.7% of patients in the training and testing cohorts, respectively. The Rad-score was associated with severe IVH and poor outcome. Independent predictors of IVH growth included hypercholesterolemia (odds ratio [OR], 0.12 [95%CI, 0.02-0.90]; p=0.039), baseline Graeb score (OR, 1.26 [95%CI, 1.16-1.36]; p<0.001), time to initial CT (OR, 0.70 [95%CI, 0.58-0.86]; p<0.001), international normalized ratio (OR, 4.27 [95%CI, 1.40-13.0]; p=0.011), and Rad-score (OR, 2.3 [95%CI, 1.6-3.3]; p<0.001). In the training cohort, the model achieved an AUC of 0.78, sensitivity of 0.83, and specificity of 0.66. In the testing cohort, AUC, sensitivity, and specificity were 0.71, 0.81, and 0.64, respectively. This radiomics-clinical model thus has the potential to predict IVH growth.
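
The Rad-score calculation described above can be sketched as a weighted sum. This is a minimal illustration, not the study's implementation: the function name `rad_score`, the optional `intercept` argument, and all numeric values are hypothetical and are not taken from the fitted model.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of selected features and their LASSO coefficients.

    `intercept` is an assumption added for generality; the text only
    describes the weighted sum of the selected features.
    """
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    return intercept + sum(c * x for c, x in zip(coefficients, features))


# Hypothetical feature values and coefficients, for illustration only.
example_features = [0.5, -1.2, 2.0]
example_coefficients = [0.8, 0.1, -0.3]
example_score = rad_score(example_features, example_coefficients)
```

Features that LASSO shrinks to a coefficient of zero drop out of the sum, which is why only the selected features need to be passed in.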

Histogram parameters are concerned with the properties of individual voxels. They describe the distribution of voxel intensities within the CT image through basic, commonly used metrics. Let X denote the three-dimensional image matrix with N voxels, and let P denote the first-order histogram divided into Nl discrete intensity levels.
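
As a rough sketch of how such first-order metrics can be obtained from X and P, the example below bins the voxel intensities into Nl equal-width levels and derives a few common descriptors. The equal-width binning, the choice of metrics, and the function name `first_order_features` are assumptions made for illustration; the text does not specify which histogram parameters were extracted.

```python
import math


def first_order_features(voxels, n_levels):
    """Compute a few first-order descriptors from a flat list of voxel intensities."""
    n = len(voxels)
    lo, hi = min(voxels), max(voxels)
    # Equal-width bins (assumed); fall back to width 1.0 for a constant image.
    width = (hi - lo) / n_levels or 1.0
    counts = [0] * n_levels
    for v in voxels:
        idx = min(int((v - lo) / width), n_levels - 1)
        counts[idx] += 1
    p = [c / n for c in counts]  # normalized first-order histogram P

    mean = sum(voxels) / n
    variance = sum((v - mean) ** 2 for v in voxels) / n
    entropy = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    uniformity = sum(pi ** 2 for pi in p)
    return {
        "mean": mean,
        "variance": variance,
        "entropy": entropy,
        "uniformity": uniformity,
    }
```

Mean and variance are computed directly on the intensities, whereas entropy and uniformity depend on the discretized histogram and therefore on the chosen Nl.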

Texture Parameters 54
Texture is one of the important characteristics used to identify objects or regions of interest in an image. It represents the appearance of a surface and how its elements are distributed. Texture is an important concept in machine vision, in the sense that it helps to predict the feel of a surface (e.g., smoothness or coarseness) from an image. The various texture analysis approaches represent the examined textures from different perspectives.

Form Factor Parameters 9
This group of features includes descriptors of the three-dimensional size and shape of the tumor region.

GLCM Parameters 100
The grey level co-occurrence matrix (GLCM) P(i, j | θ, d) represents the joint probability of pairs of pixels having certain grey-level values. It counts how many times a pixel with grey level i occurs jointly with another pixel having grey level j. By varying the displacement vector d between each pair of pixels, that is, the rotation angle of the offset (0°, 45°, 90°, 135°) and the distance to the neighboring pixel (1, 2, 3, ...), different co-occurrence distributions are obtained from the same reference image. The GLCM of an image is thus computed using a displacement vector d defined by its radius (the distance, or pixel count, to the adjacent neighbor, preferably equal to one) and its rotation angle.
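
The counting procedure above can be sketched for a two-dimensional image as follows. This is an illustrative sketch under stated assumptions: the image is a list of rows holding integer grey levels in the range 0 to n_levels - 1, the matrix is made symmetric and normalized by default, and the names `glcm`, `glcm_contrast`, and `OFFSETS` are hypothetical. Contrast is shown only as one example of a feature derived from the matrix.

```python
# Unit (row, column) offsets per angle, with rows increasing downward.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, angle, distance, n_levels, symmetric=True):
    """Normalized grey level co-occurrence matrix for one angle and distance."""
    d_row, d_col = (distance * step for step in OFFSETS[angle])
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * n_levels for _ in range(n_levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in reverse order
    total = sum(map(sum, counts))
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: sum over (i, j) of (i - j)^2 * P(i, j)."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))
```

Calling `glcm` once per angle (0°, 45°, 90°, 135°) at a distance of one yields the four matrices from which direction-specific features can be derived.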

RLM Parameters 180
The grey level run-length matrix (RLM) Pr(i, j | θ) is defined as the number of runs of pixels with grey level i and run length j in a given direction θ. An RLM was generated for each sample image segment in each of four directions (0°, 45°, 90°, and 135°), and the following ten statistical features were derived: short run emphasis, long run emphasis, grey level non-uniformity, run length non-uniformity, low grey level run emphasis, high grey level run emphasis, short run low grey level emphasis, short run high grey level emphasis, long run low grey level emphasis, and long run high grey level emphasis.
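
To make the definition concrete, the sketch below builds a run-length matrix for a two-dimensional image along one direction and derives two of the ten features. It assumes integer grey levels in the range 0 to n_levels - 1 and uses the standard definitions of short and long run emphasis; the function names `run_length_matrix` and `run_emphasis` are hypothetical.

```python
def run_length_matrix(image, d_row, d_col, n_levels):
    """Count runs along the direction (d_row, d_col).

    Entry [i][j - 1] holds the number of runs of grey level i with length j.
    For example, (0, 1) scans along 0 degrees and (-1, 1) along 45 degrees.
    """
    rows, cols = len(image), len(image[0])
    max_run = max(rows, cols)  # no straight run can be longer than this
    rlm = [[0] * max_run for _ in range(n_levels)]
    for r in range(rows):
        for c in range(cols):
            level = image[r][c]
            pr, pc = r - d_row, c - d_col
            # Skip pixels that continue a run started at an earlier pixel.
            if 0 <= pr < rows and 0 <= pc < cols and image[pr][pc] == level:
                continue
            length = 0
            rr, cc = r, c
            while 0 <= rr < rows and 0 <= cc < cols and image[rr][cc] == level:
                length += 1
                rr += d_row
                cc += d_col
            rlm[level][length - 1] += 1
    return rlm


def run_emphasis(rlm):
    """Return (short run emphasis, long run emphasis)."""
    n_runs = sum(map(sum, rlm))
    sre = sum(n / (j + 1) ** 2 for row in rlm for j, n in enumerate(row)) / n_runs
    lre = sum(n * (j + 1) ** 2 for row in rlm for j, n in enumerate(row)) / n_runs
    return sre, lre
```

Short run emphasis weights each run by 1/j² and long run emphasis by j², so fine textures raise the former and coarse textures the latter.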

GLZSM Parameters 11
The gray level size zone matrix (SZM) is the starting point of Thibault matrices. For a texture image f with N gray levels, it is denoted GSf(s, g) and provides a statistical representation by estimating a bivariate conditional probability density function of the image distribution values. It is calculated according to the pioneering run length matrix principle: the value of the matrix GSf(s, g) is equal to the number of zones of size s and gray level g. The resulting matrix has a fixed number of rows equal to N, the number of gray levels, and a dynamic number of columns, determined by the size of the largest zone as well as the size quantization. This matrix is particularly efficient for characterizing texture homogeneity, non-periodicity, or speckle-like texture; it has provided better characterization than granulometry (or COM, RLM, etc.) for the classification of cell nuclei, dermis, road quality (bitumen condition), and some textures in PET images.
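
The zone-counting principle can be sketched for a two-dimensional image as shown below. This is an illustration under assumptions that the text does not state: zones are taken to be 8-connected groups of pixels sharing the same gray level, gray levels are integers in the range 0 to n_levels - 1, no size quantization is applied, and the function name `size_zone_matrix` is hypothetical.

```python
from collections import deque


def size_zone_matrix(image, n_levels):
    """Entry [g][s - 1] holds the number of zones of gray level g and size s."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    zones = []  # (gray level, zone size) for every connected zone
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            level = image[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            # Breadth-first flood fill over 8-connected neighbors (assumed).
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx] and image[ny][nx] == level):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            zones.append((level, size))
    # Fixed number of rows (gray levels); columns grow with the largest zone.
    max_size = max(size for _, size in zones)
    szm = [[0] * max_size for _ in range(n_levels)]
    for level, size in zones:
        szm[level][size - 1] += 1
    return szm
```

Unlike the co-occurrence and run-length matrices, this matrix needs no direction parameter, since a zone is defined by connectivity and not by an offset.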