Elsevier

NeuroImage

Volume 36, Issue 3, 1 July 2007, Pages 661-671
Robust Bayesian general linear models

https://doi.org/10.1016/j.neuroimage.2007.01.058

Abstract

We describe a Bayesian learning algorithm for Robust General Linear Models (RGLMs). The noise is modeled as a Mixture of Gaussians rather than the usual single Gaussian. This allows different data points to be associated with different noise levels and effectively provides a robust estimation of regression coefficients. A variational inference framework is used to prevent overfitting and provides a criterion for selecting the noise model order, allowing the RGLM to default to the usual GLM when robustness is not required. The method is compared to other robust regression methods and applied to synthetic and fMRI data.

Introduction

Neuroimaging data contain a host of artifacts arising from ‘physiological noise’, e.g. subject respiration, heartbeat, or movement of the head, eyes, tongue or mouth, or from ‘non-physiological noise’, e.g. EEG electrodes with poor electrical contact, spikes in fMRI or extracranial magnetic sources in MEG. The presence of these artifacts can severely compromise the sensitivity with which we can detect the neuronal sources we are interested in. The optimal processing of artifactual data is therefore an important issue in neuroimaging analysis, and a number of processing methods have been proposed. One approach is visual inspection and removal of trials deemed to contain artifacts. In the analysis of Event-Related Potentials (ERPs), however, this can lead to up to a third of the trials being removed. Because the statistical inferences that follow are based on fewer data points, this results in a loss of sensitivity.

In fMRI, signal processing methods exist for the removal of k-space spikes (Zhang et al., 2001, Greve et al., 2006), and Exploratory Data Analysis (EDA) methods have been proposed for removal of outliers in the context of mass-univariate modeling (Luo and Nichols, 2003). Alternatively, Independent Component Analysis (ICA) can be used to isolate ‘noise sources’ and remove them from the data (Jung et al., 1999). This is, however, a non-automatic process and will typically require user intervention to disambiguate the discovered components. In fMRI, autoregressive (AR) modeling can be used to downweight the impact of periodic respiratory or cardiac noise sources (Penny et al., 2003). More recently, a number of approaches based on robust regression have been applied to imaging data (Wager et al., 2005, Diedrichsen and Shadmehr, 2005). These approaches relax the assumption underlying ordinary regression that the errors be normally (Wager et al., 2005) or identically (Diedrichsen and Shadmehr, 2005) distributed. In Wager et al. (2005), for example, a Bisquare or Huber weighting scheme corresponds to the assumption of identical non-Gaussian errors. The method was applied to group-level fMRI analysis and was found to lead to more sensitive inferences.
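The bisquare weighting scheme mentioned above is typically implemented as iteratively reweighted least squares (IRLS). The following is a minimal numpy sketch of that idea, not the implementation used by Wager et al. (2005); the tuning constant c = 4.685 and the MAD-based scale estimate are conventional choices, assumed here for illustration.

```python
import numpy as np

def bisquare_irls(X, y, c=4.685, n_iter=50):
    """Robust regression via IRLS with Tukey's bisquare weight function.

    Points with large residuals relative to a robust scale estimate
    receive weight zero and so do not influence the fit.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # initialize with OLS
    for _ in range(n_iter):
        r = y - X @ w
        # robust scale: median absolute deviation, rescaled to be
        # consistent with the Gaussian standard deviation
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        u = r / (c * s)
        wt = np.where(np.abs(u) < 1, (1 - u ** 2) ** 2, 0.0)
        # weighted least squares with the current bisquare weights
        w = np.linalg.solve(X.T @ (wt[:, None] * X), X.T @ (wt * y))
    return w
```

Unlike ordinary least squares, the resulting estimate is largely insensitive to a small fraction of gross outliers, since those points end up with zero weight.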

Interestingly, Wager et al. (2005) tested a number of standard robust estimation methods by generating data from a known mixture process, as this was thought to capture the essence of signal embedded in artifactual data. In this paper we take this idea one step further and develop an optimal robust estimation procedure for the case of mixture errors.

Specifically, we propose a Robust General Linear Model (RGLM) framework in which the noise is modeled with a Mixture of Gaussians. This allows different data points to be associated with different noise levels and provides a robust estimation of regression coefficients via a weighted least squares approach: data points associated with high noise levels are downweighted in the parameter estimation step. Moreover, a Bayesian estimation framework (Attias, 2000) is used to prevent model overfitting and provides a criterion for selecting the noise model order. This allows selection of the usual GLM, i.e. a noise mixture with a single component, when an outlier model is not appropriate. This work is based on a similar algorithm for robust estimation of autoregressive processes (Roberts and Penny, 2002).
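For intuition, estimation in a GLM with mixture-of-Gaussians errors can be carried out with an EM-style loop in which each data point is weighted by its expected precision under the current responsibilities. The sketch below is a maximum-likelihood caricature of that idea, not the paper's algorithm: the paper uses variational Bayes, which additionally places priors on the parameters and yields the model order selection criterion.

```python
import numpy as np

def rglm_em(X, y, K=2, n_iter=100, seed=0):
    """EM for a GLM whose errors are a K-component zero-mean
    Gaussian mixture (maximum-likelihood sketch)."""
    rng = np.random.default_rng(seed)
    N = len(y)
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS initialization
    r = y - X @ w
    var = np.var(r) * (1.0 + rng.random(K))        # distinct initial variances
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        r = y - X @ w
        # E-step: responsibility of component k for data point n
        logp = (-0.5 * r[:, None] ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        g = np.exp(logp - logp.max(axis=1, keepdims=True))
        g /= g.sum(axis=1, keepdims=True)
        # M-step for w: each point weighted by its expected precision,
        # so points assigned to the high-variance component are downweighted
        lam = (g / var).sum(axis=1)
        w = np.linalg.solve(X.T @ (lam[:, None] * X), X.T @ (lam * y))
        r = y - X @ w
        Nk = g.sum(axis=0)
        pi = Nk / N
        var = (g * r[:, None] ** 2).sum(axis=0) / Nk
    return w, pi, np.sqrt(var)
```

With K = 1 the weights are constant and the loop reduces to ordinary least squares, mirroring how the RGLM defaults to the usual GLM.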


Theory

We define the General Linear Model (GLM) in the usual way

y = Xw + e

where y is an N × 1 vector of data points, X is an N × p design matrix, w is a p × 1 vector of regression coefficients and e is an N × 1 vector of errors. We can also write this relationship for the nth data point

yn = xnw + en

where yn is the nth data point, xn is the nth row of X and en is the nth error.

In the standard GLM, the noise en is modeled as a Gaussian. This implies that the regression coefficients can be set by minimizing a least-squares cost function.
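Under the single-Gaussian noise assumption, minimizing the least-squares cost has the familiar closed-form solution, sketched below:

```python
import numpy as np

def glm_ols(X, y):
    """Ordinary least squares: the maximum-likelihood estimate of the
    GLM regression coefficients under i.i.d. Gaussian noise."""
    return np.linalg.solve(X.T @ X, X.T @ y)
```

Every data point contributes equally to this estimate, which is exactly why a handful of outliers can corrupt it, and why the RGLM's per-point downweighting helps.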

Exemplar data

This section compares the standard GLM, robust regression using a Bisquare cost function and the RGLM using synthetic data. The aim of this section is to demonstrate the potential of the RGLM approach.

Data were generated from a GLM with a design matrix X comprising two regressors: (i) a boxcar of period 10 samples and (ii) a constant column. The regression coefficients were set to w = [1, 1]T and the errors were drawn from a two-component mixture process.
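A data-generating process of this form can be sketched as follows. The mixture proportion and component standard deviations here are illustrative guesses, since the excerpt does not give the values used in the simulations.

```python
import numpy as np

def make_exemplar_data(N=100, p_outlier=0.1, sd=(0.5, 5.0), seed=0):
    """Synthetic GLM data: boxcar (period 10) plus constant regressor,
    w = [1, 1]^T, errors from a two-component Gaussian mixture.

    p_outlier and sd are hypothetical parameters, not the paper's.
    """
    rng = np.random.default_rng(seed)
    boxcar = np.tile(np.r_[np.ones(5), np.zeros(5)], N // 10)
    X = np.column_stack([boxcar, np.ones(N)])
    comp = rng.random(N) < p_outlier              # component indicator
    e = np.where(comp, sd[1], sd[0]) * rng.standard_normal(N)
    y = X @ np.array([1.0, 1.0]) + e
    return X, y
```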

So that the simulations are realistic,

Discussion

We have described a Bayesian learning algorithm for Robust General Linear Models (RGLMs), based on Roberts and Penny (2002), in which the noise is modeled as a Mixture of Gaussians. This allows different data points to be associated with different noise levels and effectively provides a robust estimation of regression coefficients.

A Bayesian inference framework is used to prevent overfitting and provides a model selection criterion for noise model order, e.g. to select noise mixtures with one, two or more components.

Acknowledgments

The authors are funded by the Wellcome Trust. We are grateful to Mohamed Seghier and Jorn Diedrichsen for helpful comments.

