Elsevier

NeuroImage

Volume 36, Issue 3, 1 July 2007, Pages 661-671
Robust Bayesian general linear models

https://doi.org/10.1016/j.neuroimage.2007.01.058

Abstract

We describe a Bayesian learning algorithm for Robust General Linear Models (RGLMs). The noise is modeled as a Mixture of Gaussians rather than the usual single Gaussian. This allows different data points to be associated with different noise levels and effectively provides a robust estimation of regression coefficients. A variational inference framework is used to prevent overfitting and provides a criterion for selecting the noise model order, allowing the RGLM to default to the usual GLM when robustness is not required. The method is compared to other robust regression methods and applied to synthetic and fMRI data.

Introduction

Neuroimaging data contain a host of artifacts arising from ‘physiological noise’, e.g. subject respiration, heartbeat, or movement of the head, eyes, tongue or mouth, or from ‘non-physiological noise’, e.g. EEG electrodes with poor electrical contact, spikes in fMRI or extracranial magnetic sources in MEG. The presence of these artifacts can severely compromise the sensitivity with which we can detect the neuronal sources we are interested in. The optimal processing of artifactual data is therefore an important issue in neuroimaging analysis, and a number of processing methods have been proposed. One approach is visual inspection and removal of trials deemed to contain artifacts. In the analysis of Event-Related Potentials (ERPs), however, this can lead to up to a third of the trials being removed. Because the statistical inferences that follow are based on fewer data points, this results in a loss of sensitivity.

In fMRI, signal processing methods exist for the removal of k-space spikes (Zhang et al., 2001, Greve et al., 2006), and Exploratory Data Analysis (EDA) methods have been proposed for removal of outliers in the context of mass-univariate modeling (Luo and Nichols, 2003). Alternatively, Independent Component Analysis (ICA) can be used to isolate ‘noise sources’ and remove them from the data (Jung et al., 1999). This is, however, a non-automatic process and will typically require user intervention to disambiguate the discovered components. In fMRI, autoregressive (AR) modeling can be used to downweight the impact of periodic respiratory or cardiac noise sources (Penny et al., 2003). More recently, a number of approaches based on robust regression have been applied to imaging data (Wager et al., 2005, Diedrichsen and Shadmehr, 2005). These approaches relax the assumption underlying ordinary regression that the errors be normally (Wager et al., 2005) or identically (Diedrichsen and Shadmehr, 2005) distributed. In Wager et al. (2005), for example, a Bisquare or Huber weighting scheme corresponds to the assumption of identical non-Gaussian errors. The method was applied to group-level fMRI analysis and was found to lead to more sensitive inferences.
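The bisquare weighting scheme mentioned above is typically implemented as iteratively reweighted least squares (IRLS). The following is a minimal numpy sketch of that idea, not the implementation used by Wager et al. (2005); the tuning constant c = 4.685 and the MAD-based scale estimate are conventional choices, assumed here for illustration.

```python
import numpy as np

def bisquare_irls(X, y, c=4.685, n_iter=50):
    """Robust regression via IRLS with Tukey's bisquare weight function.

    Points with large residuals relative to a robust scale estimate
    receive weight zero and so do not influence the fit.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # initialize with OLS
    for _ in range(n_iter):
        r = y - X @ w
        # robust scale: median absolute deviation, rescaled to be
        # consistent with the Gaussian standard deviation
        s = np.median(np.abs(r - np.median(r))) / 0.6745
        u = r / (c * s)
        wt = np.where(np.abs(u) < 1, (1 - u ** 2) ** 2, 0.0)
        # weighted least squares with the current bisquare weights
        w = np.linalg.solve(X.T @ (wt[:, None] * X), X.T @ (wt * y))
    return w
```

Unlike ordinary least squares, the resulting estimate is largely insensitive to a small fraction of gross outliers, since those points end up with zero weight.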

Interestingly, Wager et al. (2005) tested a number of standard robust estimation methods by generating data from a known mixture process, as this was thought to capture the essence of signal embedded in artifactual data. In this paper we take this idea one step further and develop an optimal robust estimation procedure for the case of mixture errors.

Specifically, we propose a Robust General Linear Model (RGLM) framework in which the noise is modeled with a Mixture of Gaussians. This allows different data points to be associated with different noise levels and provides a robust estimation of regression coefficients via a weighted least squares approach: data points associated with high noise levels are downweighted in the parameter estimation step. Moreover, a Bayesian estimation framework (Attias, 2000) is used to prevent model overfitting and provides a criterion for selecting the noise model order. This allows selection of the usual GLM, i.e. a noise mixture with a single component, when an outlier model is not appropriate. This work is based on a similar algorithm for robust estimation of autoregressive processes (Roberts and Penny, 2002).
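For intuition, estimation in a GLM with mixture-of-Gaussians errors can be carried out with an EM-style loop in which each data point is weighted by its expected precision under the current responsibilities. The sketch below is a maximum-likelihood caricature of that idea, not the paper's algorithm: the paper uses variational Bayes, which additionally places priors on the parameters and yields the model order selection criterion.

```python
import numpy as np

def rglm_em(X, y, K=2, n_iter=100, seed=0):
    """EM for a GLM whose errors are a K-component zero-mean
    Gaussian mixture (maximum-likelihood sketch)."""
    rng = np.random.default_rng(seed)
    N = len(y)
    w = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS initialization
    r = y - X @ w
    var = np.var(r) * (1.0 + rng.random(K))        # distinct initial variances
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        r = y - X @ w
        # E-step: responsibility of component k for data point n
        logp = (-0.5 * r[:, None] ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        g = np.exp(logp - logp.max(axis=1, keepdims=True))
        g /= g.sum(axis=1, keepdims=True)
        # M-step for w: each point weighted by its expected precision,
        # so points assigned to the high-variance component are downweighted
        lam = (g / var).sum(axis=1)
        w = np.linalg.solve(X.T @ (lam[:, None] * X), X.T @ (lam * y))
        r = y - X @ w
        Nk = g.sum(axis=0)
        pi = Nk / N
        var = (g * r[:, None] ** 2).sum(axis=0) / Nk
    return w, pi, np.sqrt(var)
```

With K = 1 the weights are constant and the loop reduces to ordinary least squares, mirroring how the RGLM defaults to the usual GLM.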


Theory

We define the General Linear Model (GLM) in the usual way

y = Xw + e

where y is an N × 1 vector of data points, X is an N × p design matrix, w is a p × 1 vector of regression coefficients and e is an N × 1 vector of errors. We can also write this relationship for the nth data point

yn = xnw + en

where yn is the nth data point, xn is the nth row of X and en is the nth error.

In the standard GLM, the noise en is modeled as a Gaussian. This implies that the regression coefficients can be set by minimizing a least-squares cost function.
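Under the single-Gaussian noise assumption, minimizing the least-squares cost has the familiar closed-form solution, sketched below:

```python
import numpy as np

def glm_ols(X, y):
    """Ordinary least squares: the maximum-likelihood estimate of the
    GLM regression coefficients under i.i.d. Gaussian noise."""
    return np.linalg.solve(X.T @ X, X.T @ y)
```

Every data point contributes equally to this estimate, which is exactly why a handful of outliers can corrupt it, and why the RGLM's per-point downweighting helps.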

Exemplar data

This section compares the standard GLM, robust regression using a Bisquare cost function and the RGLM using synthetic data. The aim of this section is to demonstrate the potential of the RGLM approach.

Data were generated from a GLM with a design matrix X comprising two regressors: (i) a boxcar of period 10 samples and (ii) a constant column. The regression coefficients were set to w = [1, 1]T and the errors were drawn from a two-component mixture process.
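A data-generating process of this form can be sketched as follows. The mixture proportion and component standard deviations here are illustrative guesses, since the excerpt does not give the values used in the simulations.

```python
import numpy as np

def make_exemplar_data(N=100, p_outlier=0.1, sd=(0.5, 5.0), seed=0):
    """Synthetic GLM data: boxcar (period 10) plus constant regressor,
    w = [1, 1]^T, errors from a two-component Gaussian mixture.

    p_outlier and sd are hypothetical parameters, not the paper's.
    """
    rng = np.random.default_rng(seed)
    boxcar = np.tile(np.r_[np.ones(5), np.zeros(5)], N // 10)
    X = np.column_stack([boxcar, np.ones(N)])
    comp = rng.random(N) < p_outlier              # component indicator
    e = np.where(comp, sd[1], sd[0]) * rng.standard_normal(N)
    y = X @ np.array([1.0, 1.0]) + e
    return X, y
```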

So that the simulations are realistic,

Discussion

We have described a Bayesian learning algorithm for Robust General Linear Models (RGLMs), based on Roberts and Penny (2002), in which the noise is modeled as a Mixture of Gaussians. This allows different data points to be associated with different noise levels and effectively provides a robust estimation of regression coefficients.

A Bayesian inference framework is used to prevent overfitting and provides a model selection criterion for noise model order, e.g. to select noise mixtures with one, two or more components.

Acknowledgments

The authors are funded by the Wellcome Trust. We are grateful to Mohamed Seghier and Jorn Diedrichsen for helpful comments.

