Elsevier

NeuroImage

Volume 153, June 2017, Pages 399-409
Improving data availability for brain image biobanking in healthy subjects: Practice-based suggestions from an international multidisciplinary working group

https://doi.org/10.1016/j.neuroimage.2017.02.030

Highlights

  • Brain image biobanking with associated clinical data is increasingly common.

  • We provide recommendations following an interdisciplinary meeting of experts.

  • We discuss multidisciplinary issues relating to data collection and heterogeneity.

  • We discuss issues regarding databank infrastructure and management.

  • We suggest how to enhance the use and reuse of neuroimaging and clinical data.

Abstract

Brain imaging is now ubiquitous in clinical practice and research. The case for bringing together large amounts of image data from well-characterised healthy subjects and those with a range of common brain diseases across the life course is now compelling. This report follows a meeting of international experts from multiple disciplines, all interested in brain image biobanking. The meeting included neuroimaging experts (clinical and non-clinical), computer scientists, epidemiologists, clinicians, ethicists, and lawyers involved in creating brain image banks. The meeting followed a structured format to discuss current and emerging brain image banks; applications such as atlases; conceptual and statistical problems (e.g. defining ‘normality’); legal, ethical and technological issues (e.g. consents, potential for data linkage, data security, harmonisation, data storage and enabling of research data sharing). We summarise the lessons learned from the experiences of a wide range of individual image banks, and provide practical recommendations to enhance creation, use and reuse of neuroimaging data. Our aim is to maximise the benefit of the image data, provided voluntarily by research participants and funded by many organisations, for human health. Our ultimate vision is of a federated network of brain image biobanks accessible for large studies of brain structure and function.

Introduction

Neuroimaging has become embedded in substantial research endeavours to understand normal brain function and effects of disease (e.g. Thompson et al., 2003; Fox and Schott, 2004; Lemaitre et al., 2005; Marcus et al., 2009; Wardlaw et al., 2011a, 2011b; Weiner et al., 2015). Until recently, many neuroimaging studies were conducted in single centres and, inevitably, were of modest size (Dickie et al., 2012). Several much larger population scanning initiatives are now ongoing (Jack Jr et al., 2008), and many multicentre clinical trials routinely include imaging as part of inclusion criteria and as outcome measures (Cash et al., 2014), providing the potential for large multicentre collections capturing the range of brain structure in the population. The importance of maximising the value captured in this large amount of imaging data – to detect how differences in brain structure and function relate to behavioural or clinical outcomes – is now widely recognised (Toga, 2002; Barkhof, 2012; Poline et al., 2012). The value of data for answering new questions can grow with sample size, e.g. for replication, increasing population representativeness, and increasing study power. To address this issue, a growing number of electronic databanks including brain imaging are available, either from dedicated cohorts (e.g. Alzheimer's Disease Neuroimaging Initiative, UK Biobank, IMAGEN), or collections of studies (e.g. Brain Imaging in Normal Subjects, Dementia Platform UK, Open Access Series of Imaging Studies): see Table 1.

The wide variation in brain structure and function both within and between individuals at different ages has long been recognised (Wardlaw et al., 2011a, 2011b; Dickie et al., 2013). Methodologies that use appropriately representative populations are needed to provide normative reference data, particularly for healthy subjects (i.e. those without neurological diseases such as stroke or dementia). They can provide informative reports for users (e.g. ‘brain on 5th percentile for volume at age 70’ for a specified population) and simultaneously embrace the spectrum of individual variation (Dickie et al., 2015a, 2015b). Brain imaging is increasingly used in the diagnosis of neurological diseases and mental health disorders (Fox and Schott, 2004). Data from existing cohort or population studies (e.g. Marcus et al., 2009) can help define boundaries between health and disease, aid diagnosis and trial inclusion, provide effect size estimates for planning trials and, where relevant, provide controls for case-control studies (e.g. Dickie et al., 2015a; ADNI: Potvin et al., 2016).
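To make the percentile-style report mentioned above concrete, the following minimal sketch computes an empirical (non-parametric) percentile rank of one subject's brain volume against a reference sample for a given age band. The function name `percentile_rank`, the reference volumes and the age band are all hypothetical and invented for illustration; they are not taken from any of the cited databanks, and the cited studies may use different statistical methods.

```python
from bisect import bisect_right


def percentile_rank(reference_volumes, subject_volume):
    """Percentage of the reference sample with a volume at or below the subject's."""
    if not reference_volumes:
        raise ValueError("reference sample is empty")
    ordered = sorted(reference_volumes)
    # bisect_right counts how many reference values are <= subject_volume.
    return 100.0 * bisect_right(ordered, subject_volume) / len(ordered)


# Hypothetical brain volumes (ml) for healthy subjects in one age band.
# These numbers are illustrative only, not real data.
reference_ml = [1020, 1055, 1080, 1100, 1110, 1125, 1140, 1150, 1165, 1180,
                1190, 1200, 1215, 1230, 1240, 1255, 1270, 1290, 1310, 1340]

rank = percentile_rank(reference_ml, 1020)
print(f"Subject is on percentile {rank:.0f} of the reference sample")
```

In practice the reference sample would need to be far larger and matched to the population of interest, which is the point the paragraph above makes about representative normative data.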

Large repositories of brain imaging data from well-characterised subjects in accessible databanks are required to achieve this, while ensuring that data protection concerns are also addressed. These comprise data initiatives that are planned around harmonised protocols, such as ADNI (Alzheimer's Disease Neuroimaging Initiative) (Weiner et al., 2015), UK Biobank (Matthews & Sudlow, 2016), Human Connectome Project (van Essen et al., 2013) and OASIS (Open Access Series of Imaging Studies) (Marcus et al., 2007a, 2007b, 2009), and those that represent data aggregation without initial harmonisation, e.g. ENIGMA (Enhancing Neuro Imaging Genetics through Meta-Analysis; Thompson et al., 2014, 2015). The value of brain images is hugely enhanced by information on the characteristics of individual subjects and the study in which they participated, but at present studies vary widely in what data they present on the study, subject or image data, and how these data are presented (Dickie et al., 2012).

Only a small proportion of the images acquired for research are included in biobanks. In existing structural brain image biobanks, normal subjects over 60 years of age are relatively under-represented, have limited cognitive and medical metadata to support their classification as “normal” (Dickie et al., 2012), and are available with only a limited range of neuroimaging sequences. For example, fluid attenuated inversion recovery (FLAIR) and T2* volumes are often not available, although they are essential for sensitively identifying and quantifying white matter hyperintensities (WMH) and microbleeds respectively, neuropathologies present in normal ageing but associated with vascular cognitive impairment (Wardlaw et al., 2013; Ritchie et al., 2016). Newer initiatives like BRAINS (Job et al., 2016) provide a range of sequences (e.g., T1, T2, T2*, and FLAIR) for most subjects plus cognitive and medical information. Future data sharing will be facilitated by influencing how new data are collected in terms of core imaging sequences and metadata variables.

The INCF (International Neuroinformatics Coordinating Facility) Standards for Data Sharing Neuroimaging Task Force has developed the Brain Imaging Data Structure (http://bids.neuroimaging.io/) to advance standard organisation and descriptions of data files, and the Neuroimaging Data Model (http://nidm.nidash.org/) for data provenance tracking, but ongoing work is needed around developing community consensus and adoption of standards (Bjaalie and Grillner, 2007). Issues such as privacy, de-identification, quality control, provenance, avoiding including the same subjects in multiple databases, ethics (historical and future), consent, essential components of ‘good guardianship’, costs, sustainability, software version control, definitions of ‘normality’, and international variations in ethical and legal frameworks also need further consideration (Rodríguez González et al., 2010). The European Society of Radiology (ESR) published a position paper on Imaging Biobanks (European Society of Radiology, 2015) defining imaging biobanks, outlining their purpose, and advocating the creation of a network/federation of such repositories with existing biobanks.
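As an illustration of what a standard file organisation such as the Brain Imaging Data Structure means in practice, the sketch below builds BIDS-style relative paths for structural images. It is a simplified sketch, not a validator: the helper name `bids_anat_path` and the mapping `ANAT_SUFFIXES` are hypothetical, only the subject and session entities are handled, and the BIDS specification itself should be consulted for the full naming rules.

```python
from pathlib import PurePosixPath

# BIDS suffixes for the structural sequences named in the text
# (T1, T2, T2* and FLAIR). The dictionary itself is a hypothetical helper.
ANAT_SUFFIXES = {"T1": "T1w", "T2": "T2w", "T2*": "T2starw", "FLAIR": "FLAIR"}


def bids_anat_path(subject, sequence, session=None):
    """Return a BIDS-style relative path for one structural image."""
    suffix = ANAT_SUFFIXES[sequence]
    entities = [f"sub-{subject}"]
    folders = [f"sub-{subject}"]
    if session is not None:
        # The session level is optional and appears in both folder and filename.
        entities.append(f"ses-{session}")
        folders.append(f"ses-{session}")
    filename = "_".join(entities + [suffix]) + ".nii.gz"
    return PurePosixPath(*folders, "anat", filename)


for sequence in ANAT_SUFFIXES:
    print(bids_anat_path("01", sequence))
```

Encoding the subject, session and sequence in a predictable path is what lets tools and other research groups locate images without study-specific documentation.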

Many funders advocate or mandate that data generated by studies they fund are made public, and the International Committee of Medical Journal Editors (ICMJE) has proposed that deidentified patient information is shared before research manuscripts of randomised controlled trials will be considered for publication (Taichman et al., 2016). While this data sharing may be relatively straightforward for tabular demographic data (i.e. the types of alphanumeric data that can be held in traditional databases), the situation is much more complex for brain image data (Toga, 2002; Marcus et al., 2007a). Factors like the size of imaging files and the possibility of identifying subjects from images impose non-trivial technological challenges. While initiatives such as NeuroVault (www.neurovault.org; Gorgolewski et al., 2015) avoid the problem by publicly sharing statistical maps for data aggregation, they do not include whole datasets. By contrast, a repository like OpenfMRI (www.openfmri.org) includes raw data, with some subject-level variables, which allows newer analyses to be performed. Even when there is a desire to share imaging data, there are a number of technical, legal and practical problems to be overcome (Poline et al., 2012; Poldrack and Gorgolewski, 2014; Pernet and Poline, 2015).

Section snippets

Learning from existing databanks and population studies

Against this background, a group of experts, including specialists in image acquisition and analysis, clinical disciplines, epidemiology, legal, ethics, and data science, met to discuss and debate conceptual, legal, ethical and technical issues around creating brain image banks. We aimed to highlight the issues that need to be addressed, from the ethical to the practical, achieve some consensus, promote best practice and provide useful advice for ongoing and planned studies. The primary aim of

Data collection

There is a great willingness from many people across the life course to volunteer for brain imaging studies: even when the participants are in their nineties and the study includes prolonged imaging (Deary et al., 2012). However, such willing individuals – irrespective of age – tend to be fitter, better educated and less socially deprived than the general population (e.g. Deary et al., 2012; Stafford et al., 2013). Extra effort is therefore needed to encourage more representative population

Addressing data heterogeneity

Where more than one study is included in a brain image bank, like 3-CITIES (Alperovitch et al., 2002) or BRAINS (Job et al., 2016), there is usually substantial heterogeneity of the acquired demographic/clinical and imaging data. This can be addressed either by describing each variable (3-CITIES), or by harmonising metadata (BRAINS). Having many variables makes the database large and difficult to search, while transforming variables to agreed standards, which is simpler for the end user, is

Database infrastructure

Many of the studies that led to the creation of imaging databanks started over a decade ago, and reported issues relating to changing technology (Mazziotta et al., 2001). For example, technical staff need to consider the impact of hardware changes (e.g. upgrading or changing scanner software or hardware; changes in data storage solutions and formats) and software evolution, which can make keeping track of multiple analyses of the database challenging (Poldrack, 2014). Such changes in technology

Database management

The legal and ethical framework of individual countries, and agreements reached between them, may affect how and where data are or can be stored. Systems are required to ensure data security, but allow appropriate access. Relevant approvals should be transparent, e.g. in publications and on websites.

During the meeting it was recognised that brain image databanks should have a Steering Committee, including independent and lay representatives, to monitor and review progress. This has the

Conclusions

Brain image biobanking is a rapidly evolving field. Several related and relevant projects will complement our recommendations, such as the International Neuroinformatics Coordinating Facility (INCF) Neuroimaging Data Sharing Task Force (wiki.incf.org/mediawiki/index.php/Neuroimaging_Task_Force) meeting held at Stanford University on January 27–30th 2015, which led to the development of the Brain Imaging Data Structure (BIDS - http://bids.neuroimaging.io/, Gorgolewski et al., 2016).

A federated

Funding sources

The writing of this paper did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors. Guarantors of Brain, British Geriatrics Society (Scottish Branch), Royal Society of Edinburgh, SINAPSE (Scottish Imaging Network: a Platform for Scientific Excellence) SPIRIT, International Neuroinformatics Coordinating Facility and Nuffield Foundation made contributions towards funding the meeting which formed the basis of this paper.

Acknowledgements

Thanks to Guarantors of Brain, British Geriatrics Society (Scottish Branch), Royal Society of Edinburgh, SINAPSE (Scottish Imaging Network: a Platform for Scientific Excellence) SPIRIT, International Neuroinformatics Coordinating Facility and Nuffield Foundation for contributions towards funding the meeting which formed the basis of this paper. TEN is supported by the Wellcome Trust (100309/Z/12/Z).

References (78)

  • D.B. Keator et al.

    Towards structured sharing of raw and derived neuroimaging data across existing resources

    NeuroImage

    (2013)
  • D. Landis et al.

    COINS Data Exchange: an open platform for compiling, curating, and disseminating neuroimaging data

    NeuroImage

    (2016)
  • K.K. Leung et al.

    Effects of changing from non-accelerated to accelerated MRI for follow-up in brain atrophy measurement

    NeuroImage

    (2015)
  • H. Lemaitre et al.

    Age- and sex-related effects on the neuroanatomy of healthy elderly

    NeuroImage

    (2005)
  • A. Makropoulos et al.

    Regional growth and atlasing of the developing human brain

    NeuroImage

    (2016)
  • B. Mazoyer et al.

    BIL&GIN: a neuroimaging, cognitive, behavioral, and genetic database for the study of human brain lateralization

    NeuroImage

    (2016)
  • N. Merchant et al.

    A patient care system for early 3.0 T magnetic resonance imaging of very low birth weight infants

    Early Hum. Dev.

    (2009)
  • K.L. Mills et al.

    Methods and considerations for longitudinal structural brain imaging analysis across development

    Dev. Cogn. Neurosci.

    (2014)
  • K. Oishi et al.

    Multi-contrast human neonatal brain atlas: application to normal neonate development analysis

    NeuroImage

    (2011)
  • O. Potvin et al.

    Normative data for subcortical regional volumes over the lifetime of the adult human brain

    NeuroImage

    (2016)
  • D.C. Van Essen et al.

    The WU-Minn human connectome project: an overview

    NeuroImage

    (2013)
  • J.M. Wardlaw et al.

    Neuroimaging standards for research into small vessel disease and its contribution to ageing and neurodegeneration

    Lancet Neurol.

    (2013)
  • M.W. Weiner et al.

    Impact of the Alzheimer's Disease Neuroimaging Initiative, 2004–2014

    Alzheimers Dement

    (2015)
  • Bandrowski A., Brush M., Grethe J., Haendel M., Kennedy D., Hill S., Hof P., Martone M., Pols M., Tan S., Washington...
  • F. Barkhof

    Making better use of our brain MRI research data

    Eur. Radiol.

    (2012)
  • M. Blesa et al.

    Parcellation of the healthy neonatal brain into 107 regions using atlas propagation through intermediate time points in childhood

    Front. Neurosci.

    (2016)
  • J.G. Bjaalie et al.

    Global neuroinformatics: the international neuroinformatics coordinating facility

    J. Neurosci.

    (2007)
  • J.P. Boardman et al.

    Common genetic variants and risk of brain injury after preterm birth

    Pediatrics

    (2014)
  • D.M. Cash

    Imaging endpoints for clinical trials in Alzheimer's disease

    Alzheimers Res. Ther.

    (2014)
  • S. Das et al.

    LORIS: a web-based data management system for multi-center studies

    Front. Neuroinf.

    (2012)
  • S.R. Cox et al.

    Associations between education and brain structure at age 73 years, adjusted for age 11 IQ

    Neurology

    (2016)
  • I.J. Deary et al.

    Cohort profile: the Lothian Birth Cohorts of 1921 and 1936

    Int. J. Epidemiol.

    (2011)
  • D.A. Dickie et al.

    Do brain image databanks support understanding of normal ageing brain structure? A systematic review

    Eur. Radiol.

    (2012)
  • D.A. Dickie et al.

    Variance in brain volume with advancing age: implications for defining the limits of normality

    PLoS One

    (2013)
  • D.A. Dickie et al.

    Use of brain MRI atlases to determine boundaries of age-related pathology: the importance of statistical method

    PLoS One

    (2015)
  • B. Franke et al.

    Genetic influences on schizophrenia and subcortical brain volumes: large-scale proof-of-concept and roadmap for future studies

    Nat. Neurosci.

    (2016)
  • B.C.M. Fung et al.

    Privacy-preserving data publishing: a survey of recent developments

    ACM Comput. Surv. (CSUR)

    (2010)
  • S. Gadde et al.

    FBIRN, MBIRN, BIRN-CC XCEDE: an Extensible Schema For Biomedical Data

    Neuroinformatics

    (2012)
  • M. Ganguli et al.

    Who wants a free brain scan? Assessing and correcting for recruitment biases in a population-based sMRI pilot study

    Brain Imaging Behav.

    (2015)