The role of fMRI in drug development.

Functional magnetic resonance imaging (fMRI) has been known for over a decade to have the potential to greatly enhance the process of developing novel therapeutic drugs for prevalent health conditions. However, its use in drug development remains relatively limited because of a variety of technical, biological, and strategic barriers. Here, we briefly review the roles that fMRI can have in the drug development process and the requirements it must meet to be useful in this setting. We then provide an update on our current understanding of the strengths and limitations of fMRI as a tool for drug developers and recommend activities to enhance its utility.


Introduction and scope
Here, we provide an update on the state of the art in the use of fMRI in the drug development process, including the requirements it must meet, its current capabilities, challenges that limit its use, and a set of activities that are proposed to meet the challenges. Although our review covers both task-based and resting-state fMRI, it echoes some of the themes of a recent review that was limited in scope to only resting-state fMRI, including the requirements for use of fMRI as a biomarker, the need for collaborative research efforts and validation, and the challenge of biological confounds [1]. Here, we also provide an update on several of the issues raised by a review on this topic published over 10 years ago, especially in relation to homologies between animal and human fMRI data, limitations to the interpretability of fMRI data, and quantitative fMRI techniques [2]. Finally, we also update information about best practices for fMRI in clinical trials, a topic that has been presented previously [3,4].

Possible roles for fMRI in drug development
The challenges presented by drug development for central nervous system (CNS) indications have motivated the search for pharmacodynamic (PD) methodologies that readily translate from preclinical models to patients and predict clinical efficacy [21]. Preclinical studies use transgenic or inducible rodent models of disease and in vitro studies that serve to characterize the pharmacologic properties and predicted clinical effects of a novel molecule [22][23][24][25]. However, many of these approaches lack predictive validity for complex human neuropsychiatric disorders. In addition, comparing a novel first-in-class molecule to a 'gold standard' reference molecule can be difficult: there is no useful reference for many of the psychiatric conditions with large unmet needs, and existing preclinical models might have relatively low sensitivity for novel mechanisms of action [26,27]. These difficulties motivate the search for alternative measurements in animal models that, alongside existing in vitro and animal model techniques, help inform human studies via homologies between animal and human measurements.
Although not yet commonly used for this purpose, fMRI has the potential to help meet this need.
In early-Phase clinical studies, fMRI methods can provide a means to detect a functional CNS effect of pharmacological treatment in brain regions appropriate to the mechanism and/or target population of the compound [9][28][29][30]. Although it is not technically a marker of target engagement (i.e., of pharmacological agent binding to a target site), an fMRI signal can provide indirect evidence of target engagement if a biologically plausible link can be established between the fMRI response and the molecular target [31][32][33]. Dose-response and exposure-response relationships established using fMRI are of particular value to guide dose selection for later Phases [34][35][36].
Later phases (Phase 2 and 3) involve large patient studies at multiple clinical sites designed to identify or confirm clinical efficacy [37]. The emphasis for fMRI in these types of study is more likely to be on demonstrating normalization of a disease-related fMRI signal, at one or very few dose levels. Most such studies are intended for submission to regulators as part of an application for new drug approval. Consequently, these studies might include fMRI to provide a more objective demonstration of disease modification, thereby increasing the evidence base for a regulatory submission.

How fMRI is viewed by regulatory agencies
The US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) are regulatory agencies that approve the commercialization and human use of new drugs [38,39]. Approval is based on the review of evidence provided by drug sponsors on the safety and efficacy of the new drug in treating specific disease indications [40,41]. Both agencies have acknowledged that, for the process of drug development to continue to thrive, novel technologies that facilitate the drug development process have to be continuously developed [42][43][44]. This is why both agencies have proposed a formal process for the qualification of technologies, such as fMRI, for specific fit-for-purpose uses in drug development. Medical imaging tools have typically been considered as biomarkers in this context, that is, 'a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention' [45]. For the regulatory agencies, biomarkers could be used in drug development to: (i) more accurately define a disease (i.e., a diagnostic biomarker); (ii) stratify patients by disease severity (i.e., a prognostic biomarker); (iii) identify patients most likely to benefit from therapy (i.e., a predictive biomarker); and (iv) monitor response to therapy (i.e., a PD biomarker) [46]. Interested parties can submit a request for qualification of a biomarker if they believe that there is a need that can be met by the biomarker in a specific context of use and enough data to support its use in that context. The agencies will review the application and issue either an opinion on whether they agree with the strength of the argument for the biomarker use and context, or advice about what might be additionally required to issue a qualification opinion.
Once a biomarker is qualified for use in a predefined context, the agencies will accept biomarker data as evidence, within the context of use, for new drug safety and efficacy. For example, total kidney volume measured by imaging techniques has been proposed by the Polycystic Kidney Disease Outcomes Consortium as a prognostic biomarker for enrichment of clinical trials in autosomal dominant polycystic kidney disease and qualified by the FDA and EMA. However, change in kidney volume has not yet been qualified as a PD biomarker [47]. Molecular imaging biomarkers of the dopamine transporter are also being considered for qualification as enrichment biomarkers for clinical trials [48].
The burden of proof for qualification of a biomarker, such as fMRI, is high, and has not yet been fully standardized by the FDA and EMA. Typically, the agencies require data from more than a few small trials in which the biomarker has demonstrated value in the specified context of use. Also required are characterizations of the precision and reproducibility of the biomarker. Despite scientific interest in fMRI, there remain few industry-sponsored trials with sufficiently rigorous fMRI data for regulatory agencies to consider when reviewing an application for a new therapeutic. No requests have been made to qualify fMRI as a drug development tool. However, a consortium of academic institutes and industry managing 'The European Autism Interventions - A Multicentre Study for Developing New Medications' (EU-AIMS), a project funded and run under the Innovative Medicines Initiative (IMI), has requested advice and support for the qualification of several imaging biomarkers to be used to stratify populations of patients with autism spectrum disorder (ASD). Among those biomarkers is fMRI using an animated shapes theory-of-mind task and social and nonsocial reward anticipation task paradigms. The EMA is considering these fMRI biomarkers and has issued a letter of support to explore these biomarkers further [49].
To explain why biomarker qualifications of fMRI have been limited to date, below we review what is required for fMRI to demonstrate value and the current challenges that limit its ability to be useful in this setting.

Current state of fMRI as a tool for drug developers
The past 20 years of research on fMRI in drug development have clarified what technical and logistical requirements the technology must meet to be useful in a clinical trial setting, as well as what useful capabilities it is known to have. Here, we summarize these requirements and capabilities.
Requirements
fMRI readouts should be both reproducible and modifiable by the pharmacological agent-Ideally, fMRI readouts would be standardized and broadly accepted; this would facilitate comparison between studies conducted at different sites, by different sponsors, and with different molecules. As with any assay designed to assess an intervention, the reproducibility of an fMRI paradigm forms part of its initial characterization and validation as fit for purpose. In addition, the readouts should respond to pharmacological manipulation: a Phase 1 fMRI study, for example, would be expected to establish dose-response and exposure-response relationships between the selected readouts and the administered compound to inform dose selection for subsequent patient trials [32,35][50][51][52]. Both reproducibility and responsiveness are important: a paradigm that is highly reproducible but impervious to pharmacological manipulation will not be useful, for example. Ideally, evidence of pharmacological modulation should be presented with suitable comparator compounds [53,54].
Measurement characteristics of fMRI equipment need to be carefully established-To ensure the reliability, sensitivity, specificity, and accuracy of collected fMRI data, a quantitative, industry-standard method for assessing the fMRI measurement process is required. For structural MRI, the National Institute of Standards and Technology/International Society for Magnetic Resonance in Medicine (NIST/ISMRM) system phantom provides such a method, including standardized MRI readouts to which scanners can be calibrated, as well as international standards for those readouts [55]. These readouts include contrast, resolution, and accuracy of distance and volume measurements in the image space. The NIST/RSNA Quantitative Imaging Biomarkers Alliance (QIBA) apparent diffusion coefficient (ADC) phantom similarly provides standardized readouts and international standards for diffusion MRI sequences [56]. Combining this approach with dynamic assessments, such as temporal signal-to-noise ratios (SNR), as recommended by the Functional Biomedical Informatics Research Network (fBIRN) [57], would be required to allow scanners worldwide to be quantitatively evaluated for their fMRI performance. The most widely used and validated techniques could not only be standardized, but also tied to a quantitative gold standard recognized by global regulatory bodies. This level of industry-standard measurement quantification is required for fMRI to be a viable technique in late-stage clinical trials, especially multisite ones.
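As a minimal illustration of the kind of dynamic assessment mentioned above, the following Python sketch computes a temporal SNR as the temporal mean of a signal divided by its temporal standard deviation. This simple definition, the absence of detrending, and the phantom values are assumptions made for illustration; they are not taken from the fBIRN recommendations.

```python
from statistics import mean, stdev


def temporal_snr(series):
    """Temporal SNR of one voxel or ROI-averaged time series: mean / SD over time."""
    sd = stdev(series)  # sample standard deviation across volumes
    if sd == 0:
        raise ValueError("constant time series: temporal SNR is undefined")
    return mean(series) / sd


# Hypothetical phantom signal (arbitrary units) across eight volumes
phantom = [1000, 1004, 996, 1002, 998, 1001, 999, 1000]
print(f"temporal SNR: {temporal_snr(phantom):.1f}")
```

In practice, such a value would be tracked per scanner over time and compared against a prespecified acceptance range.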
fMRI acquisition and analysis should be prespecified before the study-The extent to which the acquisition, processing, and derivation of fMRI endpoints can be prespecified will impact its utility as a tool for clinical trials. For much of its history, fMRI research has emphasized mapping the linkage between brain activation and behavior, thus representing a signal detection problem [58]. Unfortunately, attempts to distinguish signal from noise are fraught with confounds because of the high dimensionality of fMRI data, leading to a high probability of type I errors. The risk of false positives is compounded when users do not understand the statistics underlying their empirical claims (Henson's 'imager's fallacy' [59]) or engage in circular selection ('voodoo correlation') [60] and non-independent analysis ('double dipping') [61]. A related concern about underpowered fMRI studies was first raised in the commentary to the 'voodoo correlation' paper [62] in the context of fMRI for human brain mapping, and was later elaborated in depth [63,64]. When a study is underpowered, the power (probability) to detect a true effect is low. One consequence is overestimation of the true effect, because only large observed effects pass the P value threshold; this is also referred to as the 'winner's curse'. Follow-up studies then show low reproducibility, because they find evidence of smaller or no effects, thus failing to reproduce the findings of the initial study. As the field has evolved, more emphasis has been placed on a clear reporting of methodology, including experimental design, correction for multiple comparisons, region of interest (ROI) definition, and the statistical tests performed, ideally with sufficient detail to allow replication of the analysis [65,66].
Yet, even the most clearly reported post-hoc analysis is insufficient to inform a clinical trial where experimental power and reliability must be estimated in advance, with implications for study design and sample size [67]. In the context of drug development, fMRI methodology should be held to the same standard as other clinical endpoints, namely, methods must be prespecified and fixed for the duration of the study. This prespecification should include a thorough description of task design and implementation, image acquisition and quality control, data preprocessing, ROI definition, model estimation, and endpoint calculation [3,4]. With such prescribed methodology in hand, an fMRI experiment can be reduced to a binary outcome more suitable to inform drug development decisions. In drug development, primary endpoints and hypotheses are typically based on prespecified ROIs [68,69] and power calculations are performed accordingly to avoid underpowering.
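To make the link between underpowering, effect overestimation, and prospective power calculation concrete, the following Python sketch first simulates many small studies of a single prespecified ROI endpoint and then computes a per-arm sample size for a two-arm parallel design. The standardized effect sizes, the sample size of 15, the z-test with known unit variance, and the normal-approximation formula are all illustrative assumptions rather than values from any study cited here.

```python
import random
from math import ceil
from statistics import NormalDist, mean


def simulate_small_studies(true_effect=0.3, n=15, n_studies=5000, alpha=0.05, seed=0):
    """Simulate one-sample studies of an ROI endpoint with unit variance.

    Returns (fraction of studies reaching significance,
             mean estimated effect among those significant studies).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = []
    for _ in range(n_studies):
        estimate = mean([rng.gauss(true_effect, 1.0) for _ in range(n)])
        # z-test with known SD = 1; only positive significant results are kept
        if estimate * n ** 0.5 > z_crit:
            significant.append(estimate)
    return len(significant) / n_studies, mean(significant)


def n_per_arm(effect_size, alpha=0.05, power=0.8):
    """Normal-approximation sample size per arm for a two-arm parallel design."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / effect_size) ** 2)


power, inflated = simulate_small_studies()
print(f"empirical power: {power:.2f}")
print(f"mean effect among significant studies: {inflated:.2f} (true effect 0.30)")
print("per-arm n for a standardized effect of 0.5:", n_per_arm(0.5))
```

Under these assumptions, every study that reaches significance necessarily reports an effect larger than the true one, which is the overestimation described above; the second function shows how a prespecified endpoint and assumed effect size translate into a sample size before any data are collected.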
For later-phase human trials, heterogeneous multiple-site implementation must be feasible-Significant logistical requirements must be met when deploying fMRI to large numbers of heterogeneous, nonacademic imaging facilities, as is required for typical late-stage clinical trials. The Alzheimer's Disease Neuroimaging Initiative (ADNI) [70], the IMAGEN study [71], the Human Connectome Project [72,73], and the Functional Biomedical Informatics Research Network (fBIRN) [74] have advanced the state of the art in techniques for meeting these requirements. For multicenter fMRI at the level of dozens to hundreds of imaging facilities, methods must be 'turnkey' and able to accommodate all major manufacturers, models, and even field strengths. High-performance MRI scanners are not always located near subject recruitment centers, necessitating pragmatic decisions in the planning phase of a multicenter fMRI study. The most basic decision is whether fMRI acquisition can involve 1.5 T or 3 T systems. If acquisition must occur at multiple field strengths to facilitate recruitment, analytic endpoints emphasizing within-subject designs are most amenable, although ways to account for variance across magnets exist [75,76]. The decision to include 1.5 T systems also has implications for acquisition parameters and how these are standardized across sites. Clinical imaging facilities are so diverse in their equipment (vendor, model, software release, coil, and gradient configuration) and technologist experience level that complete standardization is not feasible. Instead, parameters most likely to impact endpoint derivation should be identified and fixed across sites, while other less-critical settings can be allowed to 'float'. Thus, protocol optimization involves achieving certain fixed parameters and adjusting others to maximize performance at that site.
At a minimum, factors that should be consistent across sites are the type of pulse sequence, in-plane voxel size, slice thickness and spacing, temporal resolution (repetition time, TR) and number of observations, flip angle, coverage, and behavioral conditions. Suggestions for this parameter set, and the impact of different choices, have been proposed elsewhere [77]. A field map (including reconstruction of both magnitude and phase information) should be acquired to allow for distortion correction of the echo planar imaging (EPI) time course [78]. Multichannel coils are preferred for the increased signal-to-noise ratio they provide [79]. Fat suppression and parallel imaging should be used if available to increase tissue contrast and reduce acquisition times [80,81].
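The distinction between fixed and 'floating' parameters can be sketched as a simple compliance check in Python. The parameter names and values below are hypothetical placeholders chosen for illustration, not recommended settings, and a real check would read the values from image headers rather than from a dictionary.

```python
# Hypothetical fixed parameters; the values are placeholders, not recommendations.
FIXED_PROTOCOL = {
    "pulse_sequence": "gradient-echo EPI",
    "in_plane_voxel_mm": 3.0,
    "slice_thickness_mm": 3.0,
    "slice_gap_mm": 0.0,
    "tr_s": 2.0,
    "n_volumes": 180,
    "flip_angle_deg": 80.0,
}


def protocol_deviations(site_params, protocol=FIXED_PROTOCOL, tol=1e-3):
    """Return {parameter: (expected, observed)} for each fixed parameter that differs.

    Settings that are not listed in the protocol are treated as floating and ignored.
    """
    deviations = {}
    for name, expected in protocol.items():
        observed = site_params.get(name)  # None if the site did not report it
        if isinstance(expected, (int, float)) and isinstance(observed, (int, float)):
            matches = abs(observed - expected) <= tol
        else:
            matches = observed == expected
        if not matches:
            deviations[name] = (expected, observed)
    return deviations


# A site that matches the protocol except for TR; the coil is a floating setting.
site = dict(FIXED_PROTOCOL, tr_s=2.5, coil="32-channel head")
print(protocol_deviations(site))
```

Keeping the fixed set small and explicit makes it straightforward to audit each site against the same list while leaving the remaining settings free for local optimization.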
In our experience, conditions allowing the assessment of whole-brain functional connectivity are achievable at most scanners encountered at clinical imaging facilities worldwide. Indeed, published data suggest that carefully controlled, prespecified, auditable task-free fMRI studies can yield a rich connectome amenable to informing drug development, once logistical hurdles are overcome [82]. However, because of broad diversity across sites, leading-edge acquisition techniques, such as multiband fMRI [83], are not presently realistic in this setting, and neither are hardware peripherals needed for task-based fMRI typically available.
Technologists must be proficient in fMRI acquisition-A standardized imaging protocol will only perform well in the hands of a well-trained technologist. Although the technologist is the person most directly involved in acquiring the data, surprisingly little attention is given to the conduct and verification of their training for clinical trial protocols. Although most clinical-imaging personnel have never encountered fMRI before, they can become proficient in its proper execution through a combination of detailed reference materials, training via the phone or web, and test scans on a phantom or healthy volunteer. The technologist must be consistent in setting up the room, positioning and instructing the subject, and placing the field of view. Perhaps most important, they must be equipped to intervene when data quality is substandard, for example, detecting excessive head motion and reimmobilizing the subject for a repeat scan. Finally, special procedures need to be in place for the handling and transfer of an fMRI time series, which can run to several thousand digital imaging and communications in medicine (DICOM) files on some systems [3].
Rigorous quality-control procedures must be in place-A rigorous quality-control (QC) procedure must be in place to ensure that the site acquires analyzable data for the duration of the trial. Errors left undetected quickly become systematic and could render the data from all subjects at a site worthless. This QC must not only detect errors at the site, but also be tracked and fully auditable, especially in the context of regulatory approval [3]. Again, suggestions for what such a QC process should entail have been made [84], but at minimum it should include DICOM header checks of protocol compliance, tests of dynamic range and temporal SNR, artifact inspection, adequacy of field of view (FOV) placement, and head motion (now commonly reduced to a single vector, framewise displacement [85]).
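As an illustration of the head-motion check, the following Python sketch computes framewise displacement from six rigid-body realignment parameters per volume, using a commonly used formulation: the sum of absolute frame-to-frame changes, with rotations converted to arc length on a sphere. The 50 mm radius, the parameter ordering, and the 0.25 mm flagging threshold are assumptions for illustration and may differ from the definition in the cited work.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Framewise displacement (mm) for each volume.

    motion: list of (tx, ty, tz, rx, ry, rz) per volume, with translations in mm
    and rotations in radians. The first volume has no predecessor and is set to 0.
    """
    fd = [0.0]
    for prev, curr in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3]))
        # Convert each rotation change to arc length on a sphere of radius_mm
        rotation = sum(abs(c - p) * radius_mm for p, c in zip(prev[3:], curr[3:]))
        fd.append(translation + rotation)
    return fd


# Hypothetical realignment parameters for three volumes
motion = [
    (0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.2, 0.0, 0.002, 0.0, 0.0),
]
fd = framewise_displacement(motion)
flagged = [i for i, value in enumerate(fd) if value > 0.25]  # illustrative threshold
print(fd, flagged)
```

A prespecified rule on the resulting values, for example a maximum number of flagged volumes, gives the technologist and the QC reviewer an objective criterion for repeating a scan.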

Current capabilities
Drugs can induce acute and longer-term changes in fMRI signals-A large number of published studies have demonstrated that a range of fMRI methods and paradigms are sensitive to changes following both acute (i.e., after a single dose) and chronic (i.e., multiple dose) pharmacological treatment. Most of these have been academic studies using marketed drugs whose efficacy and effective doses have already been established. Several studies have shown that the fMRI response of the amygdala to photographs of faces with negative affect is increased in patients with depression, and that antidepressant drugs can normalize this response at clinically effective doses [86,87]. Intravenous challenge with the NMDA receptor antagonist ketamine, used as a model of glutamatergic dysfunction in psychiatric disorders, elicits a widespread fMRI response [88][89][90][91][92][93][94]. The ketamine response can be blocked by antiglutamatergic compounds [88,89] and can reverse the pharmacological MRI (phMRI) signal evoked by some (but not all) antipsychotic agents and compounds designed to attenuate glutamate release [95,96]. Additional drug classes that induce phMRI signals include analgesics [97][98][99][100][101], antipsychotics [102], cognitive enhancers [103], drugs of abuse [104,105], calcium channel blockers [106,107], cyclooxygenase-2 (COX-2) inhibitors [108], muscarinic acetylcholine receptor modulators [109][110][111], and therapies traditionally thought to impact solely immune system activity [112]. In addition to traditional phMRI studies, pharmacological modulation of functional connectivity in animal models and humans has also been reported [113][114][115][116][117][118].
In many of these cases, phMRI signals can be hypothesized based on a priori knowledge of CNS target expression derived from other data sources, such as PET [119][120][121][122]. The identified drug-associated phMRI signals likely include both direct ('on-target') and indirect ('downstream') effects. In this context, 'on-target' refers to changes in brain activity directly related to and colocalised with the site of the pharmacological action of the compound, whereas 'downstream' effects include activation of connected brain circuits and structures that might not have a high density of the pharmacological target but whose activity is altered by administration of the compound. Both of these contributions to the signal can have implications for efficacy and safety [123].
Pain research exemplifies how fMRI techniques can be utilized to not only phenotype patients, but also to determine whether a PD effect is present within specific disease pathways. MRI has been pivotal to identifying the network of brain regions modulated during acute, nociceptive processing of evoked sensory stimuli (e.g., thermal or mechanical stimuli) and how parameters that define the stimulus (e.g., magnitude, duration, frequency, or mode) are represented in the brain [124]. For example, it is well known that as the intensity [125] or frequency [126] of thermal stimuli increases, the level of potentiation increases in spinal cord, brainstem, subcortical, and cortical structures. It is also well known that stimulus-dependent potentiation, as well as attenuation, occurs in CNS structures and circuits mediating not only sensory aspects of pain, but also affective and motivational ones [127,128]. It can be argued that the nonsensory components of pain are just as crucial as the sensory aspects of pain in driving the overall experience of pain and lowering quality of life [129]. In fact, in imaging studies of patients with chronic pain, the function of regions such as the nucleus accumbens [130], insula [131,132], and cingulate [133] has been shown to correlate with the level of reported clinical pain or disease-specific pain (e.g., knee movement in patients with knee osteoarthritis).
An additional example is fMRI of opioid antagonists combined with receptor occupancy (RO)-PET. The complementary value of fMRI when combined with RO-PET, providing central PD in addition to target engagement data, was illustrated in a combined RO-PET and fMRI study of novel (GSK1521498) and comparator (naltrexone) μ-opioid antagonists [134], compounds that had been studied as potential therapeutics in disorders of compulsive consumption [135][136][137]. The fMRI paradigm was designed to test the hypothesis that these compounds attenuate the functional brain response to a palatable gustatory stimulus. Following μ-opioid RO using [11C]-carfentanil PET, revealing dose-dependent receptor occupancy in the ventral striatum, GSK1521498 attenuated the BOLD responses in the amygdala and nucleus accumbens more strongly than did naltrexone. Given that both modalities were acquired from the same patients, the fMRI analysis could explicitly take account of RO within the same patient.
fMRI can identify converging mechanisms across drugs-Pain provides a clear case in which fMRI not only elucidates dysfunction among common and disparate CNS structures across diverse clinical conditions, but also overlapping PD effects upon aberrant fMRI signals [138]. In particular, in fibromyalgia [139], osteoarthritis pain [140], complex regional pain syndrome [141], and models of neuropathic pain [142], abnormal insula and cingulate cortex activity has been reported as being suppressed through a variety of pharmacological strategies. As importantly demonstrated by Harris et al., functional neuroimaging endpoints derived from the insula and inferior parietal lobe predicted pregabalin treatment response towards experimentally evoked or clinical pain states [143]. In patients with hand osteoarthritis, the change in visual analog scale (VAS) pain levels quantified between placebo and naproxen treatments and reported during evoked hand pain correlated with the change in thalamic and somatosensory cortex BOLD responses. Nonetheless, the causal or correlational relationship between fMRI endpoints and subjective accounts of pain is not ubiquitous within the pain neuroimaging literature. For instance, Flodin et al. did not find that functional connectivity changes observed in patients with rheumatoid arthritis (RA) correlated with clinical endpoints [i.e., global VAS, disease activity score (DAS)-28, or RA duration], an observation that might have resulted as a consequence of enhanced head movement within the patient cohort or inadequate powering of the study [144].
Functional MRI findings in humans [97] and animals [145] clearly indicate that multivariate fMRI patterns are shared across compounds that are effective for the same indication (e.g., acute pain) but nonetheless differ widely in mechanism (e.g., opioids, nonsteroidal anti-inflammatories, and even tetrahydrocannabinol). However, a mechanism with direct access to the CNS might not be necessary to impact brain function and yield analgesia; even so, detecting a PD effect in the CNS could represent an early and important biomarker suggestive of analgesic efficacy. For example, in arthritic and rheumatic conditions, peripherally acting therapies result in clinically meaningful analgesia but also induce functional changes in CNS structures mediating sensory, affective, and motivational aspects of pain [108,146,147]. A recent review exemplifying the utility and challenges of fMRI in measuring CNS function in diseases harboring clinical pain in conjunction with analogous evaluation in corresponding preclinical pain models is provided elsewhere [148].
fMRI can support translation between clinical and preclinical studies-In drug development, the clinical biomarker plan for a candidate therapeutic is formulated while the compound is still being optimized in preclinical testing (the discovery phase). It is advantageous to be able to demonstrate an effect of the compound in preclinical species (typically rodents) on the same or similar biomarker as that being considered for use in the clinical phases. Recent studies have addressed the degree to which fMRI responses and their pharmacological modulation are consistent across species [149]. For example, acute intravenous injection of the NMDAR antagonist phencyclidine (PCP) elicits a strong phMRI response pattern in the rat brain involving the prefrontal and cingulate cortices and the thalamus and, to a lesser extent, the hippocampus [96].
This aligns well with the pattern observed in rat 2-deoxy-glucose (2DG) studies in response to ketamine [150]. A similar activation pattern was observed in healthy human volunteers given ketamine [88,89,151]. In a second example, the phMRI response to intravenous buprenorphine was concordant in many regions in rats and healthy humans [120], although deactivation of some regions was noted in rats but not in humans.
Only recently has rsfMRI been convincingly back-translated to rats and mice. Several groups identified functional networks in the rodent brain that align with known anatomical connectivity and are similar to analogous networks in the human brain [152][153][154]. These include a hippocampal-prefrontal system considered to be the rodent analog of the default mode network (DMN) [155][156][157][158], a frontolateral network (anatomically similar to the human 'task-positive' network and anticorrelated to the DMN [159]), and bilateral sensory and subcortical systems [160]. Emerging data have begun to demonstrate consistent pharmacological modulation of resting-state networks across species [116].

Challenges
In addition to the requirements of the drug development process and the capabilities of fMRI in this setting, there is a set of challenges that must be addressed to increase the utility of fMRI in clinical trials. These challenges include technical ones about how fMRI studies should be performed, as well as biological ones about the inferences that should be made about the functioning of the brain based on fMRI data.

Technical challenges
Optimal methods for acquisition and postprocessing of fMRI data are not established-Each fMRI experiment includes a large space of design parameters, both at the acquisition and postprocessing stages. For acquisition, Inglis [161] provides a list of dozens of parameters that must be set and identifies those that should be required to be reported in publications (Table 1 in Ref. [161]). Postprocessing steps generally include realignment, slice-timing correction, co-registration, normalization, segmentation, and smoothing [162]. Additional steps include regressing out motion and physiological artifacts. Each postprocessing step is implemented differently in different software packages, each with their own sets of operating parameters. Guidance on how exactly to set these operating parameters to optimally detect fMRI signals of interest or to enhance intra- or intersubject reliability is sparse and incomplete, with some noteworthy exceptions. For example, Poldrack et al. provide guidelines for task-related fMRI experimental design, preprocessing, and statistical modeling, including how many experimental sessions and volumes per session should be acquired [66]. Van Dijk et al. [77] evaluated a variety of acquisition and postprocessing parameters for rsfMRI to determine whether settings of these parameters made any appreciable difference to the detection of DMNs and attention networks, against a reference network. Carp [163] evaluated 6912 unique analysis pipelines for a single event-related fMRI experiment, reporting substantial variability across pipelines in terms of BOLD signal strength, localization, and spatial extent. These reviews and studies necessarily only provide partial guidance about the large fMRI design space and suggest that greater knowledge of optimal methods for acquisition and processing of fMRI remains a major research need.
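The size of the analysis design space grows multiplicatively with each independent choice, which the following Python sketch illustrates by enumerating every combination of a few options. The option names and values are hypothetical and are not the factors examined in the cited pipeline comparison.

```python
from itertools import product
from math import prod

# Hypothetical analysis choices, for illustration only.
OPTIONS = {
    "slice_timing_correction": [True, False],
    "smoothing_fwhm_mm": [4, 6, 8],
    "motion_regressors": [0, 6, 24],
    "high_pass_cutoff_s": [64, 128],
    "hrf_model": ["canonical", "canonical+derivatives"],
}


def enumerate_pipelines(options):
    """Return one dict per unique combination of the supplied option values."""
    names = list(options)
    return [dict(zip(names, combo)) for combo in product(*options.values())]


pipelines = enumerate_pipelines(OPTIONS)
print(len(pipelines), "pipelines from", len(OPTIONS), "choices")
```

Even this small set of five choices yields dozens of distinct pipelines, which is one reason prespecifying a single pipeline matters for a clinical trial.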
A larger problem than how to set specific operating parameters within existing processing paradigms is whether the paradigms themselves are optimal. The general linear model for modeling fMRI data, for example, is at the heart of most major data-processing software but has limitations, and alternatives have been proposed [164]. In addition, the spatially invariant hemodynamic response function is predominant, despite longstanding evidence that hemodynamic response characteristics vary by location [165]. Given that alternatives to these dominant processing paradigms are not implemented in broadly deployed software, it is not clear in what situations they provide superior results to more-conventional paradigms.
Optimal summary measures from postprocessed fMRI data are not established-Although image-based representations of group-level analyses are commonly seen as the primary fMRI output of interest in a research setting, predefined numeric summary values from each scan are needed to use fMRI as a biomarker in drug development studies. There are several motivations for summary values: (i) the drastically lower dimensionality allows results of the fMRI experiment to be imported into standard databases, combined with other data, such as PK, and analyzed by accredited statisticians. These databases are auditable and operate under strict revision and access control, safeguards that assure a high level of data integrity; (ii) defining summary readouts as primary endpoints before conducting the study requires the practitioner to prescribe specific and simple a priori hypotheses, an exercise that can highlight uncertainty in the anticipated outcome and prevent false positive rate inflation [166]; (iii) summary values can reduce the multiple comparison burden and, thus, avoid ill-defined choices among correction schemes; and (iv) endpoints extracted from voxelwise analyses can be subject to bias because of circularity [61], and determining the neuroanatomical localization of an effect in a set of voxels can be cumbersome or ill-defined.
However, although there is a clear need for low-dimensional fMRI summary measures, there is no widespread agreement on optimal approaches. Graph-theoretical summaries [167] and factor analytic methods, such as independent components analysis (ICA) [168], are used extensively in exploratory research studies, but these methods have many variants and operating parameters. The more-traditional approach of selecting an ROI and reducing all signals in the region to a single summary score [169] requires a choice of summary score (e.g., the mean, median, or mode). The optimal summary measure for any given clinical trial is unclear and can depend on the hypothesized action of the drug. If the treatment is hypothesized to strongly affect a single brain region, an ROI-based analysis might be effective, but if effective correlations among regions are hypothesized to be modified, a graph-theoretical approach might be preferred. If the hypothesized effects cover a broad network of regions, a seed-based or ICA approach might capture the hypothesized effect. One reason for the lack of broad agreement on standardized summary measures for fMRI is that the range of task paradigms, pharmacological mechanisms, and imaging sites involved is large, and few standardization studies have been published.
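As a minimal sketch of the ROI-based reduction described above, the following Python snippet collapses the voxel values inside a region to a single predefined number. The function name `roi_summary`, the toy statistic map, and the cubic mask are illustrative assumptions and do not come from any cited pipeline; in a real study, the map and mask would come from the study's own preprocessing and atlas.

```python
import numpy as np

def roi_summary(stat_map, roi_mask, statistic="mean"):
    """Reduce all voxel values inside an ROI to one summary number.

    stat_map : 3D array of voxelwise values (hypothetical, e.g., contrast estimates)
    roi_mask : 3D boolean array of the same shape, True inside the ROI
    statistic: "mean" or "median" (the choice must be fixed a priori)
    """
    values = stat_map[roi_mask]
    values = values[np.isfinite(values)]  # ignore NaN/inf voxels
    if values.size == 0:
        raise ValueError("ROI contains no finite voxels")
    if statistic == "mean":
        return float(np.mean(values))
    if statistic == "median":
        return float(np.median(values))
    raise ValueError(f"unsupported statistic: {statistic}")

# Toy data (assumption): a 4x4x4 map with a 2x2x2 ROI and one outlying voxel.
stat_map = np.zeros((4, 4, 4))
stat_map[1:3, 1:3, 1:3] = 2.0
stat_map[1, 1, 1] = 10.0
roi_mask = np.zeros((4, 4, 4), dtype=bool)
roi_mask[1:3, 1:3, 1:3] = True

roi_mean = roi_summary(stat_map, roi_mask, "mean")      # pulled up by the outlier
roi_median = roi_summary(stat_map, roi_mask, "median")  # robust to the outlier
```

The toy example shows why the choice of summary score matters: a single extreme voxel shifts the mean but not the median, so the statistic should be specified before the study is unblinded.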
Repeatability of fMRI paradigms is under-reported-One key reason for a lack of agreement on optimal fMRI methods is that rigorous validation studies, such as test-retest experiments, are rarely published. The intraclass correlation coefficient (ICC) is by far the most frequently used metric for quantifying test-retest reliability in the few fMRI studies that have been published [67,170-173]. ICCs for fMRI signals evoked by a robust visual stimulus in primary visual cortex range from fair to high (0.4-0.8) [172]. Fair-high reliability was also found for strong painful stimuli (ICC = 0.5-0.875) [125]. A cognitive emotive test battery [67] comprising faces, reward, and n-back tasks showed fair-good reliability for the reward task (ICC = 0.56-0.62) and, depending on the ROI, also fair-good reliability for the n-back task (ICC = 0.44-0.57). For the faces task, the reliability was low (0.0-0.16). For phMRI using ketamine [151], ICCs were high for ROIs that were expected to respond to ketamine and ranged from fair to high across the brain (ICC = 0.30-0.73). Similar findings were also reported for the reliability of rsfMRI (ICC = 0.5-0.67) [171,173]. As with stimulus-driven fMRI, there is a consistent association between greater strength of resting-state functional connectivity and greater reliability. Even when within-subject reliability is modest, group-level reproducibility is usually good [174]. Paradigms with low within-subject reproducibility are unsuitable for within-subject (crossover) study designs but might be useful in parallel-group designs. For example, the same faces task that showed poor within-subject reproducibility [67] has been widely used in parallel-group designs, reproducibly showing exaggerated amygdala responses to negative emotional stimuli in conditions such as depression, and pharmacological attenuation of this response following drug treatment [86].
These studies have investigated the repeatability of a very small segment of the large fMRI design space, and there is much room for additional validation studies. Repeatability and/or reliability estimation is an integral part of proof-of-concept (POC) fMRI studies in drug development. Given that the main driver of these studies is the investigation of the modulation of the fMRI signal by an administered drug or placebo, the effect size of this intervention is of most interest. The repeatability and/or reliability indices, such as ICC, inform within-subject variability for within-subject designs (e.g., crossover). There are usually no predefined criteria for minimum ICC or other measures of repeatability because, even in the case of high within-subject variability, the effect might be strong (or overwhelming) and fMRI biomarkers might be informative. Conversely, high reliability does not necessarily guarantee sensitivity to the intervention, without which the biomarker would be useless for its particular purpose. However, when the estimate of ICC is very low (e.g., <0.3), additional investigation is warranted to determine the causes of the low reliability of the experiment.
When repeatability is reported, it is incompletely characterized-ICC belongs to a class of scaled indices of reliability, defined as a ratio of variance components derived from a mixed effect model (in the simplest case, a random effects analysis of variance, ANOVA). Thus, ICC simultaneously reflects within- and between-subject variability. Therefore, when ICC is used for comparisons concerning agreement, there is an implicit assumption of equal between-subject variability. Carrasco et al. [175] pointed out that, for the evaluation of reliability, within-subject variability is essential. Given that ICC incorporates both within- and between-subject variability, it reflects not only agreement, but also distinguishability between subjects. As a consequence, although ICC is by far the predominantly reported measure of repeatability, it can provide different answers about concordance than can indices based solely on within-subject variability. Therefore, authors are recommended [175] to report, in addition to ICC, agreement indices that reflect only within-subject variability, such as within-subject standard deviation, total deviation index, repeatability coefficient, or within-subject coefficient of variation. Importantly, reporting within-subject standard deviation (with corresponding 95% confidence limits) obtained from a test-retest study for a particular ROI can be used to inform future studies based on within-subject designs [76]. When reporting the ICC, it is desirable to refer to a particular underlying statistical model, such as ANOVA, and the type of ICC which results from it. For example, one-way ANOVA results in the ICC referred to as ICC 1.1 in Ref. [170], whereas two-way ANOVA results in ICC 2.1 [170].
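To illustrate how the ICC variants and the within-subject indices relate to the same ANOVA mean squares, the following Python sketch computes a one-way and a two-way random-effects single-measure ICC together with the within-subject standard deviation, a repeatability coefficient, and a within-subject coefficient of variation. The function name `reliability_indices` and the toy data are assumptions for illustration. The formulas are the standard mean-square expressions for these two ICC forms and may differ in detail from the exact definitions used in Refs. [170,175]. Confidence limits are not computed here.

```python
import numpy as np

def reliability_indices(data):
    """Test-retest indices from an (n_subjects, k_sessions) array of ROI values."""
    x = np.asarray(data, dtype=float)
    n, k = x.shape
    grand = x.mean()
    subj_means = x.mean(axis=1)
    sess_means = x.mean(axis=0)

    # Sums of squares for a two-way layout (subjects x sessions).
    ss_total = ((x - grand) ** 2).sum()
    ss_subj = k * ((subj_means - grand) ** 2).sum()
    ss_sess = n * ((sess_means - grand) ** 2).sum()
    ss_error = ss_total - ss_subj - ss_sess

    ms_subj = ss_subj / (n - 1)                       # between-subject mean square
    ms_sess = ss_sess / (k - 1)                       # between-session mean square
    ms_error = ss_error / ((n - 1) * (k - 1))         # two-way residual mean square
    ms_within = (ss_sess + ss_error) / (n * (k - 1))  # one-way within-subject mean square

    # One-way random-effects, single-measure ICC.
    icc_oneway = (ms_subj - ms_within) / (ms_subj + (k - 1) * ms_within)
    # Two-way random-effects, absolute-agreement, single-measure ICC.
    icc_twoway = (ms_subj - ms_error) / (
        ms_subj + (k - 1) * ms_error + k * (ms_sess - ms_error) / n
    )

    sw = float(np.sqrt(ms_within))  # within-subject standard deviation
    return {
        "icc_oneway": float(icc_oneway),
        "icc_twoway": float(icc_twoway),
        "within_subject_sd": sw,
        # Difference expected to be exceeded by 5% of repeat pairs (normality assumed).
        "repeatability_coefficient": 1.96 * np.sqrt(2.0) * sw,
        # Only meaningful for strictly positive, ratio-scale measures.
        "within_subject_cv": sw / float(grand),
    }

# Toy data (assumption): 6 subjects scanned in 4 sessions.
toy = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
       [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
indices = reliability_indices(toy)
```

In this toy data set, large systematic session effects inflate the within-subject mean square, so the one-way ICC (about 0.17) is lower than the two-way ICC (about 0.29). This is one reason the underlying model should be stated whenever an ICC is reported.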

It is unclear how to optimally utilize simultaneous multislice acquisitions-New developments in MRI pulse sequences and gradient hardware now enable higher-resolution fMRI, most notably through simultaneous multislice (SMS) or 'multiband' and multiplexed excitations [176]. There is limited information on the precise benefits conveyed by these techniques in vivo. For instance, when SMS is used to achieve greater temporal resolution, that is, a shorter repetition time (TR) (e.g., [177]), it will add degrees of freedom to model-based analysis and could thereby improve statistical power. Higher temporal resolution can also enable better characterization of cardiac and respiratory effects in the data [178,179]. However, the Ernst angles at the shorter TRs mean that smaller flip angles are used. Although this reduces SAR, it also reduces the signal-to-noise ratio of the individual images. When SMS is used to achieve greater spatial resolution (e.g., [180]), it can improve sensitivity by reducing partial volume or susceptibility effects. Alternatively, SMS can be used in combination with multiecho sequences, for example to maintain a 3-mm isotropic/2 s TR spatiotemporal resolution with three or more echo times [177,181]. The peer-reviewed literature thus far lacks a clear empirical evaluation of most of these tradeoffs in vivo.
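The flip-angle tradeoff mentioned above can be made concrete with the standard Ernst-angle relation, cos(theta_E) = exp(-TR/T1), and the steady-state signal equation for an ideally spoiled gradient-echo acquisition. The sketch below is illustrative only: the gray-matter T1 of 1.4 s and the two TR values are assumed for the example, T2* decay and parallel-imaging noise penalties are ignored, and the function names are hypothetical.

```python
import math

def ernst_angle_deg(tr_s, t1_s):
    """Flip angle (degrees) that maximizes steady-state signal for a given TR and T1."""
    return math.degrees(math.acos(math.exp(-tr_s / t1_s)))

def spoiled_gre_signal(flip_deg, tr_s, t1_s):
    """Relative steady-state signal per excitation (M0 = 1, T2* decay ignored)."""
    e1 = math.exp(-tr_s / t1_s)
    theta = math.radians(flip_deg)
    return math.sin(theta) * (1.0 - e1) / (1.0 - e1 * math.cos(theta))

T1_GRAY_MATTER_S = 1.4  # assumed value for illustration only

angle_long_tr = ernst_angle_deg(2.0, T1_GRAY_MATTER_S)   # conventional TR
angle_short_tr = ernst_angle_deg(0.7, T1_GRAY_MATTER_S)  # SMS-accelerated TR
signal_long_tr = spoiled_gre_signal(angle_long_tr, 2.0, T1_GRAY_MATTER_S)
signal_short_tr = spoiled_gre_signal(angle_short_tr, 0.7, T1_GRAY_MATTER_S)
```

Under these assumptions, the shorter TR has both a smaller optimal flip angle and a lower signal per image. Whether the larger number of images per unit time compensates for this is the kind of tradeoff that still needs empirical evaluation in vivo.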
It is unclear what role low-level sensorimotor paradigms should have in conditions that are not primary sensorimotor deficits-Low-level sensorimotor tasks can have an important role in clinical trials, even for clinical conditions that are not characterized by primary sensory or motor deficits. For example, in diseases of cerebrovascular dysfunction, analysis of the characteristics of the hemodynamic response to a simple sensorimotor task can provide valuable information. This approach takes advantage of the relatively high reproducibility of the BOLD signals generated by sensorimotor tasks [182], as well as the relative ease with which subjects are able to comply with the task. Such paradigms have been used to explore effects of an analgesic on brain activity during visual stimulation [183]. In addition, a disease that is believed to be associated with regional modulation of cerebrovascular function can be matched with a sensorimotor task that reliably activates the same region, enabling localized, non-invasive assessment of vascular function. One example of this application of sensorimotor fMRI was reported in subjects with cerebral amyloid angiopathy (CAA) [184]. CAA is characterized by deposition of amyloid in the walls of leptomeningeal and cerebrocortical vessels [185] and has a predilection for the occipital lobe [186]. A visual stimulus and transcranial Doppler ultrasound had been used previously in patients with CAA to demonstrate a reduced reactivity in the posterior cerebral artery [187]. Dumas et al. used a similar flashing checkerboard stimulus to induce a robust BOLD response in the occipital lobe. They demonstrated that patients with CAA, compared with healthy age-matched controls, exhibited both a reduced BOLD amplitude and an increased time to peak BOLD response in the visual cortex, consistent with the impaired cerebrovascular reactivity shown both in patients with CAA [187] and in animal models of CAA [188].
Taken together, the evidence supporting this functional imaging biomarker as an indication of underlying cerebrovascular pathology of this sort is strong. However, the full breadth of application of low-level sensorimotor paradigms to neurological diseases has not yet been fully explored.
The value added by 7T is unclear-Since the inception of fMRI, there has been a steady push to acquire human data at ever-higher field strengths because of the significant theoretical increases in sensitivity that higher field strengths afford [189][190][191]. These benefits can facilitate higher spatial and/or temporal resolution than is achievable with lower-field-strength magnets [191][192][193]. Although the bulk of early human studies were performed at 1.5T, over the past 10 years, 3T studies have been the norm, and large, multisite clinical trials performed entirely at 3T sites are now feasible. Scanners with 7T field strength are now available at dozens of early-adopting research sites, and there are aspirations to roll out 9.4T scanners as a next step [194][195][196]. However, the practical real-world advantages of 7T over 3T, in a realistic clinical trial setting, have not yet been determined. In particular, higher magnetic field strength increases artifact levels, physiologic noise scales with field strength, and effects depend on voxel size [197]; it is not clear whether these downsides (in addition to the high cost of 7T devices [192]) outweigh the empirical increases in sensitivity achieved by the higher field strength. To date, one study compared 3T with 7T in task-related fMRI, revealing robust activations in the bilateral medial temporal lobe during associative memory encoding at both field strengths but significantly stronger memory-related hippocampal activation at 7T, suggesting higher empirical sensitivity at 7T [198]. However, comparative studies that would inform decisions about eventual movement toward clinical trials at 7T are in their infancy.
fMRI time-series are rarely shown in clinical studies-In preclinical experiments, the fMRI time-series from pertinent ROIs are routinely displayed [199]. This is also desirable for clinical fMRI studies because it enhances confidence in the measured signal. Recent examples of clinical studies where the time-series have been displayed include [151,200]. In the study by de Simoni et al. [151], strong repeatable phMRI signals induced by ketamine were clearly demonstrated. In applications of fMRI of pain, exploratory analysis of temporal shapes of the noxious versus innocuous stimuli provided insights into improved modeling of the fMRI signal [73,200]. However, although showing time-series used to be common (e.g., [201]), clinical fMRI time-series are now rarely shown in publications, possibly because the lower field strengths used in clinical studies give rise to noisier time-series than their higher-field preclinical counterparts. Nonetheless, showing such time-series would strengthen confidence in clinical fMRI study results.

Open biological questions
What is the set of molecular drivers of fMRI signal changes?-PhMRI studies often ascribe treatment-related BOLD signal changes to changes in the functioning of neurons. However, such inferences must necessarily be physiologically agnostic because local changes in the BOLD signal are determined by at least four physiological effects, namely changes in the cerebral blood volume (CBV), cerebral blood flow (CBF), the cerebral metabolic rate of oxygen (CMRO2), as well as glucose metabolism [202]. Past research shows an undeniable association between cerebral hemodynamics and activity of neuronal populations, although exact relationships are still a matter of debate [203][204][205][206][207]. There is a consensus that graded increases in neuronal metabolism result in graded increases in the BOLD response, whereas corresponding decreases in neuronal metabolism result in negative BOLD responses [208][209][210][211]. However, a precise understanding of the relative contributions of metabolic, vascular, and neural processes to the BOLD signal is still under development. For some pharmacological treatments, such a lack of physiological specificity could be a major limitation to the ability to make development decisions based on phMRI results. For example, investigators might have a priori hypotheses about which of the physiological changes are plausible for the treatment to elicit, based on earlier preclinical experiments; measuring each physiological effect separately could provide an important indicator of agreement with prior experiments. In addition, effects on specific physiological parameters might be seen as advantageous by developers. Emerging approaches could help to identify distinct physiological changes induced by the treatment.
So-called 'calibrated' BOLD, a T2*-based method, collects BOLD data during a controlled CO2 challenge to 'calibrate' the BOLD signal; together with simultaneous BOLD-ASL imaging sequences, this allows simultaneous estimation of CBF and CMRO2 [212][213][214][215]. Another method, a T2* and T2-based method known as 'quantitative BOLD' or 'qBOLD' [216], provides similar information but from a single scan that requires no hypercapnia challenge. T2-based methods, such as TRUST [217] and QUIXOTIC [218], are also promising [219]. The collection of BOLD and fluorodeoxyglucose (FDG) PET data simultaneously, with FDG PET providing the measurement of tissue metabolic properties, is also emerging [220]. Although these methods are promising, no systematic study has evaluated which ones add significant value to specific clinical trial designs or classes of compound.
In addition to fundamental questions about the biological drivers of the BOLD signal, the more-practical question of how to best remove effects of systemic physiological processes (including respiration and cardiac function) from the BOLD signal, and even whether to do so at all when assessing drug effects, remains understudied [221,222]. In addition, any extrinsic action that might change neuronal activity, blood volume, or perfusion is a potential confound of drug effects on fMRI. These include use of concomitant psychoactive drugs or other drugs that affect perfusion, as well as any activities that might affect arousal, cognitive state, and blood flow, such as caffeine intake, sleep, and exercise [223]. However, it is unclear how aggressively each of these factors should be controlled to minimize confounding effects on fMRI data from clinical trials.
What is the full range of homologies between rodent and human fMRI findings?-As described above, similarities in pharmacological modulation of fMRI readouts have been observed between rodents and humans in select clinical domains and treatments. Unfortunately, the full set of such similarities is not known, and is difficult to predict because of the many structural differences between the rodent and human brain [224,225]. Cortical areas are substantially more differentiated and occupy a greater fraction of the total brain volume in humans, and structural homologies for some human brain regions are only beginning to be identified in rats [226,227]. Moreover, it is known that certain human therapeutic targets (e.g., NK1, muscarinic acetylcholine receptor M4) have different characteristics in rodent species [228,229]. A thorough and systematic characterization of similarities and differences in pharmacological effects in rodents and humans across a range of treatments would be critical to delineate the extent to which responses generalize across species and which fMRI readouts are most sensitive.
How can compounds with slow PK be assessed?-The phMRI technique centers on the detection of a direct, pharmacologically elicited hemodynamic response in the brain. Most often, a test agent is administered partway through fMRI scanning, allowing the temporal response to the agent to be quantified (e.g., [50,53,230-233]). This approach works well for compounds whose PK and route of administration (e.g., intravenous) lead to a rapid, easily detectable signal change. However, neuroscience compounds entering clinical drug development are most frequently formulated for oral administration and/or have extended PK profiles. For such compounds, the length of the required phMRI scan would be impractical.
Using perfusion MRI to measure CBF has been proposed as a solution to this problem [234]. Unlike traditional phMRI, each perfusion scan yields quantified maps of CBF in absolute physiological units, thus allowing CBF measures to be compared between scans taken at different hours or on different days [235]. Some studies have demonstrated that this approach can detect direct effects of orally administered pharmaceuticals on the brain [236][237][238][239][240][241][242][243] and can yield overlapping, yet slightly different, spatial patterns compared with BOLD fMRI [244,245]. However, the sensitivity of this contrast mechanism might be lower than that of BOLD [223]. Moreover, there is some evidence that a same-day baseline scan could be required to increase sensitivity to drug effects, compared with simply comparing CBF maps from the same subject on different days (under different treatment conditions) [241]. The relative ease of acquisition of a resting perfusion scan is a strong practical advantage. However, currently, perfusion scans are not widely collected in a clinical trial setting, and requisite high-density head coils might not be available at all clinical trial sites. Whether the potential advantages of perfusion MRI over traditional fMRI outweigh its costs and disadvantages in a clinical trial setting remains an open question.
Another approach to dealing with slow PK is to assess differences in inter-regional functional connectivity before and after pharmacological challenge (e.g., [236][237][238]). Similar to perfusion MRI, this approach does not require a continuous time-series of imaging data to be acquired that includes both pre- and post-challenge data. However, the readouts from functional connectivity analysis are more complex than in traditional phMRI, making interpretation of results more challenging.
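One simple way to express such a pre- versus post-challenge comparison is as a difference of Fisher z-transformed correlation matrices computed from separate scans. The sketch below uses simulated ROI time-series; the function name, the three-region layout, and the simulated loss of coupling after the challenge are all assumptions made for illustration. It does not reproduce the analysis of any cited study, and it omits the preprocessing and statistical inference a real analysis would need.

```python
import numpy as np

def fisher_z_connectivity(roi_timeseries):
    """Fisher z-transformed correlation matrix from a (timepoints, regions) array."""
    ts = np.asarray(roi_timeseries, dtype=float)
    r = np.corrcoef(ts, rowvar=False)    # columns are regions
    np.fill_diagonal(r, 0.0)             # self-correlations carry no information
    r = np.clip(r, -0.999999, 0.999999)  # keep arctanh finite
    return np.arctanh(r)

# Simulated data (assumption): regions 0 and 1 share a signal before the
# challenge and are uncoupled afterwards; region 2 is independent throughout.
rng = np.random.default_rng(0)
n_timepoints = 200
shared = rng.standard_normal(n_timepoints)
pre = np.column_stack([
    shared + 0.5 * rng.standard_normal(n_timepoints),
    shared + 0.5 * rng.standard_normal(n_timepoints),
    rng.standard_normal(n_timepoints),
])
post = rng.standard_normal((n_timepoints, 3))

# Negative entries indicate connections that weakened after the challenge.
delta_z = fisher_z_connectivity(post) - fisher_z_connectivity(pre)
```

Even in this three-region toy case, the readout is a matrix rather than a single number, which illustrates why connectivity results are harder to interpret and to prespecify as endpoints than a conventional phMRI amplitude.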
Agonist-antagonist fMRI designs are yet another approach. In these designs, a 'probe' compound (e.g., ketamine) is administered intravenously and elicits a strong phMRI response [53,54]. The compounds of interest are administered beforehand, such that the phMRI scan coincides with the time of high exposure to the test article, allowing any modulatory effect on the probe signal (e.g., reversal) to be examined.
How can fMRI be used to inform dosing?-fMRI has relatively underexplored potential in early-phase drug discovery for informing dosing regimens. In principle, a consistent relationship between dose and the BOLD response could enable predictions about the expected brain response at higher or lower doses. However, there are few examples of the use of this kind of pharmacometric fMRI modeling in the literature. For example, it is not clear whether it is more effective to model fMRI signals as a nonlinear function of plasma concentration of a compound (a pharmacokinetic approach; e.g., [246]) or as a mediator of the association between plasma concentration and PD effects [246][247][248]. Future work should evaluate whether important dose-related fMRI effects are present not only in the magnitude of the fMRI response, but also in its temporal derivatives and their dependency on spatial location, as suggested by early work on biphasic responses to dopamine and GABA manipulations [249][250][251]. Addressing such complexity with computationally tractable models is a continuing challenge [252].
How to bridge the fMRI 'inference gap' is unclear-Even if a plausible relationship between treatment dose and BOLD response can be obtained, a continuing challenge is the 'inference gap': the uncertain relationship between the magnitude of a pharmacological effect seen in fMRI and probable clinical efficacy. In comparison, RO PET imaging has matured to such a degree that goal occupancy ranges for some targets and molecular modes of action are established or can be estimated from animal studies. For example, striatal D2 dopamine RO of at least ~60% is associated with effective antipsychotic activity but occupancy greater than ~80% elicits adverse effects [32]. This goal range would allow a company to confidently test a clinical hypothesis about the benefit of a treatment. The classic example of this is the neurokinin 1 (NK1) antagonist class of compounds, which turned out to be clinically ineffective in affective and pain indications, despite using doses that were known, based on RO-PET studies, to yield near 100% occupancy [123]. Without the occupancy data, it would not be possible to be sure whether the doses chosen were high enough to fully engage the target. The ability to plausibly bridge this inference gap would enhance the value of fMRI for early-phase drug development. The published literature contains numerous reports of significant pharmacological effects in fMRI studies for a range of marketed compounds, at a single clinically effective dose, and studies with more than one dose are becoming more common [33,253]. Combining RO PET with fMRI can help link target engagement (RO-PET) and PD effect (fMRI) explicitly [254]. However, the ability to interpret the dose-response curve and relate it to probable clinical efficacy, especially for novel therapeutic targets, remains a challenge.
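To show why an established goal occupancy range makes a dosing hypothesis directly testable, the following sketch converts the ~60-80% D2 occupancy window mentioned above into a target plasma concentration window. It assumes a simple single-site saturable binding model, occupancy = C / (C + EC50), and a purely hypothetical EC50 of 10 ng/mL; neither the model choice nor the EC50 value comes from the source. No comparably accepted mapping from fMRI effect size to clinical efficacy currently exists, which is the inference gap.

```python
def occupancy_pct(conc, ec50):
    """Receptor occupancy (%) under an assumed single-site saturable binding model."""
    return 100.0 * conc / (conc + ec50)

def conc_for_occupancy(target_pct, ec50):
    """Plasma concentration that yields the target occupancy under the same model."""
    if not 0.0 <= target_pct < 100.0:
        raise ValueError("target occupancy must be in [0, 100)")
    return ec50 * target_pct / (100.0 - target_pct)

EC50_NG_PER_ML = 10.0  # hypothetical value for illustration only

# Concentration window corresponding to the ~60-80% occupancy goal range.
lower_conc = conc_for_occupancy(60.0, EC50_NG_PER_ML)
upper_conc = conc_for_occupancy(80.0, EC50_NG_PER_ML)
```

With the assumed EC50, the goal range corresponds to concentrations between 1.5 and 4 times the EC50, so a dose can be declared adequate or inadequate before any efficacy data are available.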
What role should fMRI have in subject selection, stratification, and enrichment?-Most uses of fMRI in Phase 3 clinical trials have centered on characterizing treatment effects. However, fMRI has the potential to identify subjects that should be enrolled in one arm of a trial or another. An example of this type of approach comes from obesity research, in which fMRI can identify obese individuals whose brains are hyper-responsive versus weakly responsive to images of food cues; the hyper-responsive individuals go on to respond more poorly to weight-loss treatment, presumably because of a poorer ability to maintain low food intake [255]. In clinical trials with weight loss as a primary outcome, the hyperactivators might be at higher risk of nonresponse and could therefore be stratified to a higher dose regimen. Alternatively, clinical trials with fMRI change as a primary outcome might enrich the sample for such hyperactivators, because these are the individuals most able to show a response to treatment: their fMRI activation is the farthest from normal and has the most room to normalize. Whether fMRI should have such a subject-selection role in clinical trials is currently understudied. To determine whether fMRI should have this role in any disease setting, important questions need to be addressed, including how to reduce fMRI signals down to a single number used to make subject selection decisions, and how well the fMRI characteristics predict response to treatment.

Proposed activities
Previous sections have made it clear that major obstacles must be overcome to enable the broader use of fMRI in clinical trials. For several of these obstacles, specific research and development projects that could help to overcome them are obvious. In addition, however, community-wide activities could further enhance the utility of fMRI in the drug development process. Each of these activities is centered on the ultimate goal of providing fMRI tools that are sensitive to drug-induced change, relative to repeatability; valid with respect to established clinical endpoints; and standardized across measurement platforms.

Form public-private partnerships to fund enhancement of novel technologies and enable replication and validation
There is a general consensus that the following activities would enhance the utilization of fMRI in the drug development process: soliciting and reanalyzing data from trials with null findings; replicating findings from prior trials using differing treatments or imaging methods; enhancing the usability of research-grade fMRI processing software to make it easily adopted by the research community; and publishing well-designed and executed data sets to serve as publicly available gold standards for the validation of novel imaging methods. However, it is difficult to obtain funding for any of these activities, from either funding agencies or industry sponsors. For example, despite isolated funding programs to enhance the usability of already-developed neuroimaging software [e.g., the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) program, www.nitrc.org], the US National Institutes of Health (NIH) overwhelmingly focuses on providing funds for the development of novel neuroimaging techniques rather than for the maintenance and further enhancement of successful ones. Meanwhile, individual drug companies are primarily focused on the development of their own treatments rather than of specific research tools (such as fMRI) that are used as part of the development pathway.
Given that no specific entity makes a systematic effort to fund these activities, promising new fMRI techniques 'die on the vine,' novel studies recreate the mistakes or null findings of unpublished prior studies, and the viability of treatments remains unclear because of a lack of validation. Funding for such activities likely requires new public-private partnerships, including entities that jointly recognize that such activities have the potential to enhance the utility of fMRI in all clinical trials, including those sponsored by industry and funding agencies. Software partnerships could follow the NITRC model, funding software developers to make emerging neuroimaging technologies available on additional computing platforms and able to interface with additional imaging data types and software systems. Such partnerships would need effective means to disseminate the resulting software and track the success of dissemination. Replication and validation partnerships could focus on identifying the most-promising fMRI methods and findings, and on funding their replication using complementary measurement techniques or model systems. The end result of these partnerships would be a broader set of validated software tools and greater knowledge of the generalizability of findings from one treatment to another.

Develop infrastructure for sharing clinical trial data without exposing sponsors or CROs to significant risks
The drug industry invests major resources in clinical trials that include fMRI, and there are numerous ways in which additional value could be extracted from that investment after primary analyses are completed. Subsequent analyses could be used to power novel studies, to understand the findings of similarly conducted studies, and to evaluate novel fMRI methods in general. However, any trial sponsor that makes such fMRI data public faces significant risks. A biased party could reanalyze the data to support a spurious claim about the trial. Dissemination of participant data also poses risks for confidentiality. Thus, infrastructure is needed that enables as much clinical trial data sharing as possible while limiting these risks. Such data sharing could begin with FDA and EMA policies that require that trial operators submit all collected fMRI data to these regulatory bodies or an intermediary as a precondition to registering the trial. The data could be deidentified centrally, and various data characteristics could be provided only in the most-general terms to avoid identification of the participant and trial. Regulatory bodies could, as a starting point, only release fMRI data from placebo arms to accelerate testing of methods for assessing longitudinal fMRI change. Importantly, informed consent forms would need to ensure that participants understand the implications of their consent to such data sharing.

Establish an ongoing, regular conference on fMRI in clinical trials
The previous sections should make it clear that there are significant questions about the role fMRI can and should have in clinical trials, and a major need for ongoing research. However, there is currently no conference venue designed for investigators to exchange information about the advancing state of the art in this area, and for public-private partnerships aimed at overcoming structural issues to develop. None of the current conferences in neuropharmacology, neuroimaging, and neuroscience have the critical mass of focus on this topic to incorporate sessions on it in a well-reasoned way. A 1-day add-on meeting on fMRI in clinical trials attached to one of the major ongoing conferences, or a stand-alone meeting, would be beneficial. We hope that the current exposition makes clear the interest that such a meeting would provoke from industry and academia, and its high potential for self-sustaining growth.

Legitimize and strengthen the publishing of negative results
Functional MRI studies are time-consuming, labor-intensive, and expensive. Negative results in these studies are problematic because the study usually goes unpublished and the data then languish. Negative results sometimes arise because the study was poorly executed, but more often because of a lack of power, an incorrect assumed statistical model, or some other problem [256]. As a result of this 'file drawer' problem, it is difficult, if not impossible, to know how many other investigators have attempted to test the same hypothesis and found no result. Before funding an fMRI research study to test a hypothesis, it would be good to know whether unsuccessful studies have already been done on the subject or, more likely, whether imaging data exist that could be reanalyzed to answer the question or refocus the study design.
There are several solutions to the file drawer problem. The first is the publicly or collaboratively available neuroimaging data repository. This was the topic of a special issue of NeuroImage in January 2016, with a follow-up issue [257] covering several dozen repositories from the USA and Europe. Several of these repositories are open to new studies and data sets, providing a place both to look for relevant data and to share a study that provided no publishable results. Within the USA, there are also the NIH-supported databases, with Research Domain Criteria (RDoC) being the primary repository for National Institute of Mental Health (NIMH)-supported research study data (https://data-archive.nimh.nih.gov/rdocdb/).
Although public repositories address access to relevant unpublished data, they do not address the risk of nonpublishability, relative to the high cost of doing the fMRI study. The neuroimaging community has realized that the data themselves are worth publishing if they were collected well. The Nature Publishing Group supports an open-access journal Scientific Data (www.nature.com/sdata/) for 'descriptions of scientifically valuable data sets'. Thus, a clean, well-executed study whose data are being made available can also aim for a Nature publication just on the data, regardless of the study results. It will be vital that such data-set publications include enough detail on equipment, subject characteristics, and data acquisition parameters for other scientists to use the data to make decisions about the designs of their own studies.
Beyond disseminating and publishing the data, publishing negative results could be valuable for furthering the qualification of fMRI in specific contexts. However, a minimum level of rigor should be applied to the presentation of negative results to enhance the confidence of the community that the report does not constitute a 'false negative'. To that end, the following should be observed when presenting negative results: (i) to strengthen the case for publishing the negative results, the paper should present data and arguments that demonstrate that the negative results are not the result of (a) poor study design; (b) lack of standardization and/or harmonization of the data acquisition protocol; (c) poor protocol compliance; (d) poor data quality; or (e) unique or nonstandard analyses; (ii) ideally, negative results should be published with public access to the original data sets; (iii) the discussion section of the publication should highlight possible reasons for the negative findings; and (iv) the use of peer-reviewed preregistered reports (e.g., at Cortex, Drug and Alcohol Dependence; https://osf.io/8mpji/wiki/home/) simultaneously with registration at ClinicalTrials.gov is encouraged.
Once negative-result publishing venues become more mature, funding agencies should build on their recent record of demanding more data sharing from their grantees; all grantees should be required to attempt the publication of negative results in the event that they meet the quality standards outlined above.

Conclusion
Here, we have provided an overview of the current state of knowledge regarding fMRI as a tool for clinical trials. A growing body of work has crystallized a set of known performance criteria that the technique must meet to become useful in a clinical trial setting, as well as a set of known limitations of the technique and a set of useful capabilities it is thought to have. In addition, there is a substantial set of open questions regarding how fMRI experiments should be performed, what inferences should be drawn from fMRI data, and how fMRI should be deployed in different trial scenarios. In response to the known limitations and the open questions, a set of activities is prescribed that has the potential to push the boundaries and make fMRI a more effective tool in a wider range of trials.

Biography
Owen Carmichael is an associate professor at Pennington Biomedical and directs its Biomedical Imaging Center. His research interest is in developing new biomedical imaging techniques and applying them to brain aging, Alzheimer's disease, and metabolic disorders.