Introduction

Schizophrenia mainly manifests in late adolescence and adulthood and continues to affect young people life-long. The outcomes of schizophrenia are heterogeneous, and nearly half of the patients with first-episode psychosis (FEP) achieve symptomatic remission and functional recovery1,2. A meta-analysis showed that female sex, higher education, working history, shorter duration of untreated psychosis, milder positive and negative symptoms, and premorbid adjustment, are predictors of a better prognosis in FEP3. For individuals at ultra-high risk for psychosis (UHR), which is defined as an earlier clinical stage of the psychosis spectrum with ~25% of the individuals transitioning to psychosis in 2 years4. a number of studies have attempted to identify the risk factors for later transition to psychosis (UHR-P)5. Furthermore, the predictive factors for clinical outcomes in UHR have become the focus since nearly half of the individuals with UHR without transition to psychosis (UHR-NP) still have a poor prognosis6,7. Similar to FEP prognosis studies, female sex, older age, milder subthreshold symptoms, greater cognitive function, and better global function are the predictors for better functional outcome8.

To understand the symptomatic and functional outcomes in UHR and FEP more clearly, a number of studies have investigated the relationship between the prognosis and psychobiological characteristics using neuropsychological, neurophysiological, and neuroimaging measurements2,9,10,11,12,13. Previous magnetic resonance imaging (MRI) studies in FEP have mostly attempted to reveal the relationships between brain characteristics and treatment response in the short term10,11, Gyrification pattern studies in FEP have shown that non-responders to antipsychotic administration at 12 weeks had hypo-gyrification in the left superior and middle frontal gyrus and bilateral insula14, and reduced gyrification-based connectivity compared with the responders15. For UHR, although many MRI studies have focused on the difference between UHR-P and UHR-NP9,12,13, a structural MRI study with a 6-year follow-up showed that people with poor functional outcomes had smaller volume and cortical thickness in the right caudal middle frontal, orbitofrontal, frontal pole, and triangular part of the inferior frontal gyrus (IFG), and the left precuneus and rostral middle frontal gyrus, and smaller subcortical volume in the corpus callosum and nucleus accumbens, compared with those without poor outcomes16. A machine learning classification study with a 4-year follow-up revealed that people with poor functional outcomes had reduced cortical surface area in the left superior temporal gyrus and pericalcarine cortex, and the right opercular part of the IFG and cuneus17. These results suggest that brain alteration in the prefrontal and temporal cortex could be a factor for long-term prognosis in UHR and FEP.

Although previous studies have attempted to investigate the prognosis using one modality, very few studies have used multi-modal measurements. We recently reported that the deficits in verbal fluency mediated the relationship between the cortical surface area in the right banks of the superior temporal sulcus and 4-month negative symptoms in UHR18, suggesting that analyzing the relationship between the characteristics only from one modality and clinical outcomes could preclude an actual relationship. To the best of our knowledge, there has been no study that investigates the relationship of multi-modal measurements with long-term outcomes in UHR and FEP. A multi-modal investigation was proposed for a better understanding of clinical prognosis and the objective markers12,19,20.

The Integrative Neuroimaging Studies in Schizophrenia Targeting for Early Intervention and Prevention (IN-STEP) research project was designed as a prospective observational study to explore the pathophysiological features around the onset of psychosis and investigate the possible predictive biomarkers for clinical use since 200821. In this project, we obtained multi-modal psychobiological measurements including a cognitive battery, electrophysiological response during mismatch negativity (MMN), structural MRI, and brain activity using functional near-infrared spectroscopy (fNIRS) for UHR and FEP. For the correlational analyses using future prognosis, we previously reported the relationship of brain activity in the middle frontal gyrus with the later global function score in FEP22, duration MMN with later symptomatic remission and frequency MMN with later cognitive function in UHR23, and attention/processing speed in UHR-NP and executive function in FEP with 12-month global function24. Multisite MRI studies revealed that UHR-P showed higher gyrification in the left occipital lobe compared to UHR-NP25. We also reported that the use of a manual tracing method of the IFG subregions that reduced gray matter volume in the triangular part of the IFG was a disease-specific feature of the schizophrenia spectrum (UHR, FEP, and chronic schizophrenia), whereas reduced gray matter in the opercular part was a feature of autism spectrum disorder26,27,28. A multi-modal study showed that the volume in the triangular part of the IFG was associated with fNIRS brain activity in the IFG during a verbal fluency task29,30. Now, the data of all modalities at baseline and the clinical follow-up assessments for >2 years are available.

The present study aimed to explore which demographic, clinical, and psychobiological characteristics derived from multi-modal measurements are associated with symptomatic and functional outcomes at the 13-month and 28-month follow-ups in UHR and FEP (Tables 1 and 2). The hypothesis was that biological characteristics related to the prefrontal and temporal cortex would predict the future prognosis. To examine our hypothesis, we aimed to test the relationship between demographic and psychobiological characteristics and future clinical outcomes using a univariate regression model.

Table 1 Demographic and clinical characteristics in this study.
Table 2 List of demographic, clinical, and psychobiological characteristics in this study.

Results

Demographic characteristics

The UHR group had greater 13-month emotional factor scores assessed using the Positive and Negative Syndrome Scale (PANSS) compared to the FEP group (p = 0.013, Table 3). There was no difference in any other outcomes between the groups. Six individuals with UHR had transitioned to psychosis during the follow-up period; however, there was no difference in any demographic characteristic between UHR-P and UHR-NP, except for the diazepam equivalent dose (p = 0.014, 21 individuals with UHR-NP used benzodiazepines, but none of the individuals with UHR-P did).

Table 3 Clinical outcomes in the UHR and FEP groups.

The relationship between demographic, clinical, and psychobiological measurements and outcomes

In the UHR group, the cortical surface area in the left IFG-Orb was negatively associated with the 13-month PANSS disorganized factor score (UHR: B = −0.020, SE = 0.005, t = −4.11, p = 0.00054, Fig. 1a; FEP: B = −0.006, SE = 0.007, t = −0.78, p = 0.45). In the FEP group, the cortical surface area in the left insula was positively associated with the 28-month Social and Occupational Functioning Assessment Scale (SOFAS) score (UHR: B = −0.002, SE = 0.016, t = 0.14, p = 0.89; FEP: B = 0.050, SE = 0.010, t = 4.84, p = 0.00052, Fig. 1b).

Fig. 1: Relationship between psychobiological characteristics and clinical outcomes.
figure 1

a The relationship between 13-month disorganized symptom and the surface area in the left orbital part of the inferior frontal gyrus (IFG-Orb) in the UHR group. b The relationship between the 28-month Social and Occupational Functioning Assessment Scale (SOFAS) score and the surface area in the left insula in the FEP group.

Discussion

The present study investigated whether psychobiological characteristics from multi-modal measurements are associated with the symptomatic and functional outcomes at the 13- and 28-month follow-ups for UHR and FEP. The results showed that structural characteristics in the left IFG and insula cortical regions were associated with the symptomatic and functional outcomes. The cortical surface area in the left orbital part of the IFG demonstrated an association with disorganized symptoms in UHR, and in the left insula with global social function in FEP. However, no significant associations were observed for other psychobiological characteristics using a neurocognitive battery, MMN, and fNIRS with long-term outcomes. To the best of our knowledge, this is the first study to identify the relationship between structural characteristics in the left IFG and insula and long-term outcomes in UHR and FEP, using a wide range of demographic, clinical, and psychobiological measurements.

A smaller cortical surface area in the left orbital part of the IFG was associated with severe 13-month disorganized symptoms in the UHR group. We previously reported that individuals in the schizophrenia spectrum (UHR, FEP, and chronic schizophrenia) have more severely reduced gray matter volume in the triangular part of the IFG compared to that in the opercular part26,28. In addition, the volume reduction in the right triangular part was associated with severe positive symptoms in UHR26, FEP26, and chronic schizophrenia28. In contrast, individuals with autism spectrum disorders have greater gray matter volume reduction in the opercular part compared that in the triangular part, and the volume is also associated with social communication problems27. These studies were conducted using a manual tracing method, and were unable to visualize the orbital part because of technical and image quality reasons. The IFG, especially on the left side, were well-known as the motor speech center, and the orbital part is believed to contribute to the semantic processing of the language. Therefore, disorganized symptoms in UHR may be driven by the structural alteration in this part.

A smaller cortical surface area in the left insula was associated with the 28-month symptomatic and functional outcomes in the FEP group. A longitudinal volumetric MRI study showed that patients with FEP had smaller gray matter volume at baseline and progressive volume reduction in the left insula, and the reduction was correlated with positive and negative symptoms at the 2-year follow-up31. Patients with FEP without response to antipsychotic administration at 12 weeks had hypo-gyrification in the bilateral insula compared with the responders14. The insula plays a role in interoceptive function, which measures the body and emotional state of the self, and has a connection with a variety of other cortical and subcortical regions32. Therefore, the insula function may be related to longer-term outcomes in FEP.

The significant relationships with clinical outcomes were observed in the structural characteristics, which have been well discussed in the schizophrenia spectrum; however, the significant regions were different between the groups. Several reasons could be attributed to the clinical stages and progressive brain pathology. UHR is an earlier clinical stage and most of the participants in this study did non-transition to psychosis in the follow-up, while most of the patients with FEP were diagnosed with schizophrenia. Therefore, the UHR group could demonstrate a more heterogeneous effect on the relationship between the IFG cortical surface area and the 13-month disorganized symptoms. In the FEP group, the progressive brain pathology in the insular cortex could influence the relationship with the 28-month symptomatic and functional outcomes.

Although we previously reported the relationship between functional measurements using fNIRS22, MMN23, and the BACS18,24 and clinical outcomes, there was no correlation with longer outcomes in this study. The longer gap between biological measurements and outcomes could be more related to trait characteristics, such as structural brain images, and repeated functional measurements could be more useful to assess the present condition and short-range prediction22.

Several limitations should be mentioned. First, some correlational analyses had small sample sizes, especially for the structural MRI and outcomes in FEP, and future studies with larger sample sizes are warranted to confirm the findings. In this exploratory study, we demonstrated the candidate relationships between structural brain characteristics and future outcomes; we can subsequently need a confirmatory study in the multi-sites obtained on brain images and long-term clinical outcomes in UHR and FEP. Furthermore, future studies should use more sophisticated methods, such as machine learning16,33,34. Second, we were unable to obtain detailed symptom severity information, such as that provided by the PANSS, at the 28-month follow-up owing to the naturalistic clinical follow-up. This concern exists especially for young people since they experience more social milestones. In the future, naturalistic longitudinal studies should consider web-based interviews and evaluation methods. Third, we were unable to assess the genetic contribution to the results of this study. Biomarkers from blood collections, such as the polygenetic risk score35 may be more useful as a trait marker, and DNA methylation36,37, RNA-seq38, and blood metabolism37,39 may be more useful as state markers.

In conclusion, the present study showed the relationship between structural characteristics in the left IFG and insula, and symptomatic and functional outcomes at the 13- and 28-month follow-ups in UHR and FEP. These regions manifest well-known brain pathology in schizophrenia and their alteration in the early course of the illness could affect long-term outcomes. Future replication studies in multi-sites are warranted for the application of new therapeutic targets and clinical biomarkers for a better prognosis, and for revealing the pathological mechanisms of structural alteration in these regions.

Methods

Participants

In the IN-STEP project, we registered 53 individuals with UHR and 37 patients with FEP. Of these, 38 (71.7%) individuals with UHR and 29 (78.4%) patients with FEP who were measured using one or more modalities within 90 days of registration (including cognitive battery, MMN, structural MRI, and fNIRS brain activity) and were assessed at 12-month and/or 28-month follow-ups were included (Table 1). For the remaining 15 individuals with UHR and 8 patients with FEP, no modality data and/or no clinical outcomes were obtained; thus, they were excluded from this study. There was no difference in the demographic characteristics between those included and excluded in this study (Table 4). The availability of demographic and clinical characteristics, psychobiological modalities following preprocessing, and clinical outcomes are shown in Table 5.

Table 4 Demographic characteristics between those included and excluded in this study.
Table 5 Number of data available in this study.

The participants were recruited from the outpatient and inpatient units of the University of Tokyo Hospital, University of Tokyo Health Service Center, psychiatry clinics, and internet referrals21. The inclusion criteria were age 15–40 years for FEP and age 15–30 years for UHR, no antipsychotic medications for psychosis for >16 cumulative weeks, and continuous psychotic symptoms within the past 60 months. All the eligible participants were assessed using the Structured Interview for Prodromal Symptoms (SIPS)40,41 by expert psychiatrists, and evaluated using the UHR or psychosis criteria (Supplementary materials). The onset of psychosis, UHR-P condition, was defined according to the SIPS criteria during the 18-month follow-up period, else defined as UHR-NP23. Psychosis in the SIPS criteria is the same as psychotic disorders in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition42. All the diagnostic assessments and clinical severity assessments were checked in the routine meetings in the project, and any concern and inconsistent ratings were discussed and rated by the project psychiatrists and psychologists.

Exclusion criteria were as follows: (1) previous and/or present severe brain injury and/or neurological illness, (2) previous history of electroconvulsive therapy, (3) a premorbid IQ of 70 or less using the 25-item version of the Japanese Adult Reading Test43,44, (4) previous and/or present alcohol addiction, (5) previous and/or present continuous substance use, and (6) clear comorbidity with autism spectrum disorders according to the DSM-IV criteria. The detailed inclusion and exclusion criteria of this study are described in the protocol paper21.

This study was approved by the ethics committee of the Faculty of Medicine, University of Tokyo (Approval Nos. 397, 629, 630, and 2226), and all the participants, and their caregivers if the participants were under 20 years of age, provided written informed consent to participation in the project and the required measurements following a complete explanation of the experiment.

Demographic and clinical assessment

Demographic, clinical, and psychobiological characteristics used in this study are listed in Table 2. Handedness was evaluated using the Edinburgh Handedness Inventory45. Self and parental socioeconomic status were assessed using the Hollingshead scale46. For the FEP group, we defined duration of untreated psychosis as the time period between the onset of the first psychotic symptoms and the initiation of antipsychotic treatment, and this information was obtained from a detailed review of the clinical records or from interviews with patients and their family members using the Nottingham onset schedule47,48.

Clinical assessments at baseline were obtained using the Global Assessment of Functioning (GAF)49,50 for global symptoms and functions, and the PANSS51 for symptom severity by the expert psychiatrists and psychologists in the project. The PANSS scores were categorized into five categories: positive, negative, disorganized, excitement, and emotional symptoms52. If the patients were taking any antipsychotic, antiparkinsonian, anxiolytic, and/or antidepressant agents, we calculated the chlorpromazine, biperiden, diazepam, and imipramine equivalent doses, respectively53.

The participants reported their subjective depressive symptoms using the Center for Epidemiologic Studies Depression Scale (CES-D)54. The CES-D comprises 20 items with a 4-point Likert scale (0 [no symptoms] to 3 [severe]), and the total score was used for subjective depressive symptoms (range 0 to 60). Subjective quality of life (QOL) was obtained using the 26-item brief version of the WHO Quality of Life Scale (WHOQOL-BREF)55,56. The WHOQOL-BREF comprises 26 items with a 5-point Likert scale (1 [poor] to 5 [good]). Five factors were calculated in an average score of corresponding items: physical domain, psychological domain, social relationships, environment, and general impression of QOL.

Psychobiological measurement

We used the Brief Assessment of Cognition in Schizophrenia Japanese version (BACS-J) for the cognitive battery, duration and frequency MMN, T1-weighted structural MRI, and brain activity in the frontotemporal cortical area during a verbal fluency task using fNIRS, for the psychobiological measurement.

The BACS-J57,58 measures six cognitive subdomains thought to be affected by schizophrenia: (i) list learning as verbal memory, (ii) digit sequencing task as working memory, (iii) token motor task as motor speed, (iv) category, and letter fluency as verbal fluency, (v) symbol coding as attention and processing speed, and (vi) the Tower of London task as executive function. Trained psychologists conducted the examinations. We used z scores for each category standardized by Japanese age-clustered participants57.

MMN is a negative component of electrophysiological response elicited by infrequent deviant stimuli occurring within a series of frequent standard stimuli, and it is considered a promising biomarker among other components59,60. As in the previous studies23,61,62, electroencephalogram was recorded using a 64-channel Geodesic EEG System (Electrical Geodesics Inc., Eugene, OR). We used two types of auditory oddball paradigms using duration and frequency deviant stimuli. Each paradigm had 2000 stimuli consisting of 90% standard tones (1000 Hz, 50 ms) and 10% deviant tones (duration MMN: 1000 Hz, 100 ms; frequency MMN: 1200 Hz, 50 ms). The auditory stimuli were delivered at an 80-dB sound pressure level, 1 ms rise/fall time, and 500 ms stimulus-onset asynchrony. We calculated the duration and frequency of MMN amplitudes as the mean amplitudes from 135 to 205 ms and from 100 to 200 ms post-stimulus, respectively. We used average amplitudes in seven electrodes around the frontocentral electrode as the MMN amplitude (Supplementary materials)23,61,62.

Through the project, we used two T1-weighted structural brain image scanning procedures using 3.0-Tesla MRI machines (n = 46 and 17, Supplementary materials). Abnormalities and anomalies were checked by a trained neuroradiologist using T1- and T2-weighted images. We excluded images from further analyses if there were abnormalities and anomalies, or substantial head motion and poor quality that could affect preprocessing. After visual inspection, we also excluded one participant because of the failure of preprocessing using FreeSurfer 6.063. Finally, 43 images were used in this study. We extracted cortical and subcortical structural information from preprocessed images and applied 150 variables of the cortical surface area, cortical thickness, and subcortical volume to the analyses (68, 68, and 14 variables, respectively). In the preprocessing phase, we also checked outliers and inspected the segmented region of interest images using the ENIGMA quality control protocol64. Since the effect size between scan procedures would affect the analysis, we applied ComBat65 to the preprocessed data set to harmonize the data sets (Supplementary materials).

fNIRS is a portable functional neuroimaging instrument used for the convenient and non-invasively measurement of hemoglobin changes over the surface of the cortex66. Near-infrared light (650–1000 nm) emitted from a source probe on the human scalp is partially absorbed by the hemoglobin in small vessels (<1 mm), and the remaining light is scattered. A detector probe can subsequently perceive the scattered near-infrared light placed 3 cm away from the source probe in adults66.

Brain activity in the prefrontal and anterior temporal cortical surface area during a 160-s block-designed version of the phonological verbal fluency task was obtained using a 52-channel fNIRS instrument (ETG-4000; Hitachi Ltd., Tokyo, Japan) (Supplementary materials)22,30,66,67. The obtained activity was checked using automatic rejection software for visible artifacts derived from the body and head movements30,68, and two measurements were excluded from further analyses. The location of fNIRS measurements for each channel was estimated using a probabilistic location by a virtual registration from the MRI measurements with an fNIRS probe attachment69,70. Relative brain activities according to the task were obtained from the 16 frontotemporal regions30. We recorded the total number of correct words during the task period as task performance. Following the measurement, subjective sleepiness during the task was assessed using the Stanford Sleepiness Scale71.

Outcome measures

We set seven outcome measures: 13-month PANSS five factor scores (mean [SD] = 398.7 [57.3] days from registration), and 28-month SOFAS and symptom subscale score of the GAF (mean [SD] = 857.0 [382.4] days). For the 28-month outcomes, we used the SOFAS and symptom subscale of the GAF with more detailed anchor points49 in the IN-STEP project procedure21. We conducted the long-term clinical outcome assessments during Jan 2018 and Dec 2019.

Statistical analysis

All analyses were performed using R version 3.6.2 (The R Foundation for Statistical Computing Platform, Vienna, Austria). We explored which of the demographic, clinical, and psychobiological measurements listed in Table 2 were associated with follow-up outcomes using a general linear model. For the variables in structural MRI, the models were controlled for intracranial volume. A difference of uncorrected p < 0.001 was considered significant according to the exploratory MRI studies using FreeSurfer features72,73,74 since there was no consensus on the significance threshold set in multi-modal analyses and the sample sizes were too small to tolerate Bonferroni or false discovery rate corrections.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.