Neural correlates of online cooperation during joint force production

ABSTRACT During joint action, two or more persons depend on each other to accomplish a goal. This mutual recursion, or circular dependency, is one of the characteristics of cooperation. To evaluate the neural substrates of cooperation, we conducted a hyperscanning functional MRI study in which 19 dyads performed a joint force‐production task. The goal of the task was to match their average grip forces to the target value (20% of their maximum grip forces) through visual feedback over a 30‐s period; the task required taking into account other‐produced force to regulate the self‐generated one in real time, which represented cooperation. Time‐series data of the dyad's exerted grip forces were recorded, and the noise contribution ratio (NCR), a measure of influence from the partner, was computed using a multivariate autoregressive model to identify the degree to which each participant's grip force was explained by that of their partner's, i.e., the degree of cooperation. Compared with the single force‐production task, the joint task enhanced the NCR and activated the mentalizing system, including the medial prefrontal cortex, precuneus, and bilateral posterior subdivision of the temporoparietal junction (TPJ). In addition, specific activation of the anterior subdivision of the right TPJ significantly and positively correlated with the NCR across participants during the joint task. The effective connectivity of the anterior to posterior TPJ was upregulated when participants coordinated their grip forces. Finally, the joint task enhanced cross‐brain functional connectivity of the right anterior TPJ, indicating shared attention toward the temporal patterns of the motor output of the partner. 
Since the posterior TPJ is part of the mentalizing system for tracking the intention of perceived agents, our findings indicate that cooperation, i.e., the degree of adjustment of individual motor output depending on that of the partner, is mediated by the interconnected subdivisions of the right TPJ.


Introduction
Cooperation is a type of human interaction in which two or more individuals coordinate their behavior to pursue a common goal (Bratman, 1992). Participants in cooperation must continuously consider their partners' actions to adjust their behavior. Because the evaluation of this process requires monitoring of sensorimotor coordination and the intention behind it, the mirror neuron system (MNS) and mentalizing system are predicted to be involved (Chaminade et al., 2012;Van Overwalle and Baetens, 2009). These systems interactively contribute to cooperation by simulating the other person's behavior and inferring their intentions (Van Overwalle and Baetens, 2009). Several studies have attempted to identify the brain regions involved in cooperation by using simultaneous joint coordination tasks (Chaminade et al., 2012;Newman-Norlund et al., 2008). Newman-Norlund et al. (2008) compared the neural substrates of shared and single control in the virtual bar-balancing task and found that the MNS and the right temporoparietal junction (rTPJ), as well as the precuneus, were activated. The authors argued that the MNS is involved in the internal model, whereas the TPJ and precuneus are involved in self-other distinction. However, since they did not measure the degree of cooperation during this experiment, the neural substrates of cooperation remained unclear. To solve this problem, Chaminade et al. (2012) adopted a paradigm in which a dynamic visual object (stripes of varying width and shades of color) was controlled by joystick movements. The level of cooperation ranged from no cooperation (one subject controlling the color, the other the grating) to full cooperation (each subject controlled half of the color and half of the grating). They found that the degree of cooperation was related to activity in the left parietal operculum and the anterior cingulate cortex (ACC). 
Because the degree of cooperation paralleled performance, the neural substrates of cooperation were difficult to differentiate from task difficulty. Neither study measured the degree of cooperation between the participants under the task conditions. However, participants may not necessarily follow the degree of cooperation expected by the experimenter; different participants could adopt different degrees of cooperation even under the same task condition or within the same pair. Moreover, it would be natural for different individuals to engage in different types of social interactions based on their own social traits. Therefore, the degree of cooperation in a given joint task should be identified individually based on the actual motor outcome during the task. Based on these issues, reports of the neural substrates of cooperation are inconsistent, and it remains unclear whether this process involves the MNS, the mentalizing system, or both.
Here, we developed a simple joint force-production task, asking paired participants to match their averaged grip forces to a target value. The task was designed to simulate a real-life interaction such as lifting a piano up a narrow staircase, which requires coordinated force generation to maintain the object in a horizontal position. We were interested in the neural substrates of the shared cooperative activities in which two or multiple agents adopt a pattern of coordinated action to achieve a common goal (Bratman, 1992;Newman-Norlund et al., 2008). While the task effort was so small that it did not disturb cooperation, the task required taking into account other-produced force in order to regulate the self-generated one. Specifically, the location of the cursor indicated the averaged forces of the participant and partner. To accomplish the goal, i.e., to adjust the cursor to the target line, the participant had to take into account the partner's force in a real-time manner, which represented cooperation. The ability to coordinate individual motor behavior with feedback monitoring of an interacting partner emerges at around 8 years of age in humans (Satta et al., 2017) and has even been observed in macaque monkeys (Visco-Comandini et al., 2015); thus, it is an evolutionarily conserved ability. The degree of cooperation was implicit to the participants regardless of performance error. To accomplish the task, participants had to pay attention to their partner's behavior, which was then reflected in the control of their force production. Therefore, by evaluating the degree to which each participant's behavior influenced their partner's behavior, we were able to individually identify the degree of cooperation. We adopted Akaike causality analysis (Akaike, 1968) to individually evaluate the degree of influence from the partner's output during the joint force-production task. 
The time-series data for the exerted forces of these pairs were analyzed using a multivariate autoregressive model that propagates information from the past to the future. We calculated the Akaike noise contribution ratio (NCR; Akaike, 1968), which allows interpretation of causality from one participant to the other. The extent of cooperation of each participant could be quantified based on the causality from the partner. We focused on individual differences in the degree of cooperation under the same task condition and identified how the degree of cooperation in the same task differed across individuals. We used a hyperscanning functional magnetic resonance imaging (fMRI) system (Koike et al., 2016; Morita et al., 2014) to visualize neural activity during the joint force-production task. Our hypothesis was that the cognitive processing of cooperation would be shared with the partner during the joint task (Koike et al., 2016). Based on the results of previous hyperscanning fMRI studies (Saito et al., 2010; Tanabe et al., 2012), we examined pair-specific cross-brain synchronization during the joint task.

Participants
Nineteen same-gender dyads (nine female dyads; age, 19-27 years) participated in this experiment. The members of the dyads did not know and had not met each other before this experiment. None of the participants had any history of major medical or neurological illness, and all provided written informed consent for participation in the study. All participants except one were right-handed according to the Edinburgh handedness inventory (Oldfield, 1971). The study protocol was approved by the local medical ethics committee at the National Institute for Physiological Sciences (Aichi, Japan) and was in accordance with the Declaration of Helsinki.

Data acquisition
Two 3.0-T scanners (Magnetom Verio, Siemens, Erlangen, Germany) with a 32-element phased-array head coil were used to acquire fMRI data. Functional images were obtained with a gradient-echo echo-planar imaging pulse sequence [repetition time (TR) = 700 ms, echo time (TE) = 30 ms, flip angle = 80°]. The TR was reduced by a multiband sequence (multiband factor = 8) (Moeller et al., 2010). Fifty-six 2.0-mm-thick oblique slices with a 0.5-mm gap (2 mm × 2 mm in-plane resolution) were acquired for 1058 volumes in each session (3174 volumes per participant across the three sessions). Anatomical three-dimensional (3-D) T1-weighted images (magnetization-prepared rapid-acquisition gradient-echo sequence, TR = 1800 ms, TE = 1.98 ms, flip angle = 9°, field of view = 256 mm, matrix size = 256 × 256, slice thickness = 1 mm, total of 208 sagittal slices) were collected between the first and second sessions of the experiment, as described in detail below. Visual stimuli were projected onto a screen placed behind the head coil using a liquid crystal display projector (CP-SX12000J, Hitachi, Tokyo, Japan).
Grip forces during the force-production task were recorded using two digital grip-strength testers, custom-made for fMRI experiments (Uchida Denshi, Tokyo, Japan). The force signals were transferred to a laptop computer at a sampling frequency of 200 Hz via an analog-to-digital converter (NI 9215 with NI cDAQ-9171; National Instruments, Austin, TX, USA). The voltage values were normalized as a percentage of the maximum grip force of each participant (% max). The maximum grip force was measured twice before the experiment, and the mean of the two measurements was used as the maximum grip force. Participants lay supine on the bed of each fMRI scanner and held the grip-strength tester in their right hand. The tester was placed at the side of their right thigh, such that its position would be maintained if the participant released the grip on the tester (Fig. 1A).
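The normalization described above can be illustrated with a minimal sketch. The function name and the example voltages are hypothetical, and the sketch assumes that the recorded voltage is proportional to force with no offset, which the text does not state.

```python
def percent_max(voltage, max_trials):
    """Express a raw voltage sample as a percentage of the maximum grip force.

    max_trials holds the two maximum-force measurements taken before the
    experiment; their mean is used as the reference (as in the text).
    Assumption: voltage is proportional to force with zero offset.
    """
    max_force = sum(max_trials) / len(max_trials)
    return 100.0 * voltage / max_force


# Hypothetical example values (volts), chosen only for illustration.
max_measurements = [4.4, 4.6]          # two pre-experiment maxima
sample = percent_max(0.9, max_measurements)
print(f"{sample:.1f} % max")           # compare against the 20 % target
```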

Experimental design
Fig. 1B shows the task design used in this study. The dyads performed cooperative tasks in the scanner room. The task had a 2 ('single' vs. 'joint') × 2 ('perform' vs. 'watch') design. In force-production tasks ('perform'), participants had to match their grip forces to the target force, whereas in force-watching tasks ('watch'), they were only required to watch visual cues representing the time-series data of "typical" grip force in the performing conditions (PS and PJ). The force was recorded in advance, under the same task conditions, from a few subjects who did not take part in the main experiment, and the median performances were selected as typical performances. These tasks were performed in cooperation with a partner ('joint') or on one's own ('single').
In the 'perform-single' (PS) condition (Fig. 1B), one yellow cursor and a green line appeared on the screen in the first 5 s. The cursor and the horizontal line represented, in real time, the individual's force and the target force, respectively. Participants were required to match their grip forces to the target force (20% of their maximum grip forces) as accurately as possible. Even after the cursor color changed to white, participants had to maintain their grip forces at the level of the target force for the remaining 25-s period. The yellow color was used to encourage the subjects to match their forces to the target force within the first 5 s of each trial. Thus, the cursor color changed from yellow to white after 5 s, independent of the subjects' performance. We required the subjects to minimize force deviation from the target for 30 s to the greatest possible extent and did not define a given range of 'task success.' In the 'perform-joint' (PJ) condition (Fig. 1B, top-left), the task goal was to match the averaged force of the dyad to the same target force as in the single condition. In the first 5 s, two yellow cursors and one line appeared on the screen. One cursor represented the grip force of one participant, and the other represented the partner's grip force. Each member of the dyad was required to independently match their own cursor to the target force as soon as possible during the 5-s period. Next, these cursors were replaced by a single white cursor representing the average force of the two individuals. The participants in the dyad were then instructed to keep the averaged force at the target, as accurately as possible, for the remaining 25-s period.
While in 'perform' conditions participants exerted grip forces, in 'watch' conditions ( Fig. 1B, right), they simply watched cursor movements that represented typical performances. The typical performance was recorded before the experiments. During the 'watch' condition, the participants were instructed not to exert any grip force and to release their grips on the tester. In the 'watch-single' (WS) condition (Fig. 1B, bottom-right), participants watched the typical performance in the PS condition. In the 'watch-joint' (WJ) condition (Fig. 1B, top-right), participants watched the typical performance in the perform-joint condition. All conditions involved a task call (5 s) and a 5-s countdown ('5,' '4,' …, '1') before the task (Fig. 1C). Each of the four sequential conditions lasted 40 s, followed by 20 s of rest (fixations on a white cross), which was collectively defined as an epoch (180 s; Fig. 1C). The order of the four conditions was counterbalanced among epochs and dyads. Four epochs, including a 20-s rest at the beginning, were performed as a session (in total, 12 min 20 s; Fig. 1D). Before starting the experiment, participants practiced the task sufficiently to reach a plateau of performance starting from the first session. All dyads completed three sessions with approximately 10 min of rest during which participants remained still within the scanner. All visual stimuli used in the experiment were generated with Psychtoolbox-3 (Brainard, 1997) (RRID: SCR_002881) and implemented in MATLAB 2013a (MathWorks, Natick, MA, USA) (RRID: SCR_001622). No cover story was adopted.
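As a quick consistency check on the timing described above, the following sketch recomputes the epoch and session durations from the stated components and relates them to the TR of 700 ms given under Data acquisition. The variable names are illustrative, and rounding the volume count up is an assumption that happens to reproduce the reported 1058 volumes per session.

```python
import math

# Durations stated in the text (seconds)
TASK_CALL_S = 5
COUNTDOWN_S = 5
TASK_S = 30
REST_S = 20
CONDITIONS_PER_EPOCH = 4
EPOCHS_PER_SESSION = 4
TR_S = 0.7  # repetition time from the Data acquisition section

condition_s = TASK_CALL_S + COUNTDOWN_S + TASK_S        # one condition
epoch_s = CONDITIONS_PER_EPOCH * condition_s + REST_S   # four conditions + rest
session_s = REST_S + EPOCHS_PER_SESSION * epoch_s       # initial rest + four epochs

minutes, seconds = divmod(session_s, 60)
# Assumption: the scanner acquires whole volumes until the session ends.
volumes = math.ceil(session_s / TR_S)

print(condition_s, epoch_s, session_s, (minutes, seconds), volumes)
```

Under these assumptions the script gives 40 s per condition, 180 s per epoch, and 740 s (12 min 20 s) per session, matching the values in the text.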

Behavioral data processing and statistical analysis
Force signals were digitally low-pass filtered with a zero-phase-lag, fourth-order Butterworth filter at a cutoff frequency of 20 Hz. We evaluated task errors in the 'perform' conditions (20 s) by the root-mean-square error (RMSE), representing the standard deviation of the difference between the observed and target forces ($x(t)$ and $x_{\mathrm{target}}$, respectively), as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(x(t) - x_{\mathrm{target}}\right)^{2}} \qquad (1)$$

where $T$ is the number of analyzed samples. Moreover, to determine which frequency range was dominant for establishing a cooperative link with the partner, we calculated coherence and phase for the force signals in the PJ condition. The coherence function ($C_{xy}(f)$) between the signals of two grip forces, $x(t)$ and $y(t)$, in a pair was estimated using the auto-spectra of these signals ($G_{xx}$, $G_{yy}$) and the cross-spectrum ($G_{xy}$) as follows:

$$C_{xy}(f) = \frac{\left|G_{xy}(f)\right|^{2}}{G_{xx}(f)\,G_{yy}(f)} \qquad (2)$$

where $f$ is the frequency. The phase function was calculated as the phase angle of the cross-spectrum. The latter $2^{12}$ data points (approximately 10-30 s) in each trial were analyzed, and the pooled coherence for all subjects' data was evaluated. For these behavioral data, we applied repeated-measures two-way analysis of variance (ANOVA; task and session effects) to the RMSE. The significance level was set at p < 0.05, except for coherence, for which it was set at p < 0.001. Statistical analyses were performed in R (R Core Team) (RRID: SCR_001905).
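The error and coherence measures described above can be sketched in a few lines of plain Python. This is a simplified illustration and not the authors' analysis code: the coherence estimate averages non-overlapping segments at a single frequency without windowing, the cross-spectrum uses the convention X(f) times the complex conjugate of Y(f), and the test signals are synthetic.

```python
import cmath
import math


def rmse(force, target):
    """Root-mean-square deviation of a force trace from the target."""
    return math.sqrt(sum((x - target) ** 2 for x in force) / len(force))


def dft_bin(segment, freq_hz, fs_hz):
    """Discrete Fourier transform of one segment at a single frequency."""
    return sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
               for n, v in enumerate(segment))


def coherence_and_phase(x, y, freq_hz, fs_hz, seg_len):
    """Magnitude-squared coherence and phase (degrees) at one frequency,
    averaged over non-overlapping segments (no window, for brevity)."""
    gxx = gyy = 0.0
    gxy = 0j
    for start in range(0, len(x) - seg_len + 1, seg_len):
        fx = dft_bin(x[start:start + seg_len], freq_hz, fs_hz)
        fy = dft_bin(y[start:start + seg_len], freq_hz, fs_hz)
        gxx += abs(fx) ** 2
        gyy += abs(fy) ** 2
        gxy += fx * fy.conjugate()
    coherence = abs(gxy) ** 2 / (gxx * gyy)
    return coherence, math.degrees(cmath.phase(gxy))


# The 2**12 analyzed points at 200 Hz span about 20 s of each trial.
window_s = 2 ** 12 / 200

# Synthetic 0.25-Hz signals sampled at 10 Hz (illustrative values only).
fs, f0 = 10, 0.25
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(160)]
in_phase = coherence_and_phase(x, list(x), f0, fs, 40)
anti_phase = coherence_and_phase(x, [-v for v in x], f0, fs, 40)
```

With identical signals the phase is 0 degrees (in-phase); with a sign-inverted partner signal it is 180 degrees (anti-phase), which is the pattern of compensation discussed in the behavioral results.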

Calculation of the NCR and statistical analysis
We applied Akaike causality (Akaike, 1968;Okazaki et al., 2015) to the time-series data of force signals and evaluated the NCR in order to identify the degree of interaction with the partner in 'perform' conditions. The NCR quantifies the degree of influence from the stochastic noise involved in the partner's signal, which is a good indicator of the degree to which the participant is sensitive to the partner's behavior during the joint action (Okazaki et al., 2015). This method also allowed us to select specific frequency ranges for evaluation.
The details of the NCR calculation are as follows. First, using the following equations, we adopted a multivariate autoregressive model to analyze the time-series data recorded from two participants, $x(t)$ and $y(t)$, representing a fluctuation of the exerted grip forces:

$$x(t) = \sum_{i=1}^{N} a_i\, x(t-i) + \sum_{i=1}^{N} b_i\, y(t-i) + u_x(t)$$

$$y(t) = \sum_{i=1}^{N} c_i\, x(t-i) + \sum_{i=1}^{N} d_i\, y(t-i) + u_y(t)$$

where $a_i$, $b_i$, $c_i$, and $d_i$ are autoregressive coefficients, $N$ is the autoregressive order, and $u_x$ and $u_y$ indicate the residual noise in one's own and the partner's exerted grip forces, respectively. Using $x(t)$ and $y(t)$, we estimated the power spectrum of these time series as the sum of the contributions of the x-specific (i.e., $|\alpha(f)|^{2}\sigma_{u_x}^{2}$) and y-specific (i.e., $|\beta(f)|^{2}\sigma_{u_y}^{2}$) noises. Using a set of autoregressive coefficients, Fourier transformation via an impulse response function yields $\alpha(f)$ and $\beta(f)$, which are response functions in the frequency domain. $\sigma_{u_x}^{2}$ and $\sigma_{u_y}^{2}$ indicate the variances of the residual noises $u_x$ and $u_y$, respectively. The $\mathrm{NCR}_{y \to x}(f)$, an index of how the participant's exerted grip force $x(t)$ is influenced by that of the partner $y(t)$ at a specific frequency $f$, was calculated as the ratio of the part of the spectral density of $x(t)$ contributed by $\sigma_{u_y}^{2}$ to the total spectral density of $x(t)$ at frequency $f$.
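Therefore, the $\mathrm{NCR}_{y \to x}(f)$ is expressed as follows:

$$\mathrm{NCR}_{y \to x}(f) = \frac{|\beta(f)|^{2}\sigma_{u_y}^{2}}{|\alpha(f)|^{2}\sigma_{u_x}^{2} + |\beta(f)|^{2}\sigma_{u_y}^{2}}$$

To assess the total influence from $y(t)$ to $x(t)$, we mathematically integrated the NCR value over the entire frequency range, evaluated with the trapezoidal rule:

$$\Sigma\mathrm{NCR}_{y \to x} = \int_{0}^{f_s/2} \mathrm{NCR}_{y \to x}(f)\, df$$

where $f_s$ is the sampling frequency of the time series $x(t)$ and $y(t)$. Based on the coherence and phase for the force signals in the PJ condition, we set $f_s$ at 1 Hz.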
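To make the NCR definition concrete, the sketch below computes it from a given set of bivariate autoregressive coefficients. It is an illustration under stated assumptions, not the authors' in-house script: the residual noises are assumed to be uncorrelated, the response functions are obtained by inverting the 2 × 2 frequency-domain coefficient matrix, the integral is returned without any rescaling to percent, and the coefficient values are hypothetical.

```python
import cmath
import math


def ncr_y_to_x(freq, a, b, c, d, var_ux, var_uy, fs=1.0):
    """NCR of y on x at one frequency for the bivariate AR model
        x(t) = sum a_i x(t-i) + sum b_i y(t-i) + u_x(t)
        y(t) = sum c_i x(t-i) + sum d_i y(t-i) + u_y(t)
    Assumption: u_x and u_y are uncorrelated."""
    z = [cmath.exp(-2j * math.pi * freq * i / fs) for i in range(1, len(a) + 1)]
    # Entries of the frequency-domain matrix I - sum_i A_i exp(-2*pi*j*f*i/fs)
    a11 = 1 - sum(ai * zi for ai, zi in zip(a, z))
    a12 = -sum(bi * zi for bi, zi in zip(b, z))
    a21 = -sum(ci * zi for ci, zi in zip(c, z))
    a22 = 1 - sum(di * zi for di, zi in zip(d, z))
    det = a11 * a22 - a12 * a21
    alpha = a22 / det    # response of x to its own noise u_x
    beta = -a12 / det    # response of x to the partner's noise u_y
    own = abs(alpha) ** 2 * var_ux
    partner = abs(beta) ** 2 * var_uy
    return partner / (own + partner)


def integrated_ncr(a, b, c, d, var_ux, var_uy, fs=1.0, n_steps=100):
    """Trapezoidal integral of the NCR from 0 to fs/2 (no percent scaling)."""
    df = (fs / 2) / n_steps
    vals = [ncr_y_to_x(k * df, a, b, c, d, var_ux, var_uy, fs)
            for k in range(n_steps + 1)]
    return sum((vals[k] + vals[k + 1]) / 2 * df for k in range(n_steps))


# Hypothetical first-order coefficients, for illustration only.
coupled = integrated_ncr([0.5], [0.3], [0.2], [0.5], 1.0, 1.0)
uncoupled = integrated_ncr([0.5], [0.0], [0.2], [0.5], 1.0, 1.0)
```

When the partner's past force does not enter the model (all b_i equal to zero), the NCR is zero at every frequency, so the integrated value is zero as well; any non-zero coupling yields a positive value.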
In the behavioral data analysis, the parameters were as follows: $f_s$ was set at 1 Hz, and the autoregressive order $N$, which defines the length of the history used by the model, was selected from the range 1 to 20 so as to minimize the Akaike information criterion.
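Order selection by the Akaike information criterion (AIC) can be sketched as follows. For brevity this example fits a univariate autoregressive model with the Levinson-Durbin recursion, whereas the analysis in the text uses a bivariate model; the AIC form n·ln(residual variance) + 2p and the simulated series are assumptions made only for illustration.

```python
import math
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariance r[0..max_lag]."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [sum(xc[t] * xc[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]


def prediction_errors(r, max_order):
    """Levinson-Durbin recursion: residual variance for orders 1..max_order."""
    coeffs, err, errors = [], r[0], []
    for p in range(1, max_order + 1):
        k = (r[p] - sum(coeffs[j] * r[p - 1 - j] for j in range(p - 1))) / err
        coeffs = [coeffs[j] - k * coeffs[p - 2 - j] for j in range(p - 1)] + [k]
        err *= 1.0 - k * k
        errors.append(err)
    return errors


def select_order(x, max_order=20):
    """Return the order in 1..max_order with the smallest AIC, plus the
    residual variances for every candidate order."""
    n = len(x)
    errors = prediction_errors(autocovariance(x, max_order), max_order)
    aic = [n * math.log(e) + 2 * p for p, e in enumerate(errors, start=1)]
    return aic.index(min(aic)) + 1, errors


# Simulated second-order process (hypothetical coefficients).
random.seed(0)
series = [0.0, 0.0]
for _ in range(500):
    series.append(0.6 * series[-1] - 0.3 * series[-2] + random.gauss(0.0, 1.0))

order, errors = select_order(series)
```

The residual variance can only decrease as the order grows, so the 2p penalty is what keeps the selected order from always being the maximum of 20.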
Because each participant underwent three sessions, each containing four PJ and four PS conditions, we obtained 12 ΣNCR values for each condition per subject. We averaged them to generate one summarized ΣNCR value for each participant in each condition. Using the ΣNCR for each subject, we performed a paired t-test (task effect) to determine whether the cooperation index, i.e., the ΣNCR, differed between conditions. The significance level was set at p < 0.05. The evaluation of the ΣNCR was performed using an in-house script written in MATLAB 2014, and statistical tests were conducted in R.

Imaging data processing and statistical analysis

Data preprocessing
Imaging data were analyzed with SPM12 (Wellcome Trust Centre for Neuroimaging, London) (RRID: SCR_007037) implemented in MATLAB. The first 14 volumes of each session were discarded because of unsteady magnetization. The remaining 1044 volumes from each session (3132 volumes per participant) were analyzed. The functional images were realigned to correct for 3-D head motion. We did not perform a slice-timing correction procedure because of the short TR and usage of the multiband sequence. After the realignment, all functional images from each subject were coregistered with the T1-weighted anatomical image, which was then normalized to the Montreal Neurological Institute (MNI) T1 image template (ICBM152) (Evans et al., 1994; Friston et al., 1995). Using the estimated normalization parameters, all functional images were spatially normalized to the template brain and resampled to a final resolution of 2 × 2 × 2 mm³. The spatially normalized functional images were smoothed using a Gaussian kernel of 8 mm full width at half maximum. At this stage, the imaging data of one participant were removed due to head movements that were larger than those of the other participants (>8 mm translation and >6° rotation).

Statistical analysis

Whole-brain general linear model (GLM) analysis
In the individual analyses, we fitted a GLM to the fMRI data from each participant (Friston et al., 1996; Worsley and Friston, 1995). Neural activity during each condition (PJ, PS, WJ, and WS) was modeled with boxcar functions convolved with the canonical hemodynamic response function. The time series for each voxel was high-pass filtered at 1/128 Hz. A first-order autoregressive model, AR(1), was used to remove serial correlations in the signals (Friston et al., 2007). In the individual-level analysis, we obtained images representing the normalized task-related increment of the MR signal of each subject for each predefined contrast. We considered the following five contrasts individually: PJ, PS, WJ, WS, and [PJ > PS].
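The construction of a condition regressor (a boxcar convolved with a hemodynamic response function) can be sketched as follows. This is a simplified illustration rather than SPM12 code: it assumes a commonly used double-gamma approximation of the canonical response (peak shape 6, undershoot shape 16, undershoot ratio 1/6, 32-s kernel), samples directly at the TR instead of a finer time grid, and uses a hypothetical onset.

```python
import math


def canonical_hrf(t):
    """Double-gamma approximation of the canonical HRF (assumed parameters)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def boxcar_regressor(onset_s, duration_s, n_scans, tr_s):
    """Boxcar for one block, convolved with the HRF and sampled at each TR."""
    box = [1.0 if onset_s <= i * tr_s < onset_s + duration_s else 0.0
           for i in range(n_scans)]
    kernel = [canonical_hrf(j * tr_s) for j in range(int(32 / tr_s) + 1)]
    regressor = []
    for i in range(n_scans):
        acc = 0.0
        for j, h in enumerate(kernel):
            if i - j < 0:
                break
            acc += box[i - j] * h
        regressor.append(acc * tr_s)  # Riemann-sum approximation of the convolution
    return regressor


TR_S = 0.7
# Hypothetical block: a 30-s task period starting 10 s into the run.
reg = boxcar_regressor(onset_s=10.0, duration_s=30.0, n_scans=100, tr_s=TR_S)
peak_time_s = reg.index(max(reg)) * TR_S
```

The predicted response stays at zero before the block, rises a few seconds after onset, and plateaus during the 30-s task period, which is the shape fitted to each voxel's time series in the GLM.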
Contrast images from the individual analyses were then used for the group analysis, with the between-participants variance modeled as a random factor. The contrast images obtained from the individual analyses represented the normalized task-related increment of the MR signal of each participant. Using the contrast images of PJ, PS, WJ, and WS, we conducted a group analysis to reveal the brain activation corresponding to each condition, the main effect of action, and the main effect of cooperation. A flexible factorial design was used to reveal the group-level activation. We also performed a one-sample t-test using individual [PJ > PS] contrast images, with parametric modulation by the averaged ΣNCR value of each participant as the covariate; this is because the [PJ > PS] contrast represents cooperation. In the PS condition, the task was to adjust the location of the cursor, which indicated the force exerted relative to the target line. In the PJ condition, the location of the cursor indicated the averaged forces of the participant and partner. To accomplish the goal, i.e., to adjust the cursor to the target line, the participant had to take into account the partner's force in a real-time manner, which represented cooperation. Thus, the [PJ > PS] contrast controlled for performance, leaving the component of cooperation. The resulting set of voxel values for each comparison constituted a statistical parametric map (SPM) of the t-statistic [SPM(t)]. The statistical threshold for the spatial extent test on the clusters was set at p < 0.05, corrected for multiple comparisons at the cluster level over the whole brain (family-wise error), with a height threshold of p = 0.001, uncorrected for multiple comparisons (Friston et al., 1996).

Effective connectivity between the rTPJp and rTPJa
To reveal the direction of information flow between the rTPJp and rTPJa, we applied the Akaike causality analysis, which was also used in the behavioral data analysis (Akaike, 1968; Ozaki, 2012), to the timeseries extracted from the rTPJp and rTPJa. First, after the whole-brain GLM analysis with SPM12, we saved Neuroimaging Informatics Technology Initiative (NIFTI) volumes containing the residual timeseries, free from task-related activation and deactivation, for each subject (Fair et al., 2007). Next, we extracted the residual timeseries in two ROIs, i.e., the rTPJp and rTPJa. The ROIs were common to all participants. The center of each ROI was defined by the local peak MNI coordinate in our GLM analysis (see Tables 1 and 2). The centers of the rTPJp and rTPJa ROIs were at [50, −50, 26] and [64, −32, 20], respectively. The ROI timeseries were extracted from voxels within a sphere with a 4-mm radius. We averaged the timeseries of all voxels in each ROI to obtain a summarized value. The above procedures to evaluate the ROI timeseries were performed by in-house MATLAB scripts with MarsBaR functions. In our experiment, each participant joined three sessions, each containing four PJ and four PS blocks.
Each block lasted 30 s. The sub-timeseries corresponding to each block was clipped from the original blood oxygenation level dependent (BOLD) timeseries. The sampling interval in fMRI data acquisition (TR) was 0.7 s. Therefore, through the above processes, we obtained 12 timeseries for each of the PJ and PS conditions, each consisting of 43 timepoints. To these timeseries, we applied the Akaike causality analysis, by which we could obtain two ΣNCR values representing effective connectivity between the rTPJa and rTPJp for each block: rTPJa→rTPJp and rTPJp→rTPJa. By averaging the ΣNCR across blocks, we could obtain the corresponding values for the PJ and PS conditions, for each subject and for each connectivity direction. The evaluation of the ΣNCR was performed by an in-house MATLAB script, which was also used to calculate the causality between timeseries of exerted gripping forces. Finally, the statistical comparison was performed using R. We conducted a repeated-measures two-way ANOVA (condition [PJ vs PS] × connectivity direction [rTPJp→rTPJa vs rTPJa→rTPJp]) to test whether the effective connectivity between the rTPJp and rTPJa could be modulated by interaction with a partner in the PJ condition.
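The ROI and block definitions above can be checked with a short sketch. It assumes that the ROI center falls on a voxel center of the 2-mm normalized grid and that the number of timepoints per block is obtained by rounding; neither detail is stated in the text, and the function names are illustrative.

```python
def sphere_offsets(radius_mm, voxel_mm):
    """Integer voxel offsets whose centers lie within radius_mm of the origin."""
    r = int(radius_mm // voxel_mm)
    return [(i, j, k)
            for i in range(-r, r + 1)
            for j in range(-r, r + 1)
            for k in range(-r, r + 1)
            if (i * i + j * j + k * k) * voxel_mm ** 2 <= radius_mm ** 2]


def roi_coordinates(center_mm, radius_mm=4.0, voxel_mm=2.0):
    """MNI coordinates (mm) of the voxels in a spherical ROI."""
    return [tuple(c + o * voxel_mm for c, o in zip(center_mm, offset))
            for offset in sphere_offsets(radius_mm, voxel_mm)]


rtpja_voxels = roi_coordinates((64, -32, 20))   # rTPJa center from the text
rtpjp_voxels = roi_coordinates((50, -50, 26))   # rTPJp center from the text

BLOCK_S, TR_S = 30, 0.7
timepoints_per_block = round(BLOCK_S / TR_S)     # 42.86 rounds to 43
blocks_per_condition = 3 * 4                     # three sessions x four blocks
```

Under these assumptions each ROI contains 33 voxels, and each condition contributes 12 blocks of 43 timepoints, in line with the numbers given above.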

Inter-brain synchronization analysis
Based on our hypothesis that the attentional process is shared between the partners during the joint task (Koike et al., 2016), we tested differences in inter-brain synchronization in the right TPJa during the PJ condition, with the method used in previous hyperscanning fMRI studies (Saito et al., 2010; Tanabe et al., 2012). First, activation or deactivation related to the force generation task was removed from the BOLD timeseries using the GLM implemented in SPM12 to obtain the 3D-NIFTI files representing the residual time series. Based on the experimental design, we selected the NIFTI images corresponding to the PJ condition, which were concatenated into one long 4D image representing the PJ condition.
Using Pearson's correlation coefficient, we calculated the inter-brain synchronization between voxels representing the same MNI coordinate positions (x, y, z) of two participants in the right TPJa cluster, as shown in Fig. 8. The correlation coefficient r was transformed to the standardized z-score using Fisher's r-to-z transformation. Next, we averaged the z-values of all voxels within the rTPJa cluster. We repeated the procedure across all possible combinations of participants. Finally, we compared the inter-brain synchronization between the paired participants (Pair) and that between non-paired participants (Non-Pair), using the two-sample t-test. The t-test was conducted using an

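The inter-brain synchronization measure described above (Pearson correlation, Fisher's r-to-z transformation, and the split into paired and non-paired combinations) can be sketched as follows. The participant labels and the short example series are hypothetical, and the sketch covers a single pair of time series rather than the voxel-wise averaging within the cluster.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's r-to-z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def split_combinations(dyads):
    """Split all two-person combinations into real pairs and non-pairs."""
    paired = {frozenset(d) for d in dyads}
    participants = [p for d in dyads for p in d]
    every = [frozenset(c) for c in combinations(participants, 2)]
    non_paired = [c for c in every if c not in paired]
    return paired, non_paired


# Hypothetical labels for three dyads, for illustration only.
dyads = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]
paired, non_paired = split_combinations(dyads)

# Hypothetical residual series from the same voxel in two participants.
z_value = fisher_z(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))
```

With three dyads there are 15 possible combinations of participants, of which 3 are real pairs and 12 are non-pairs; the same bookkeeping scales to the full sample.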
Behavioral results
Task errors in the PS and PJ conditions were evaluated by the RMSE (Fig. 2A). Repeated-measures two-way (condition and session effects) ANOVAs revealed no significant main effects of condition (F(1,18) = 0.001, p = 0.975) or session (F(2,36) = 1.522, p = 0.232), and no significant interaction (F(2,36) = 1.089, p = 0.348). Fig. 2B shows a representative profile of grip forces during the PJ condition. Although all dyads could maintain their joint forces (F_Joint, gray line) around the target force (dotted line), they exhibited force distributions specific to each dyad (F_1 and F_2, blue and red lines). In the specific case shown in Fig. 2B, F_1 was consistently larger than F_2. The force distributions were highly stable even though the dyads did not receive any feedback regarding the individual forces. For over half of the dyads, the magnitude relationships between the two forces did not change within each trial or among trials. We divided the force data in the last 10 s into five bins (2 s each) and examined the magnitude relationships using the mean values in these bins. Of the PJ trials, 92.5% did not exhibit changes in the magnitude relationships within a trial. In these trials, 82% maintained the same magnitude relationships between trials in each dyad. Fig. 2C shows the averaged coherence and phase between the two grip-force timeseries in the PJ condition for all participants. There were two frequency bands of significant coherence that had different phases. The significant coherence peaks (above the green line that corresponds to the significance threshold) around 1 Hz exhibited an almost zero-degree phase (in-phase), indicating that the dyads responded similarly and simultaneously to a common input, i.e., their error signals. By contrast, the coherence below 0.5 Hz had an almost 180-degree phase (anti-phase), indicating that the members of the dyads compensated for each other's forces in order to maintain their average force around the target force (Fig. 2B).
Because the cooperative interaction for the task goal should be reflected in the latter frequency range, we evaluated the integral of the NCR below 0.5 Hz as ΣNCR (unit: %), indicating the degree of influence from the partner. A paired t-test revealed that the ΣNCR was significantly larger in the joint than in the single condition (Fig. 2D; t(36) = 17.447, p < 0.001).
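The bin-wise check of magnitude relationships described above (last 10 s split into five 2-s bins) can be sketched as follows. The function name and the synthetic force traces are hypothetical; the sampling rate of 200 Hz is taken from the Data acquisition section.

```python
def magnitude_relation_stable(f1, f2, fs_hz=200, window_s=10, n_bins=5):
    """Return True if the sign of (mean F1 - mean F2) is the same in every
    bin of the final window, i.e., the magnitude relationship never flips."""
    n = int(window_s * fs_hz)
    tail1, tail2 = f1[-n:], f2[-n:]
    bin_len = n // n_bins
    f1_larger = []
    for b in range(n_bins):
        seg = slice(b * bin_len, (b + 1) * bin_len)
        mean1 = sum(tail1[seg]) / bin_len
        mean2 = sum(tail2[seg]) / bin_len
        f1_larger.append(mean1 > mean2)
    return all(f1_larger) or not any(f1_larger)


# Synthetic 30-s traces in % max (6000 samples at 200 Hz), illustration only.
stable_trial = magnitude_relation_stable([22.0] * 6000, [18.0] * 6000)
crossing_trial = magnitude_relation_stable(
    [22.0] * 5000 + [18.0] * 1000,   # F1 drops below F2 in the last 5 s
    [18.0] * 5000 + [22.0] * 1000,
)
```

In the first synthetic trial the two forces average to the 20% target while F1 stays above F2 throughout, so the relationship is stable; in the second trial the forces swap roles within the analyzed window and the trial is flagged as changed.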

fMRI results
In the PJ condition, where participants cooperatively exerted their grip forces, significant activation was observed in the dorsomedial prefrontal cortex (dmPFC) extending to the ACC, superior frontal gyrus (SFG), frontal eye field, supplementary motor area, middle temporal gyrus (MTG), premotor cortices, inferior frontal gyrus (IFG) extending to the anterior insular cortex, TPJ (AG, STG, and SMG), inferior temporal cortex, inferior occipital cortex, cerebellum, and thalamus (Fig. 3A). In the PS condition, activated regions were present in the bilateral dmPFC, SFG, supplementary motor area, frontal eye field, MTG, premotor cortices, inferior temporal cortex, inferior occipital cortex, cerebellum, and thalamus. Activation was also observed in the right IFG extending to the anterior insula, lateral orbitofrontal cortex, and middle frontal gyrus (BA45) (Fig. 3B). In contrast to these 'performing' conditions, in the 'watching' conditions, activation was significant in the bilateral MTG and left SFG in the medial wall only in the WJ condition (Fig. 3C and D). Contrasting the 'performing' (PJ and PS) with the 'watching' (WJ and WS) conditions revealed that the widespread fronto-parieto-occipital network was activated when participants were performing the tasks (Fig. 4A). With the contrast reversed, the right precentral area and primary visual cortex around the calcarine sulcus exhibited greater activation in the 'watching' than in the 'performing' conditions (Fig. 4B).

Fig. 2. B, Typical pattern of grip forces in the PJ condition. In many cases, the magnitude relationship of the two grip forces was specific to each dyad and did not change within or among trials. C, Coherence (top) and phase (bottom) for two grip forces in the PJ condition. The profile was averaged for all participants. D, Integral of the noise contribution ratio (NCR) below 0.5 Hz. Error bars show the standard deviation.
Two clusters, the precuneus and TPJ, exhibited greater activation in the 'joint' (PJ and WJ) than in the 'single' conditions (PS and WS) (Fig. 4C).
Because our main interest was to reveal the neural substrates of cooperation during task performance, we compared brain activation between the PJ and PS conditions (see Methods). The contrast of [PJ > PS] (Fig. 5A, Table 1) revealed significant activation in the bilateral dmPFC, SFG, middle frontal gyrus, MTG, and precuneus. Significant activations were also observed in the right posterior-medial frontal cortex, ventromedial PFC (vmPFC), intraparietal sulcus, the anterior portion of the TPJ (including the AG and STG), and in the left IPL, middle occipital gyrus, the posterior portion of the TPJ (SMG), and the cerebellum (Fig. 5A, Table 1). Fig. 5B and C, and Table 2 show that the cooperation-related activation observed using the contrast of [PJ > PS] was significantly correlated with the ΣNCR, i.e., the behavioral measure of the participant's cooperation obtained during the PJ condition.
To assess the relationship between the rTPJp and rTPJa in the PJ and PS conditions, the effective connectivity between these two regions was estimated using the NCR value (Fig. 7). Repeated measures two-way ANOVA showed a significant interaction between condition [PJ vs PS] and connectivity direction ([rTPJp→rTPJa vs rTPJa→rTPJp]; F(1,37) = 8.234, p = 0.007). A post-hoc test revealed that effective connectivity from the rTPJa to the rTPJp was greater in the PJ than in the PS condition (t = 3.433, p = 0.002). We could not find any modulation in connectivity from the rTPJp to the rTPJa (t = 0.417, p = 0.679).
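For readers unfamiliar with this design, the following Python sketch illustrates how the interaction in a 2 × 2 repeated-measures ANOVA can be computed, and that it equals the square of a one-sample t statistic on each participant's difference of differences. It is an illustration only, not the analysis pipeline used in the study: the connectivity values are synthetic, the number of participants (38) is assumed here solely so that the error degrees of freedom equal 37, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects = 38  # assumed, so that the error df is 37 as in F(1,37)

# Synthetic connectivity values conn[subject, condition, direction]:
#   condition 0 = PJ, 1 = PS
#   direction 0 = rTPJa->rTPJp, 1 = rTPJp->rTPJa
subject_offset = rng.normal(0.0, 0.05, size=(n_subjects, 1, 1))
conn = 0.2 + subject_offset + rng.normal(0.0, 0.05, size=(n_subjects, 2, 2))
conn[:, 0, 0] += 0.05  # synthetic boost of rTPJa->rTPJp in PJ only

# Classical sums of squares for the condition x direction interaction.
grand = conn.mean()
cond_mean = conn.mean(axis=(0, 2))   # mean per condition
dir_mean = conn.mean(axis=(0, 1))    # mean per direction
cell_mean = conn.mean(axis=0)        # 2 x 2 cell means
inter_effect = cell_mean - cond_mean[:, None] - dir_mean[None, :] + grand
ss_inter = n_subjects * np.sum(inter_effect ** 2)

# Error term: subject-specific interaction minus the group interaction.
subj_inter = (conn
              - conn.mean(axis=2, keepdims=True)
              - conn.mean(axis=1, keepdims=True)
              + conn.mean(axis=(1, 2), keepdims=True))
ss_error = np.sum((subj_inter - inter_effect) ** 2)
df_error = n_subjects - 1
f_interaction = ss_inter / (ss_error / df_error)

# Equivalent test: one-sample t on the per-subject difference of differences.
d = conn[:, 0, 0] - conn[:, 0, 1] - conn[:, 1, 0] + conn[:, 1, 1]
t_interaction = d.mean() / (d.std(ddof=1) / np.sqrt(n_subjects))

# Post-hoc paired comparison of rTPJa->rTPJp between PJ and PS.
diff = conn[:, 0, 0] - conn[:, 1, 0]
t_posthoc = diff.mean() / (diff.std(ddof=1) / np.sqrt(n_subjects))

print(f"F(1,{df_error}) = {f_interaction:.3f}, t^2 = {t_interaction ** 2:.3f}")
print(f"post-hoc t({df_error}) = {t_posthoc:.3f}")
```

Because each factor has only two levels, no sphericity correction is involved, which is why the F and t formulations coincide.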
Finally, we found pair-specific enhancement of the cross-brain functional connectivity of the right TPJa during the joint task (Fig. 8).

Behavioral measures of cooperation
Fig. 3. Task-related activation. Highlighted brain regions with significant task-related activation in the PJ, PS, WJ, and WS conditions compared with the implicit baseline were superimposed on the 3D surface rendered high resolution MRI of the template brain (left and middle column) and the sagittal section at x = −10 (right column). For all data, the threshold for SPM{t} was set at p < 0.05, with a family-wise error at the cluster level for the whole brain. PJ, perform-joint; PS, perform-single; WJ, watch-joint; WS, watch-single.
Fig. 4. Main effects of performance and togetherness. A, regions exhibiting significant activation during 'performing' (PJ and PS) versus 'watching' (WJ and WS) conditions. B, regions exhibiting significant activation during 'watching' versus 'performing' conditions. C, regions exhibiting greater activation in 'joint' (PJ and WJ) versus 'single' (PS and WS) conditions. The statistical threshold for SPM{t} was set at p < 0.05, with a family-wise error at the cluster level for the whole brain. PJ, perform-joint; WS, watch-single; WJ, watch-joint; PS, perform-single. There was no significant activation by Single > Joint contrast.
In this study, we adopted a simple joint force-production task that had three distinct characteristics. First, it contained a clear goal shared by the participants. Second, because the goal was to maintain the average exerted force at a predefined level, continuous cooperation was required. Third, time-series data of the cooperative adjustment of the pair were recorded as the exerted force. We then applied multivariate autoregressive model analysis (Akaike, 1968; Ozaki, 2012) to quantify the degree of cooperation, as a measure of the influence of the partner's exerted force on the force exerted by the participant, and vice versa.
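As an illustration of how such a measure can be obtained, the following minimal Python sketch fits a bivariate autoregressive model by least squares and computes, at each frequency, the share of one channel's power that is attributable to the innovation (noise) of the other channel. It is a sketch under stated assumptions, not the implementation used in this study: the function names (`fit_var`, `noise_contribution_ratio`, `integrate`), the sampling rate, the model order, and the synthetic data are all hypothetical, and the innovations are assumed to be uncorrelated between channels.

```python
import numpy as np

def fit_var(x, order):
    """Least-squares fit of a VAR(order) model to x (n_samples x n_channels).

    Returns A with A[m][i, j] = effect of channel j at lag m+1 on channel i,
    and the residual (innovation) covariance matrix.
    """
    n, k = x.shape
    # Regressors: x[t-1], ..., x[t-order] stacked side by side.
    X = np.hstack([x[order - m: n - m] for m in range(1, order + 1)])
    Y = x[order:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    sigma = resid.T @ resid / (len(Y) - X.shape[1])
    A = B.T.reshape(k, order, k).transpose(1, 0, 2)
    return A, sigma

def noise_contribution_ratio(A, sigma, freqs, fs):
    """ncr[f, i, j]: fraction of the power of channel i at frequency f that
    originates from the innovation of channel j (innovations assumed
    uncorrelated, so only the diagonal of sigma is used)."""
    order, k, _ = A.shape
    noise_var = np.diag(sigma)
    lags = np.arange(1, order + 1)
    ncr = np.empty((len(freqs), k, k))
    for n, f in enumerate(freqs):
        z = np.exp(-2j * np.pi * f / fs * lags)
        # Transfer function from innovations to observed signals.
        H = np.linalg.inv(np.eye(k) - np.tensordot(z, A, axes=(0, 0)))
        contrib = np.abs(H) ** 2 * noise_var[np.newaxis, :]
        ncr[n] = contrib / contrib.sum(axis=1, keepdims=True)
    return ncr

def integrate(y, f):
    """Trapezoidal integral of y over the frequency grid f."""
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(f)))

# Synthetic demo: channel 0 ("self") is driven by channel 1 ("partner"),
# but not the other way around.
rng = np.random.default_rng(0)
fs = 10.0          # assumed sampling rate in Hz
n_samples = 5000
force = np.zeros((n_samples, 2))
noise = rng.standard_normal((n_samples, 2))
for t in range(1, n_samples):
    force[t, 0] = 0.5 * force[t - 1, 0] + 0.4 * force[t - 1, 1] + noise[t, 0]
    force[t, 1] = 0.5 * force[t - 1, 1] + noise[t, 1]

A, sigma = fit_var(force, order=2)
freqs = np.linspace(0.01, 0.5, 50)   # band below 0.5 Hz
ncr = noise_contribution_ratio(A, sigma, freqs, fs)

ncr_partner_to_self = integrate(ncr[:, 0, 1], freqs)
ncr_self_to_partner = integrate(ncr[:, 1, 0], freqs)
print(ncr_partner_to_self, ncr_self_to_partner)
```

In this synthetic example the integrated contribution from the "partner" channel to the "self" channel is clearly larger than the reverse, which is the kind of asymmetry the measure is meant to capture.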
The goal of the PS and PJ conditions was to generate force, using visual feedback, to maintain the target value. The PS condition required participants to adjust motor output based on continuous visual feedback of their motor control. The PJ condition additionally required each participant to continuously monitor the visual feedback to detect the signals originating from their partner's performance. Thus, participants had to estimate the outcome of their motor command and subtract it from the visual feedback. Therefore, the PJ condition was primarily characterized by the detection of the partner's performance, in contrast with the PS condition, primarily characterized by the detection of self-performance.
Our coherence data indicated an additional process. We found that the in-phase coherence (0°) peaked at 1 Hz, indicating that the pairs responded similarly and simultaneously to the common input. By contrast, the anti-phase coherence (180°) of the exerted force of the pair occurred below 0.5 Hz, indicating that the two forces compensated to maintain the joint force at approximately the target level. This anti-phase coherence in the lower frequency range implies the inference of the partner's intention to accomplish the cooperation. The P NCR within the specific frequency ranges, below 0.5 Hz, was greater in the PJ than in the PS condition, indicating the causal effect of the partner's performance on the participant's performance. This finding suggests that the exerted forces were adjusted below the 0.5-Hz level according to the partner's performance, using information provided by the visual feedback. Thus, in contrast to the PS condition, the PJ condition allowed the detection and monitoring of the other's performance and the inference of the other's intention to predict performance, which was in turn integrated to accomplish the shared goal.
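The distinction between in-phase and anti-phase coherence can be made concrete with a short Python sketch that estimates magnitude-squared coherence and cross-spectral phase from two force traces by averaging windowed segments. This is an illustrative sketch only: the function name `cross_spectra`, the sampling rate, the segment length, and the synthetic signals (a slow compensatory component at 0.2 Hz and a shared component at 1 Hz) are assumptions made here and are not taken from the recorded data.

```python
import numpy as np

def cross_spectra(x, y, fs, nperseg):
    """Segment-averaged (non-overlapping, Hann-windowed) spectral estimates.

    Returns frequencies, magnitude-squared coherence, and the phase of the
    cross-spectrum in degrees (0 = in-phase, +/-180 = anti-phase).
    """
    win = np.hanning(nperseg)
    nseg = len(x) // nperseg
    sxx = np.zeros(nperseg // 2 + 1)
    syy = np.zeros(nperseg // 2 + 1)
    sxy = np.zeros(nperseg // 2 + 1, dtype=complex)
    for s in range(nseg):
        seg = slice(s * nperseg, (s + 1) * nperseg)
        fx = np.fft.rfft(win * (x[seg] - x[seg].mean()))
        fy = np.fft.rfft(win * (y[seg] - y[seg].mean()))
        sxx += np.abs(fx) ** 2
        syy += np.abs(fy) ** 2
        sxy += np.conj(fx) * fy
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    coherence = np.abs(sxy) ** 2 / (sxx * syy)
    phase_deg = np.degrees(np.angle(sxy))
    return freqs, coherence, phase_deg

# Synthetic grip forces (all parameters are illustrative assumptions).
rng = np.random.default_rng(0)
fs = 10.0                                 # assumed sampling rate in Hz
t = np.arange(0, 300, 1.0 / fs)           # 300 s of data
slow = np.sin(2 * np.pi * 0.2 * t)        # compensatory adjustment
shared = 0.5 * np.sin(2 * np.pi * 1.0 * t)  # common response to the display
force_a = 20 + slow + shared + 0.5 * rng.standard_normal(t.size)
force_b = 20 - slow + shared + 0.5 * rng.standard_normal(t.size)

freqs, coh, phase = cross_spectra(force_a, force_b, fs, nperseg=500)
i_slow = int(np.argmin(np.abs(freqs - 0.2)))
i_fast = int(np.argmin(np.abs(freqs - 1.0)))
print(f"0.2 Hz: coherence {coh[i_slow]:.2f}, phase {phase[i_slow]:.0f} deg")
print(f"1.0 Hz: coherence {coh[i_fast]:.2f}, phase {phase[i_fast]:.0f} deg")
```

In this toy example the slow component, where one force rises as the other falls so that the average stays on target, appears as high coherence with a phase near ±180°, whereas the shared 1-Hz component appears as high coherence with a phase near 0°.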

Mentalizing system brain network in joint action
Our fMRI results demonstrated that the PJ condition, in contrast to the PS condition, activated specific brain regions belonging to what is known as the mentalizing system, i.e., the bilateral mPFC, precuneus, MTG, and TPJ (Brunet et al., 2000; Ferstl and Von Cramon, 2002; Gallagher et al., 2002; Goel et al., 1995; Happé et al., 1996; Vogeley et al., 2001). Mentalizing is defined as thinking about the mental state of another person (Amodio and Frith, 2006; Frith and Frith, 2003; Van Overwalle and Baetens, 2009), e.g., inferring others' goals, intentions, and beliefs (Amodio and Frith, 2006; Moriguchi et al., 2006; Saxe, 2010; Van Overwalle, 2009; Van Overwalle and Baetens, 2009). The mPFC is involved in sharing communicative intent during joint eye movement (Schilbach et al., 2010). Because the joint force-production task in this study contained a goal-inference process based on sharing of communicative intent, activation of the mentalizing system is reasonable. The results of our [PJ > PS] contrast are consistent with the findings of a previous study by Newman-Norlund et al. (2008), who compared the neural substrates of shared control with single control in a virtual bar-balancing task and found activation of the MNS, the right TPJ, and the precuneus. However, our study is the first to show a positive correlation between the P NCR and the [PJ > PS] activation in the anterior portion of the rTPJ (rTPJa), indicating that rTPJa activation reflects the degree of cooperation between partners during joint action; in contrast, the posterior portion exhibits constant activation.
Furthermore, we found enhanced cross-brain functional connectivity of the right TPJa during the joint task. The present finding obtained with hyperscanning fMRI is consistent with those of previous hyperscanning EEG studies, which showed the cross-brain synchronization of the alpha-mu frequency band over the right-centro-parietal scalp regions during spontaneous imitation of hand movement (Dumas et al., 2010). During piano playing of a musical piece by a pair of pianists, high behavioral entrainment was associated with self-other integration, as indexed by alpha suppression over the right-centro-parietal scalp regions (Novembre et al., 2016). Based on the EEG hyperscanning experiment of the mutual gaze between mothers and infants, Leong et al. (2017) found interpersonal neural synchronization. They argued that the phase of cortical oscillations reflects the excitability of underlying neuronal populations to incoming sensory stimulation (Schroeder and Lakatos, 2009), a possible mechanism for temporal sampling of the environment (Giraud and Poeppel, 2012). Interpersonal neural synchronization could increase within a dyad during the course of social interaction because each partner is continuously producing salient social signals that act as synchronization triggers to reset the phase of his or her partner's ongoing oscillations. As a result, infants' most receptive periods become well-aligned to adults' speech temporal patterns (e.g., prosodic stress and syllable patterns), optimizing communicative efficiency (Leong et al., 2017). In the present case of cooperative joint action, salient social signals were the visually presented force output of the partner, which should be processed in the rTPJa because the joint-action-related activity of the right TPJa was positively correlated with the NCR. Thus, the cross-brain synchronization of the right TPJa reflects the shared attention toward the temporal patterns of the motor output of the partner.

Functional anatomy of rTPJa and rTPJp
The rTPJ is located at the conjunction of the posterior superior temporal sulcus, IPL, and lateral occipital cortex (Corbetta et al., 2008). Several previous studies using different methods have reported that the TPJ is parcellated into subregions (Igelstrom et al., 2015). Applying independent-component analysis to task-free fMRI data, Igelstrom et al. (2015) reported six subdivisions in the rTPJ area. A cytoarchitectonic study also reported that the TPJ contains six subdivisions (Caspers et al., 2006). Using diffusion-weighted imaging tractography-based parcellation and resting-state functional connectivity, Mars et al. (2012) identified three separate regions in the rTPJ: a dorsal cluster covering the middle part of the IPL, an anterior cluster (rTPJa) including the SMG, and a posterior cluster (rTPJp) including the AG.
... 5C) and a previous study that classified the TPJ into three sub-regions (Mars et al., 2012). PJ, perform-joint; PS, perform-single; NCR, noise contribution ratio.
Fig. 7. Effective connectivity analysis. Left: regions of interest used in effective connectivity analysis. Right: effective connectivity between the right anterior and posterior temporoparietal junction (rTPJa and rTPJp), in the PJ and PS condition, represented by the P NCR value through Akaike causality analysis. PJ, perform-joint; PS, perform-single; NCR, noise contribution ratio.
Fig. 8. The inter-brain synchronization on the right TPJ was conspicuously greater in paired participants (Pair) than in non-paired participants (Non-pair) (t = 2.300, df = 322, p = 0.022).
Recent studies have suggested that the rTPJp and rTPJa contribute to different aspects of cognitive function. Utilizing activation likelihood estimation (ALE)-based meta-analysis, Kubit and Jack (2013) found that the rTPJa is associated with target detection, as revealed by the oddball task, while the rTPJp is associated with social reasoning, as revealed by a theory-of-mind task. The authors also reported that the rTPJa and rTPJp belong to different networks, using a public resting state dataset (Kubit and Jack, 2013). Based on resting-state functional connectivity, the rTPJa and rTPJp were found to be connected to the salience network, i.e., the anterior insula and ACC, and the default mode network, i.e., the mPFC and posterior cingulate cortex, respectively (Kubit and Jack, 2013). The authors observed anti-correlated patterns between networks connecting to rTPJa and rTPJp, suggesting that they are independent of each other. Anterior-posterior functional differentiation was also reported by other fMRI studies (Bzdok et al., 2013; Mitchell, 2008), ALE-based meta-analyses (Decety and Lamm, 2007; Krall et al., 2015), and connectivity studies (Bzdok et al., 2013; Krall et al., 2015; Mars et al., 2012).
These studies support the view that the rTPJa and rTPJp are associated with the domains of attention control and beliefs, respectively.
In the present study, the rTPJp was more strongly activated during the PJ than during the PS condition, suggesting that rTPJp activation is related to the coordination of self- and other-behavior (Carter and Huettel, 2013). Coordination may be supported by the psychological intention to consider the partner's behavior or the monitoring of the partner's intention to adjust the motor output in order to accomplish the goal (Carter and Huettel, 2013; Geng and Vossel, 2013).
In contrast, in the rTPJa region, we did not detect any significant activation, as depicted by the [PJ > PS] contrast. However, the degree of activation closely correlated with the P NCR index, which represents the extent to which one's own grip-force variation was influenced by that of the partner. The activation correlated with the P NCR index was specific to the PJ condition, suggesting that the rTPJa may be related to directing social attention toward a partner's behavior, rather than representing a simple bottom-up visual attention process. The ventral attentional network, including the rTPJa, may function as a system to switch or reorient between "internal, bodily, or self-perspective and external, environmental, or another's viewpoint" (Corbetta et al., 2008). This "attentional switching" function of the rTPJa may be related to the self-other distinction (Blakemore and Frith, 2003; Newman-Norlund et al., 2008) and the detection of a mismatch between our expectations and the actual communication outcomes with a partner (Corbetta et al., 2008; Koster-Hale and Saxe, 2013), as shown by our behavioral results. Thus, the right TPJa is related to the detection of the partner's performance, in contrast with self-performance.
These functional distinctions between the rTPJp and rTPJa raise the possibility that information flows from the latter to the former during cooperation. Indeed, our effective connectivity analysis revealed that information flow from the rTPJa to rTPJp was significant specifically in the PJ but not the PS condition. Thus, these two regions, although engaged in different cognitive functions, likely work together during cooperative tasks. This notion is consistent with arguments raised by previous studies. Corbetta and colleagues argued that the rTPJ is an interface between the dorsal and ventral attention networks (Corbetta et al., 2008). They assumed that the whole TPJ structure is important for self- and other-related processes. They argued that internally directed processing, such as introspection, self-referential thoughts, or projecting oneself into a situation, involves a default mode network (Raichle et al., 2001) that markedly overlaps with the mentalizing network (Amft et al., 2015; Spreng et al., 2009); in contrast, the dorsal attention network controls environmentally directed processes, such as perception and action.
The notion that the TPJ acts as an interface between self- and other-related, or between externally and internally triggered, information processing was also suggested by Bzdok et al. (2013). The authors suggested that the rTPJ links two antagonistic brain networks processing external versus internal information. Lee and McCarthy (2016) used a within-subjects design and multivoxel pattern analysis to discriminate the neural representation of biological motion, theory of mind attribution, and attention reorienting and found cross-task classification in the right TPJ, suggesting a shared neural process underlying these three tasks. A previous hyperscanning fMRI study also suggested that the whole TPJ cluster might serve as a self-other interface. Bilek et al. (2015) found that the cluster covering both the rTPJa and rTPJp exhibited cross-individual neural coupling during a joint attention task and that the strength of coupling correlated with the social network index, a measure of social behavior complexity. Because the rTPJ was implicated in reorienting attention and social cognitive functions, such as processing of social cues and inferring social intention, these authors attributed their findings to cross-individual information flow, i.e., self- and other-related information, relevant to joint attention. Taken together, these studies suggest that the collaboration between the rTPJp and rTPJa is critical in linking self- and other-related information.
It remains unanswered whether the involvement of the right TPJ is related to the mutual interaction of the paired participants (cooperation) or to the adjustment to the perturbation per se. Although the cross-individual synchronization of the right TPJa strongly suggests the former, this issue can be clarified by comparing the joint condition with a condition in which participants have to adjust their force to a perturbation produced by an "agent" with which mutual interaction is not possible (such as a PC-driven force-generating device). We did not include this "unidirectional" control condition because of the time limitation of the experiment. Future studies including this condition are warranted.

Conclusions
Combined with the findings of previous studies, our results suggest that the rTPJ is involved in controlling the flow of information relevant to a goal-oriented joint action to the mentalizing system, i.e., considering the partner's performance to adjust one's own action. Therefore, it seems that the rTPJ is involved in the self-other distinction of feedback signals of joint actions, and thus in inferring whether another agent is involved in the current behavior (Carter and Huettel, 2013;Geng and Vossel, 2013); in turn, this provides information relevant to inferring the partner's intent to the mentalizing system.

Declarations of interest
None.

Funding
This work was supported, in part, by CREST from the Japan Science and Technology Agency (to K.W.), Grants-in-Aid for Scientific Research (#15H01846 to N.S., #24700608 to M.O.A., #15H05875 to T.K.) from the Japan Society for the Promotion of Science, the Cooperative Study Program of the National Institute for Physiological Sciences (N.S. and K.W.), and the Japan Agency for Medical Research and Development (AMED) under Grant Number JP18dm0107152. The funding sources had no involvement in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.