Viewing personalized video clips recommended by TikTok activates default mode network and ventral tegmental area

Cutting-edge recommendation algorithms are widely used by media platforms to suggest personalized content to users. While such user-specific recommendations may satisfy users' needs to obtain intended information, some users may develop a problematic use pattern manifested by addiction-like undesired behaviors. Using a popular video sharing and recommending platform (TikTok) as an example, the present study first characterized use-related undesired behaviors with a questionnaire, then investigated how personally recommended videos modulated brain activity with an fMRI experiment. We found that more undesired symptoms were related to lower self-control ability among young adults, and about 5.9% of TikTok users may have significant problematic use. The fMRI results showed higher brain activation in sub-components of the default mode network (DMN), the ventral tegmental area, and discrete regions including the lateral prefrontal cortex, anterior thalamus, and cerebellum when viewing personalized videos in contrast to non-personalized ones. Psychophysiological interaction analyses revealed stronger coupling between activated DMN subregions and neural pathways underlying auditory and visual processing, as well as the frontoparietal network. This study highlights the functional heterogeneity of the DMN in viewing personalized videos and may shed light on the neural underpinnings of how recommendation algorithms are able to keep the user's attention on suggested content.


Introduction
The recommender system, which makes recommendations to a user by predicting his/her interest, has been successfully used in numerous media platforms. One such video sharing and recommending platform, TikTok (also known as Douyin in China), has gained widespread popularity all over the world. An online survey reported that TikTok was the most-downloaded app with 738 million downloads in 2019 and that total downloads exceeded 1.5 billion (https://www.businessofapps.com/data/tik-tok-statistics/, access date: July 28, 2020). Like other video sharing platforms such as YouTube, TikTok possesses both recreational and social attributes, allowing users to upload, follow, share, comment, and so on. Yet a special feature of TikTok that distinguishes it from YouTube is its video length: typically within 15 seconds, with a small minority over one minute. In addition, the powerful recommender systems of TikTok can predict each user's interest and suggest personalized videos based on their previous browsing records (Bobadilla et al., 2013) and tagged video classification.
While most users use these short-video apps for entertainment, immoderately watching short videos may bring notable problems for some individuals. Relatedly, existing literature has documented addiction-like behaviors associated with some other digital applications, such as Internet gaming (Ding et al., 2014; Kuss, 2013), Facebook (Koc and Gulyagci, 2013; Ryan et al., 2014), and YouTube (de Bérail et al., 2019). But to date, there are very few studies on problematic short-video watching behaviors, partly because short-video apps have emerged only in recent years. Hasan et al. (2018) reported that the use of recommender systems and a lack of self-control contribute to individuals' excessive involvement with video streaming services provided by YouTube and Netflix. Zhang et al. (2019) found that high social interaction anxiety and social isolation contribute to short-form video app addiction. Previous studies have also demonstrated the negative impacts of excessive digital application use. For example, overuse of the Internet and digital games may increase anxiety and depression (Akin and Iskender, 2011; Weinstein and Lejoyeux, 2010). The progression from initial reinforced learning to habit, and then to compulsive use with prolonged engagement, is thought to be a route leading to substance disorder (Everitt and Robbins, 2005). Theoretically, the excessive use of digital apps like TikTok may exert influence on learning systems and memory circuits, which will progressively transform recreational use into a habit, then a compulsion, in vulnerable individuals (Hyman et al., 2006). However, the initial reinforcement and corresponding neural activation elicited by recommended content have not been fully examined.
The core question we aimed to address in this work is why people easily indulge in watching personally recommended short videos, and we chose TikTok as a typical example in our experiment. We approached this question in two steps. Because no study has reported whether TikTok may cause addiction-like behaviors, we first carried out a survey study to address this question. Following previous work characterizing the severity of problematic YouTube use (de Bérail et al., 2019), the severity of problematic TikTok use was characterized in the present study with a questionnaire adapted from the Internet Addiction Test (IAT; Young, 1998). In addition, previous studies have provided some evidence that dysfunction of self-control plays a role in developing Internet-related addiction (Błachnio and Przepiorka, 2016; Khang et al., 2013; Mehroof and Griffiths, 2010; Montag et al., 2010). Self-control is described as the ability to inhibit inner urges, refrain from the desired behavior, and resist external temptation (Muraven and Baumeister, 2000). A lack of self-control might bring about personal problems (Tangney et al., 2004). Based upon these studies, we predicted that the severity of problematic TikTok use would inversely correlate with self-control ability.
As mentioned above, the recommender system plays an important role in continuously engaging an individual's attention in video viewing. We therefore sought to examine brain activity in response to video watching by comparing the Blood Oxygen Level Dependent (BOLD) signal changes associated with two types of videos: personalized videos (PV) and generalized videos (GV). The PV are customized by the TikTok recommender system for experienced users. In contrast, the GV are recommended for a new user and hypothetically lack user-specific preference. We were interested in which brain regions were activated specifically by PV and how they interacted differently with other brain regions when viewing PV, in contrast to GV. Technically, we used general linear modelling (GLM) to examine brain activation/deactivation, then determined regions of interest (ROIs) and applied a psychophysiological interaction (PPI) analysis to measure context-dependent connectivity of these ROIs. One of the features that distinguishes these two types of videos is that the content of PV is closely linked to users' experience, while GV have no particular relationship with a given user. Therefore, PV may evoke self-referential processes and autobiographical memories associated with the activation of the default mode network (DMN) (Buckner et al., 2008; Molnar-Szakacs and Uddin, 2013; Rameson et al., 2010; Spreng and Grady, 2010). The DMN was initially identified as a set of brain regions showing reduced activity when participants were performing externally oriented cognitive tasks. However, many studies have also suggested that the DMN can be activated during a wide range of cognitive tasks, including monitoring of the external environment, self-reference, social cognition and autobiographical memory (Sestieri et al., 2011; Spreng et al., 2009).
Considering the vital role of DMN in self-relevant information processing and the inherent self-relevant feature of PV, we generally hypothesized that the DMN would show more activation when participants were viewing PV compared to that of GV. Besides, considering the pivotal role of the ventral tegmental area (VTA), substantia nigra (SN), and nucleus accumbens (NAc) in addiction and reward learning ( Ikemoto, 2007 ;Kelley and Berridge, 2002 ;Nestler, 2005 ), we also explored brain activation in these regions.

Survey study
In the survey study, 208 young adults completed a 20-item Problematic TikTok Use scale (see details in 2.1.1 below) and a 19-item Brief Self-Control Scale (Tan and Guo, 2008). Their demographic and TikTok use information was also collected, which included participants' sex, age, role in TikTok (video viewer or video creator), TikTok use history, and average time spent on TikTok per day.

Problematic TikTok use
There have been quite a few studies on YouTube addiction, but few on "Problematic TikTok Use " (or Problematic Douyin Use). The scale currently used to measure YouTube addiction is an adapted version of the Internet Addiction Test (IAT) ( Young, 1998 ). Similarly, here we used the Chinese version of Young's 20-item IAT and substituted "the Internet " with "Douyin ", the official name of TikTok in Chinese. The 20 items have a Likert-type scale ranging from 1 (rarely) to 5 (always), and a higher score indicates that the problem caused by the use of TikTok is more severe. The total scores yielding from the questionnaire were referred to as PTU scores. This questionnaire concerns undesired behaviors including salience, excessive use, lack of control, neglect of work, neglect of social life, and anticipation ( Widyanto and McMurran, 2004 ).

Brief self-control scale
The Chinese version of Tangney's self-control scale (SCS) ( Tangney et al., 2004 ) revised by Tan and Guo (2008) was used in the present study. Previous studies have evaluated the psychometric properties of Brief Self-Control Scale (BSCS) in China and suggested that BSCS is valid and reliable to measure self-control ( Unger et al., 2016 ). Based on the 13-item BSCS, this revision selected, deleted, and added some items from the full version of SCS and the final version consists of 19 items with five dimensions: Impulse control (6), work performance (3), healthy habits (3), entertainment restraint (3) and temptation resistance (4). Fifteen items need reverse scoring, and a higher score means higher self-control ability.
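The reverse-scoring step described above can be illustrated with a minimal sketch. It assumes a 1 to 5 response range for the self-control items (the range is not stated here), and the set of reverse-scored item indices is hypothetical, since the text gives only their number (fifteen), not which items they are.

```python
def score_scale(responses, reverse_items, low=1, high=5):
    """Sum Likert responses after reverse-scoring selected items.

    responses     -- list of integer answers, one per item
    reverse_items -- set of 0-based item indices to reverse (hypothetical)
    low, high     -- assumed bounds of the response scale
    """
    total = 0
    for idx, answer in enumerate(responses):
        if not low <= answer <= high:
            raise ValueError(f"item {idx}: answer {answer} outside {low}..{high}")
        # Reverse scoring maps low -> high and high -> low.
        total += (low + high - answer) if idx in reverse_items else answer
    return total


# Illustrative call: 19 items, the first 15 treated as reverse-scored.
example_total = score_scale([1] * 19, reverse_items=set(range(15)))
```

With this convention a higher total always means higher self-control, regardless of how an individual item is worded.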

Participants
Thirty healthy students from Zhejiang University participated in the fMRI experiment (14 females; age range 19-30, Mean = 23.73, SD = 2.38). Written informed consent was obtained from each participant before the experiment. This study was approved by the Ethic Committee of Zhejiang University. All the fMRI participants were TikTok users. They were also required to complete the above questionnaires. Seventy percent (20) of them had used TikTok for at least one year and 46.7% (14) reported that they spent more than one hour watching short videos with this app every day. Based on the IAT criteria, seven of them were self-disciplined users (scores < 39), and the remaining twenty-three showed mild or moderate problematic TikTok use (39 < scores < 69). None of them was found to have a severe TikTok use problem. After fMRI scanning, twenty-eight participants completed the preference assessment scales and the other two participants were excluded from the preference statistics due to their incomplete data. Twenty-six participants (93%) reported that they generally preferred personalized videos to generalized videos.

Stimuli
There were two types of videos in this experiment: generalized recommended videos for new users (GV) and personalized recommended videos for experienced users (PV). All short videos were recorded from TikTok using a smartphone (Device Model: MI9). The GV refer to videos randomly recommended by the system according to public preference when TikTok is first downloaded and a new user logs in. We recorded in advance a 6-min GV that included 29 short music videos ranging from 5 s to 21 s. When subjects arrived at the experiment site, we obtained their consent to log into their TikTok accounts, and recorded their personalized recommended videos (i.e. PV), also for 6 min. Between every two video clips, a white fixation cross on a black background was presented for 30 s. None of the videos presented in the experiment were shown to the participants before they started the fMRI experiment in the scanner.

Experimental design
The experiment adopted a block design (Fig. 1) including three conditions: PV, GV, and interval/break. There were 6 blocks in the PV condition and 6 blocks in the GV condition. After a 15-s instruction, participants were required to watch short video clips, including 6 personalized video (PV) blocks and 6 generalized video (GV) blocks. Each block lasted for 1 min and consisted of 1 to 6 short videos. A 30-s image of a black screen with a centered cross was presented after each block as a fixation rest block. Half of the participants watched GV followed by PV and the other half did the reverse, to counterbalance the effect of order. All the stimuli were presented with E-Prime 3.0 (Psychology Software Tools, Pittsburgh, PA; https://www.pstnet.com), and participants could watch them in an angled mirror and hear the soundtrack of the videos through headphones. In order to control the field of eye movements and match the vertical display mode of a smartphone, all video stimuli filled approximately a quarter of the screen. Participants were instructed to relax when watching these videos. After scanning, each participant completed the questionnaires mentioned above. Then they were interviewed to evaluate their preference for each video by rating from 1 (extremely unlike; low eagerness) to 3 (extremely like; high eagerness) on the two questions below. To what extent do you like this video? To what extent do you want to watch videos of the same category as this one? Participants were also asked which type of video (PV vs GV) they preferred in general.

Image preprocessing
Preprocessing of fMRI data included the following steps. First, slice time correction and head motion correction were performed using AFNI ( Cox, 1996 ). Then, tissue segmentation was conducted to extract brains using SPM12 ( https://www.fil.ion.ucl.ac.uk/spm/ ). Structural and functional images were normalized to the MNI space using ANTs ( http://stnava.github.io/ANTs/ ). Finally, spatial smoothing was conducted with a 5 mm full-width-at-half-maximum Gaussian kernel.

First level fMRI modelling
To characterize task-induced brain activation, general linear modelling (GLM) was conducted using the 3dDeconvolve command in AFNI.
The blocks for PV and GV were convolved with hemodynamic response function to create 2 regressors to assess brain activity elicited by the two conditions, respectively. In addition, four event regressors were created to capture the transient response to the start and end of each block separately for the two video types. The effects of head motion were regressed out using the six head motion parameters. To further mitigate the impact of head motion, the censor option in 3dDeconvolve was used to exclude any volumes with framewise displacement (FD) greater than 0.5 mm.
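The censoring rule above can be sketched as follows. The text does not state how FD was computed, so this sketch assumes one commonly used definition: the sum of absolute volume-to-volume changes in the six motion parameters, with rotations converted to millimetres on a sphere of 50 mm radius. The column order, the rotation units (degrees), and the radius are all assumptions made for illustration.

```python
import math


def framewise_displacement(motion, radius_mm=50.0):
    """Return one FD value per volume.

    motion -- list of [tx, ty, tz, rx, ry, rz] rows, one per volume;
              translations in mm, rotations in degrees (assumed layout).
    """
    fd = [0.0]  # the first volume has no predecessor
    for prev, cur in zip(motion, motion[1:]):
        trans = sum(abs(c - p) for p, c in zip(prev[:3], cur[:3]))
        # Arc length on a sphere: angle in radians times the radius.
        rot = sum(abs(math.radians(c - p)) * radius_mm
                  for p, c in zip(prev[3:], cur[3:]))
        fd.append(trans + rot)
    return fd


def censor_vector(fd, threshold_mm=0.5):
    """1 keeps a volume, 0 excludes it (FD above the threshold)."""
    return [0 if value > threshold_mm else 1 for value in fd]


# Illustrative motion trace with one large jump at the third volume.
demo_motion = [[0, 0, 0, 0, 0, 0],
               [0.1, 0, 0, 0, 0, 0],
               [0.1, 0.6, 0, 0, 0, 0]]
demo_censor = censor_vector(framewise_displacement(demo_motion))
```

A 0/1 vector of this kind, written one value per line to a text file, is the general form of input a censor file takes; the exact file used in the study is not described here.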
Following the GLM analysis on brain activation, voxel-wise psychophysiological interaction (PPI) analyses were implemented with the generalized PPI toolbox in SPM12. PPI provides information about task-related connectivity changes between a seed region and other brain regions (Friston et al., 1997). The first eigenvariate of each seed region's BOLD signal was extracted as the physiological regressor. The two block regressors were multiplied by the contrast [1 −1] to create the psychological regressor (PV > GV). Then the interaction of the physiological regressor (seed region signal) and the psychological regressor was computed as the PPI regressor. A GLM model, including these three regressors, four transient response regressors, and six motion parameters as confounds, was run for each seed and each subject to produce the first-level PPI results.
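The construction of the three PPI regressors can be sketched in simplified form. This sketch works directly on the sampled time series: it mean-centers the seed signal (an assumption, not stated in the text) and multiplies it point by point with the PV minus GV regressor. It omits the deconvolution and reconvolution steps that PPI toolboxes typically apply, so it illustrates the logic rather than reproducing the toolbox.

```python
def ppi_regressors(seed, pv, gv):
    """Build physiological, psychological and interaction regressors.

    seed -- seed-region time series (e.g. a first eigenvariate)
    pv   -- PV block regressor, same length as seed
    gv   -- GV block regressor, same length as seed
    """
    if not (len(seed) == len(pv) == len(gv)):
        raise ValueError("all time series must have the same length")
    mean_seed = sum(seed) / len(seed)
    phys = [s - mean_seed for s in seed]            # centered seed signal
    psych = [p - g for p, g in zip(pv, gv)]         # contrast [1 -1]: PV > GV
    ppi = [a * b for a, b in zip(phys, psych)]      # interaction term
    return phys, psych, ppi


# Toy example: two PV volumes followed by two GV volumes.
phys, psych, ppi = ppi_regressors([1, 2, 3, 2], [1, 1, 0, 0], [0, 0, 1, 1])
```

A positive weight on the interaction term then indicates that the seed's coupling with a voxel is stronger during PV than during GV.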

Group level fMRI statistical analysis
To characterize brain activation responding to the two types of videos, one sample T -tests on the beta maps of the block regressors were conducted at the group level with gender and age as covariates.
To further identify brain activation modulated by PV, a whole-brain voxel-wise paired T -test was used to compare the activation (beta maps) of two video-watching conditions. Next, one-sample T -tests on the PPI beta maps were used to identify brain regions showing connectivity difference between the two conditions. Finally, an ROI-based complementary analysis was conducted for VTA, NAc and SN because they are small structures, and our cluster size based multiple comparisons correction would treat small clusters as artifacts in whole-brain voxelwise analysis. The VTA and the NAc ROIs were drawn according to D'Ardenne et al. (2008) and the Atlas of the Human Brain ( Mai et al., 2015 ). The SN ROI was made based on the work by Pauli et al. (2018) .
The beta values of both PV and GV blocks were extracted from these three ROIs ( Table 3 ). Then, one-sample T -tests were conducted separately for each ROI to assess whether the PV/GV induced brain activation was significantly different from 0. Paired T -tests were used to assess whether the PV condition induced higher activation than the GV condition did.
All voxel-wise statistical results were corrected for multiple comparisons at a cluster-level corrected P < 0.05, with a voxel-level P < 0.001 and a minimum cluster size of 39 voxels, based on simulations implemented using 3dttest++ with the -Clustsim option in AFNI (Cox, 1996). The Bonferroni method was applied for ROI-based multiple-test correction.
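For the ROI-based tests, the Bonferroni rule amounts to dividing the significance level by the number of tests. The sketch below shows the arithmetic only; the p-values are invented for illustration, and the number of tests that entered the correction in the study is not stated here.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and significance flags."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]      # capped at 1
    significant = [p < alpha / m for p in p_values]     # per-test threshold
    return adjusted, significant


# Hypothetical p-values for three ROI tests.
adjusted, significant = bonferroni([0.01, 0.02, 0.5])
```

Here the per-test threshold is 0.05 / 3, so only the first hypothetical p-value survives correction.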

Descriptive statistics of the survey sample
Among the 208 participants in the survey study, 55 participants reported that they had never used TikTok, and 153 (73.6%) participants (91 females) reported to use TikTok, and 55% of them had used TikTok for more than one year. Among those who reported using TikTok, most participants (67%) used TikTok less than 1 h per day and 30% more than 1 h but less than 2 h per day. Only a small portion (3%) used the App more than 2 h per day. In the following statistical analyses, only these 153 participants with TikTok use were included, resulting in a sample with ages ranging from 17 to 31 years (Mean = 22.8, SD = 3.14).

Characterization of PTU and its relationship with SCS
The Cronbach's alpha values of the PTU and SCS questionnaires were 0.936 and 0.896, respectively. Based on the criteria for characterizing the severity of addiction-like symptoms with IAT scores (Young, 1998), 74 (48.3%) of the participants who scored below 39 were considered self-restrained and self-disciplined TikTok users, 70 (45.8%) of the participants who scored between 40 and 69 were considered to have mild TikTok use problems, and 9 (5.9%) participants who scored above 69 may have significant problems related to excessive TikTok use. To explore factors associated with individual differences in PTU, correlations between PTU scores and age, time of TikTok usage per day, as well as SCS scores were examined. Not surprisingly, the PTU score was found to have a positive correlation with the amount of daily time spent on TikTok (r = 0.474, p < 0.001). In addition, PTU was negatively and significantly correlated with SCS (r = −0.279, p < 0.001), which supports our hypothesis that more severe TikTok addiction-like symptoms are related to lower self-control. No significant correlation between PTU and age was found (r = 0.04, p = 0.626), and a marginally significant difference in PTU was seen between female and male participants (t = 1.92, p = 0.057).
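The reliability coefficient and the severity grouping reported above can be sketched as follows. Cronbach's alpha is computed with the standard formula, alpha = k / (k − 1) × (1 − sum of item variances / variance of total scores). The cut-off handling is an assumption: the text leaves a score of exactly 39 unassigned, so this sketch places it in the lowest group, and the group labels are illustrative.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: list of respondents, each a list of k item answers."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))                   # one tuple per item
    item_var = sum(variance(col) for col in columns)    # sample variances
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var / total_var)


def classify_ptu(score):
    """Group a PTU total score (boundary at 39 is an assumption)."""
    if score <= 39:
        return "self-disciplined"
    if score <= 69:
        return "mild"
    return "significant"


# Toy data: three respondents answering two items.
demo_alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

In the toy data the two items each have variance 1 and the totals have variance 3, so alpha is 2 × (1 − 2/3).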

Brain activation in response to TikTok videos
The brain regions activated by watching short videos are widely distributed (Fig. 2). Both conditions (PV and GV) elicited extensive activations in bilateral primary and secondary auditory (BA41,42) and visual cortices (BA17,18), fusiform gyrus (BA37), parahippocampal gyrus, superior and inferior temporal cortices, inferior prefrontal cortex (BA47, BA8), premotor cortex (BA6), precuneus (BA7), thalamus, and cerebellum. In contrast, deactivations were found in the dorsal and ventral anterior cingulate cortex (dACC, vACC), dorsal posterior cingulate cortex (PCC), precuneus, inferior parietal lobule (IPL) (BA39,40), orbitofrontal cortex (BA11), dorsolateral prefrontal cortex (BA9), and caudate. As shown in Fig. 2c, a paired T-test revealed higher activation induced by TikTok-recommended videos (i.e. the PV condition) in bilateral superior and middle temporal gyri, temporal pole (TP), ventral PCC, medial prefrontal cortex (MPFC), and angular gyrus, which are collectively considered parts of the default mode network. In addition, left dorsal lateral and inferior frontal regions, anterior thalamus and cerebellum also showed higher activation. Examining the polarity of activation shows that the difference was driven by less deactivation under the PV condition in the angular gyrus, dorsal PCC, superior frontal and lateral inferior prefrontal regions. In particular, the PCC showed a complex activation pattern under the PV condition: its most ventral part showed positive activation, whereas its dorsal part showed deactivation. In contrast, the supramarginal gyrus, postcentral gyrus, and right IPL showed lower activation in the PV condition compared to the GV condition. Brain regions showing differences in activation are also summarized in Table 1.

Psychophysiological interaction analyses
In light of the GLM results showing the engagement of subregions of the DMN (Fig. 2), we chose three clusters that showed significantly different activation in the contrast (PV vs GV) as seed regions, namely, MPFC, PCC, and TP (Fig. 3). However, a standard anatomical definition of the DMN is still lacking (Buckner and DiNicola, 2019). To ensure that the PPI seeds were located within the DMN, the seed masks were further constrained in the following way. We searched with the keyword "default network" in Neurosynth (https://neurosynth.org/), yielding an association test map (FDR corrected p < 0.01, Supplementary Fig. S1) based on automatic online meta-analyses (for more details, please see Yarkoni et al., 2011). This map was used as a DMN mask and was multiplied with the MPFC, PCC and TP clusters, producing the final PPI seeds (Fig. 3).
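Multiplying a binary cluster mask by a binary DMN mask keeps only the voxels that belong to both. A minimal sketch of this step is given below, with flat lists of 0/1 values standing in for 3D volumes on a common grid; the function name and the toy data are illustrative and are not taken from the analysis pipeline.

```python
def constrain_seed(cluster_mask, dmn_mask):
    """Voxel-wise product of two binary masks defined on the same grid."""
    if len(cluster_mask) != len(dmn_mask):
        raise ValueError("masks must be defined on the same grid")
    # A voxel survives only if it is set in both masks.
    return [int(bool(c) and bool(d)) for c, d in zip(cluster_mask, dmn_mask)]


# Toy example: only the first voxel lies in both the cluster and the DMN map.
seed_mask = constrain_seed([1, 1, 0, 0], [1, 0, 1, 0])
```

The resulting seed can therefore never extend beyond the activation cluster or beyond the meta-analytic DMN map.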
The PCC seed showed increased coupling with primary visual and auditory cortices, subcortical regions, cerebellum, inferior frontal, and the frontoparietal network in response to personalized videos. A decreased coupling with PCC was found in precuneus (BA7), IPL (BA40), and middle cingulate cortex (BA23,24) ( Fig. 3 a).
In short, PPI analyses showed increased connectivity between the three DMN seeds and a distributed set of brain regions including visual network (BA17,18,19), primary auditory cortex (BA41), middle frontal gyrus when watching personalized videos compared to generalized videos. In contrast, cingulate cortex, cuneus and IPL showed decreased connectivity with three DMN seed regions when watching personalized videos ( Table 2 ). To evaluate the reliability of these results, we adopted three DMN coordinates ( Andrews-Hanna et al., 2010 ) and created the PPI seeds with 8-mm spheres for MPFC, PCC and TP. The PPI results were quite similar ( Supplementary Fig. S2 ).

Activation in the reward system
VTA showed significant positive activation in response to the PV, but not to the GV, and the difference between the two conditions was statistically significant (Fig. 4 b). In contrast, neither the activation under PV and GV condition, nor the difference between the two conditions was statistically significant in the NAc region with multiple comparisons correction. Besides, SN exhibited significant activation under both PV and GV conditions, but their difference was not significant. The statistical results were summarized in Table 3.
Fig. 2 (c) The difference in activation between PV and GV condition. The MPFC, PCC, and TP in DMN were more activated under PV condition than GV condition (red: PV-GV > 0).

Discussion
Our survey data showed that the use of TikTok may cause significant problems in about 5.9% of users. The severity of problematic TikTok use was inversely related to self-control in young adults. Using fMRI, we found that the dMPFC subsystem of the DMN and the VTA were more active under the PV condition than under the GV condition. The PPI results revealed that three DMN nodes (MPFC, PCC, and TP) showed enhanced coupling with primary visual and auditory areas and decreased coupling with precuneus and cingulate cortex for PV-GV. In light of prior studies discussed below, these findings may suggest that (1) the DMN activation and its enhanced coupling to visual and auditory pathways may contribute to problematic TikTok use through modulations of attention and high-level perception, (2) the regions with decreased DMN coupling, precuneus and cingulate cortex, may be involved in self-control, and thus decreased coupling with these areas may lead to loss-of-control use, and (3) the higher activation of the VTA might be associated with a higher level of value-based representations for personalized videos.
The negative relationship between problematic TikTok use and self-control indicates that the problematic behavior associated with TikTok use is linked to a lack of self-control, which is consistent with findings in other behavioral addictions. For example, previous studies have found that self-control has a negative relation to Internet addiction (Özdemir et al., 2014; Shirinkam et al., 2016), online gaming addiction (Mehroof and Griffiths, 2010), smartphone addiction (Han et al., 2017), and social media addiction (Błachnio and Przepiorka, 2016). Beyond research on addiction-like behaviors, low self-control is also closely linked to other traits such as high anxiety (Gailliot et al., 2006), loneliness (Hamama et al., 2000), and impulsivity (Denson et al., 2011). Bertrams et al. (2013) suggested that the ability of self-control is essential for shifting attention from anxiety-related worries to other stimuli. Similarly, we speculate that individuals with lower self-control ability have more difficulty shifting attention away from favorite video stimulation. Furthermore, it is possible that individuals with low self-control are susceptible to worrisome thoughts evoked by anxiety, and that such an unpleasant feeling drives them to devote themselves to external stimuli in pursuit of relief. The causal relationship between problematic TikTok use and self-control, however, warrants further study with a longitudinal design.
Abbreviations: VTA, ventral tegmental area; NAc, nucleus accumbens; SN, substantia nigra; PV, personalized videos; GV, generalized videos.
The difference in regional brain activation between the two types of short videos (PV and GV) was examined in order to reveal the potential neural basis of why people engage so much in viewing short videos. Personally recommended videos are regarded as user-specific, while generally recommended videos are non-user-specific.
Unlike images or other static stimuli, videos contain ample information, including colors, figures, objects, music, voice, spatial locations, movements, and so on. In the fMRI experiment, both the ventral (occipitotemporal) and dorsal (occipitoparietal) visual pathways, as well as the auditory pathway, were extensively activated regardless of video type. It is well known that the ventral visual pathway is involved in object identification and representation, whereas the dorsal visual pathway is involved in processing object location and visually guided action (Amedi et al., 2001; Goodale and Milner, 1992; Pietrini et al., 2004; Shmuelof and Zohary, 2005). Given that short videos contain rich dynamic visual and auditory stimulation, this result suggests that participants were consistently engaged in the video presentation. It is worth noting that activity in the primary visual and auditory cortices did not differ significantly between the two conditions, suggesting that low-level features do not drive the preference between the two types of videos. Instead, it is high-level processing that differentiates them. As we discuss below, high-level processing centralized in the DMN may primarily contribute to the perception of video preference.
Short video watching is a dynamic process that might involve complex and multiple self-referential processes, such as the recall of past experience, future-oriented thinking, and the flow of the present moment. However, one would focus more on the present stimuli and have fewer future- or past-oriented thoughts once he/she was engaged in video watching; this is the so-called "immersion". This assumption was supported by our current findings in the DMN activation pattern. Compared with viewing GV, the DMN showed a functional dissociation when viewing personalized videos: the dorsal MPFC and ventral PCC, together with bilateral temporal gyri, showed higher activation, whereas the ventral MPFC (vMPFC) and hippocampus showed no difference. An extensive body of literature has supported that the DMN can be divided into several subcomponents (Buckner et al., 2008; Campbell et al., 2013; Damoiseaux et al., 2008; Laird et al., 2009), with distinct components contributing differently to a wide range of cognitive tasks (Bellana et al., 2017). For example, Andrews-Hanna et al. (2010) revealed that the DMN is a heterogeneous system comprising two subsystems. One subsystem is the "dMPFC subsystem", including the dorsomedial prefrontal cortex (dMPFC), temporoparietal junction and temporal pole. The other is the "medial temporal lobe (MTL) subsystem", which is composed of the vMPFC, posterior IPL, parahippocampal cortex (PHC), and hippocampal formation (HF+). In addition to these two subsystems, the PCC and anterior medial prefrontal cortex form a midline core. While the MTL subsystem was selectively activated by future thinking and mnemonic scene construction, the dMPFC subsystem showed preferential activation when participants considered their present situation and mental state (Andrews-Hanna et al., 2010). This point gets further support from other relevant studies (Bellana et al., 2017; Xu et al., 2016).
Based upon these findings, higher activation in the dMPFC subsystem in our study is likely to indicate that participants focused more on the present stimuli, whereas the lack of MTL activation may indicate the absence of future plans when they were exposed to the personalized videos. Such a functional dissociation is a potential cause that partially contributes to indulgent video watching behavior.
The deactivation of the DMN has been interpreted as the suppression of ongoing internally focused thoughts, or task-irrelevant spontaneous activity (Fransson, 2006; McKiernan et al., 2006). Consistent with this initial conjecture, later studies with internally oriented tasks (for example self-referential, moral decision, and mentalizing tasks) found increased activation of the DMN (Reniers et al., 2012). Although the overall activation patterns in these tasks were similar, distinct features were found (Buckner et al., 2008), suggesting the existence of DMN functional heterogeneity. For example, subregions of the prefrontal cortex (mainly dMPFC and vMPFC) are functionally dissociated in many types of tasks, including self-knowledge (Mitchell et al., 2006; Ochsner et al., 2005), decision making (Kahnt et al., 2011), social cognition and moral judgment (Forbes and Grafman, 2010), and episodic memory (Cabeza and Nyberg, 2000). Combined with these existing findings regarding the functional heterogeneity of DMN subregions (Amodio and Frith, 2006; Andrews-Hanna et al., 2010; Laird et al., 2009; Seghier and Price, 2012), our results suggest that the DMN may play a pivotal role in self-referential processing, allocation of attention, and social cognition during the high-level processing of video perception. We elaborate on these three DMN functions below.
The content-based recommendation system may elicit more self-referential processing, which is closely related to DMN activity. This is supported by accumulating evidence for the engagement of the DMN in self-referential processing, including autobiographical memory (Andreasen et al., 1995; Fox et al., 2015; Philippi et al., 2015; Svoboda et al., 2006; Whitfield-Gabrieli et al., 2011). Compared with non-self-related stimuli, self-related stimuli evoke more neural activity in the MPFC and PCC (van Buuren et al., 2010). Regardless of algorithmic details, a recommender system that does better than random selection must be able to provide new video content that relates to a specific user, and this content should not only capture the user's attention but also evoke a process that sustains it. The processing of user-relevant video content discovered by the algorithm would therefore include self-referential processes in terms of preference, interest, and familiarity. Further, such user-relevant information may trigger memories of previous positive viewing experiences, which, in turn, can modulate the allocation of attention for processing the present video stimuli. As elaborated below, this conjecture is supported by the PPI results showing enhanced communication between DMN regions and other brain areas, including primary visual and auditory cortices and the frontoparietal network.
A higher level of activation in the PCC under the PV condition may indicate a greater shift of attention from an external/broad focus to an internal/narrow focus on video contents. Although the PCC, as a key hub of the DMN, was initially considered a neural substrate underlying internally oriented thoughts (Buckner et al., 2008; Mason et al., 2007), a newer perspective holds that the PCC also functions as a central area for maintaining the balance between internally and externally oriented attention (Leech and Sharp, 2014). Accumulating evidence indicates that the ventral part of the PCC is more related to self-referential processing, whereas the dorsal part is involved in allocating attention to external stimuli (Leech et al., 2011). According to the theoretical model proposed by Leech and Sharp (2014), the level of arousal is associated with the activity of the whole PCC region, whereas the balance of internal/external attention and the breadth of attentional focus are linked to changes in brain activity and functional connectivity within the PCC. Considering that the PV condition is more user-specific and history-relevant, it is possible that the personalized videos modulate one's attention by shifting it from an external/broad focus to an internal/narrow focus for deeper processing of video contents. The PCC, which was more active under the PV condition than the GV condition (Fig. 2c), may serve as a key neural substrate of this attentional modulation process. Enhanced functional connectivity between the PCC and other brain regions revealed by the PPI analysis (Fig. 3a) further supports this argument. Under the PV condition, the PCC showed greater connectivity with the middle and superior frontal gyri, both of which have been implicated in working memory (Leung et al., 2002; McCarthy et al., 1994). Anderson et al. (2006) found bilateral activation in the PCC when participants watched normal visual action sequences compared with random sequences, indicating that the PCC also plays an important role in understanding, evaluating, and memorizing comprehensible or meaningful visual stimuli. Taken together, we conjecture that the increased PCC activation and its connectivity with the frontoparietal network are associated with an inward shift and narrowing of attention for processing and integrating the video-related information accessible in working memory, which might yield deeper, user-specific comprehension.
As a social media platform, TikTok inherently possesses social and emotional attributes. Higher activation of the dMPFC and temporal regions (Fig. 2c) may relate to deeper appraisal of recommended video contents. Different parts of the MPFC show functional heterogeneity. The ventral part is mainly associated with reward processing, negative emotion regulation, and affective processes such as pain and compassion (Hiser and Koenigs, 2018). In contrast, the dMPFC is implicated in social cognition, such as theory of mind, reasoning, and judgment (Jahn et al., 2016; Mitchell, 2008; Mitchell et al., 2005; Wagner et al., 2012). Using reverse correlation analysis (Hasson et al., 2004), Wagner et al. (2016) found that the dMPFC response showed a strong category preference for scenes depicting social interaction during natural movie viewing. Therefore, the higher activation of the dMPFC in the present study is likely involved in processing video contents at a relatively high and abstract level. In contrast, the temporal lobe, which has been considered a "semantic hub" of the brain (Patterson et al., 2007), may process video contents at a relatively low level, such as multisensory information integration (Olson et al., 2007) and recognition of familiar objects, including faces and scenes (Nakamura et al., 2000). In addition, a growing number of studies suggest that the temporal pole (TP) is also associated with social cognition and emotional processing (Olson et al., 2007; Vogeley et al., 2001; Völlm et al., 2006; Wong and Gallate, 2012). For example, a study that used empathy-evoking film clips combined with music and text to investigate the role of the TP in social cognition showed that the TP acts as a hub to integrate visual, auditory, and contextual information.
Furthermore, the dynamic causal modeling (DCM) and Bayesian Model Selection (BMS) results revealed increased effective connectivity from the TP to the fusiform gyrus when contextual information was provided, indicating a top-down modulation of the ventral visual stream (Pehrs et al., 2017). Such a top-down signal carries rich information that helps us interpret the visual scene (Gilbert and Li, 2013). In light of these findings, we speculate that the TP is involved in both bottom-up and top-down processing: on the one hand, the TP exerts top-down modulation on the ventral visual pathway; on the other hand, it relays information to higher-level cognitive areas (the MPFC, for example) to achieve a deeper understanding and evaluation of video contents.
In addition to higher activation in the PCC, dMPFC, and temporal lobe, we also observed lower activation in DMN subregions (e.g., vMPFC), dACC, caudate, and part of the thalamus under the PV condition (Fig. 2). The suppression of DMN subregions seems to be a mechanism through which the brain moderates some internal activity so that externally oriented cognitive function can be optimized (Anticevic et al., 2012). In contrast, the ACC is typically associated with monitoring action and signaling when errors or conflicts are encountered (Bush et al., 2000; MacDonald et al., 2000). Reduced activation in the ACC, caudate, and thalamus was found during various inhibition tasks (Hart et al., 2013), suggesting that these regions are involved in attention and inhibitory control. Furthermore, the PPI analysis indicated reduced coupling between three DMN nodes and the ACC and precuneus under the PV condition compared with the GV condition. Such decreased DMN coupling with these regions may contribute to excessive video-watching behavior, as these regions are important for cognitive control (Carter and van Veen, 2007; MacDonald et al., 2000) and conscious awareness (Cavanna and Trimble, 2006; Vogt and Laureys, 2005). Taken together, it is plausible that the reduced activation of these regions collectively plays a role in sustaining attention when watching videos and contributes to deep engagement and immersion.
Outside of the DMN, the cerebellum (VI, Crus I, and Crus II; Fig. 2c) showed greater activation when participants watched personalized short videos compared with generalized short videos. Although it is well known that the cerebellum plays an important role in movement control, converging evidence supports its involvement in cognitive functions such as working memory, attention, language, and emotional processing (Baillieux et al., 2008; Stoodley, 2012; Stoodley and Schmahmann, 2009). Recently, researchers have suggested that the cerebellum has a direct projection to the VTA and plays a role in addiction and reward processing (Carta et al., 2019). The role of the cerebellum in addiction-like behaviors (including uncontrollable video watching) warrants further study.
Lastly, our complementary analysis revealed a selective activation of the VTA in response to personalized videos. Another area containing dopamine neurons, the SN, displayed significant activation when participants watched short videos regardless of type. Numerous studies have revealed the pivotal role of the VTA and SN in reinforcement learning and motivated behavior (Redgrave and Gurney, 2006; Wise, 2004). Matsumoto and Hikosaka (2009) found that dopamine neurons in the VTA and SN play different roles in conveying motivational signals, with saliency-coding neurons located preferentially in the SN and reward value-coding neurons in the VTA. Given that both PV and GV contained dynamic visual and auditory stimuli, the strong activation of the SN may reflect the neural response to the saliency of video content. The selective activation of the VTA in response to personalized videos might indicate a difference in value-based representations between personalized and generalized videos. In addition, the VTA and SN dopamine neurons have different afferent inputs, which might partly account for their different responses to the video stimuli. Interestingly, the NAc, a projection target of the VTA, did not show significant activation under either condition. In contrast to the mesolimbic dopaminergic pathway, which projects from the VTA to the NAc, there also exists a mesocortical dopaminergic system, which originates in the VTA and extends to the dorsal and ventral PFC, cingulate, and perirhinal cortex (Arias-Carrión et al., 2010). Collectively, the differences between the excitatory inputs into the VTA-NAc and VTA-MPFC pathways in behavioral addiction are an interesting topic for future studies.
This study has several limitations. First, we recorded two 6-min videos as experimental materials. Although we strictly controlled the duration of the two types of videos and tried to exclude other confounding factors, it remains possible that not all of the recommended videos were the user's favorites. It is challenging to design an fMRI experiment that closely resembles real-world circumstances. Second, the participants were mostly college students aged 19 to 30. This relatively narrow age range makes the findings hard to generalize to less-educated adults and younger students. Children and older adults should be considered in future studies. Third, we examined only the acute effect of short videos in the present study. In general, the development of addiction-like behavior involves a transitional process from accidental exposure to habitual and uncontrolled use. More studies comparing brain activation maps between severely problematic TikTok users and self-disciplined users are warranted.

Conclusion
To the best of our knowledge, this is the first study to explore the neural activity associated with watching short videos from a neuroimaging perspective. Compared with non-personalized videos, the recommended user-specific videos not only activate the dMPFC, PCC, and bilateral parietal and temporal regions that collectively compose the DMN, but also enhance the functional coupling between the DMN and primary visual and auditory regions as well as the frontoparietal network. In addition, both types of videos activate the SN, but only personalized videos activate the VTA. These results suggest that the recommender algorithm is able to discover content that up-regulates the activity of a set of DMN subregions and the VTA to reinforce video-watching behavior. The DMN subregions might serve as the neural underpinnings of the high-level processing of personalized video perception. In sum, our data provide a new perspective for understanding the brain activity evoked by the presentation of personalized video contents, which may shed light on the neural mechanisms underlying excessive video use and abuse.

Data and code availability statement
Data are available upon reasonable request to corresponding author Dr. Yuzheng Hu (E-mail: huyuzheng@zju.edu.cn).

Declaration of Competing Interest
The authors declare no competing interests.