Daily Social Isolation Maps Onto Distinctive Features of Anhedonic Behavior: A Combined Ecological and Computational Investigation

Background: Loneliness and social isolation have detrimental consequences for mental health and act as vulnerability factors for the development of depressive symptoms, such as anhedonia. The mitigation strategies used to contain COVID-19, such as social distancing and lockdowns, allowed us to investigate putative associations between daily objective and perceived social isolation and anhedonic-like behavior.
Methods: Reward-related functioning was objectively assessed using the Probabilistic Reward Task. A total of 114 unselected healthy individuals (71% female) underwent both a laboratory and an ecological momentary assessment. Computational modeling was applied to performance on the Probabilistic Reward Task to disentangle reward sensitivity and learning rate.
Results: Findings revealed that objective, but not subjective, daily social interactions were associated with motivational behavior. Specifically, higher social isolation (less time spent with others) was associated with higher responsivity to rewarding stimuli and a reduced influence of a given reward on successive behavioral choices.
Conclusions: Overall, the current results broaden our knowledge of the potential pathways that link (COVID-19–related) social isolation to altered motivational functioning.


S1. Exclusionary criteria
Exclusionary criteria were (a) history or presence of serious medical conditions; (b) self-reported formal diagnosis of psychiatric disorder or problematic substance use; (c) neurological disorders; (d) use of drugs/medications; and (e) pregnancy or breast-feeding. Participants were informed that they could win up to 20 euros for their participation.

S2. Revised UCLA Loneliness scale
Examples of items are "How often do you feel that you lack companionship?" or "How often do you feel outgoing and friendly?", and answers are provided on a Likert scale from 1 = Never to 4 = Often.
The total score ranges from 20 to 80.

S3. Questions regarding the COVID-19 situation
The following questions about individuals' lifestyle, social isolation (objective or perceived), and recent changes in habits due to the COVID-19 outbreak and subsequent restrictions were administered:
1. Do you live alone?

S4. Probabilistic Reward Task
The PRT was administered in the laboratory via E-Prime (version 3.0). As depicted in Figure S1, the task involves 300 trials, divided into 3 blocks separated by 30-sec breaks. In each trial, participants saw a cross for 1000-1400 ms in the center of the screen, then a cartoon face without the mouth for 500 ms. After that, a mouth appeared for 100 ms. The mouth could be short (10.00 mm) or long (11.00 mm). Next, the cartoon face without the mouth appeared again for an additional 1500 ms. The difference between the short and long mouths was small (just 1 mm), making it difficult to distinguish them (22). Participants were asked to decide whether the mouth was short or long by pressing a button ("v" or "m") on the keyboard, with the key assignment counterbalanced across participants.
Importantly, not every accurate answer received a reward. Moreover, and unbeknownst to participants, correct identification of one of the mouth lengths (defined as the "rich stimulus") was rewarded three times more frequently than correct identification of the other ("lean stimulus"). Within each block, the short and long stimuli were presented equally often in a pseudorandomized sequence with the constraint that the same stimulus was not presented more than three times consecutively. In each block, reward feedback ("Correct!! You won 20 cents") was presented after 40 correct responses; it was displayed for 1500 ms following the participant's choice and was followed by a 250-ms blank screen. During each block, correct identifications of the rich stimulus received reward feedback 30 times, while correct identifications of the lean stimulus were followed by positive feedback only 10 times. A controlled reinforcer procedure was used to provide reward feedback according to a pseudorandom schedule that determined which specific trials were to be rewarded for correct choices. If a participant failed to make a correct identification on a trial in which feedback was scheduled, reward feedback was delayed until the next correct response to the same stimulus type (rich or lean). When the reward was not given because the participant was inaccurate (or accurate, but no feedback was scheduled), a blank screen was displayed for 1750 ms. The total task duration was approximately 24 minutes.
Participants were informed that they could potentially win up to €20; however, the actual reward was fixed and not contingent on their performance. Therefore, at the conclusion of the task, all participants received a monetary reward of €20.
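Response Bias (log b) was calculated as:

$$\log b = \frac{1}{2}\log\left(\frac{(\mathrm{Rich}_{\mathrm{correct}} + 0.5)\times(\mathrm{Lean}_{\mathrm{incorrect}} + 0.5)}{(\mathrm{Rich}_{\mathrm{incorrect}} + 0.5)\times(\mathrm{Lean}_{\mathrm{correct}} + 0.5)}\right)$$

In addition, discriminability (log d) was computed as a control measure of participants' ability to discriminate between the two stimuli. Discriminability reflects task difficulty and was calculated as:

$$\log d = \frac{1}{2}\log\left(\frac{(\mathrm{Rich}_{\mathrm{correct}} + 0.5)\times(\mathrm{Lean}_{\mathrm{correct}} + 0.5)}{(\mathrm{Rich}_{\mathrm{incorrect}} + 0.5)\times(\mathrm{Lean}_{\mathrm{incorrect}} + 0.5)}\right)$$

Here, Rich and Lean denote the number of correct or incorrect responses to the rich and lean stimulus, respectively. A value of 0.5 was added to each variable in order to make the calculation of the Response Bias and discriminability possible in cases in which one of the raw cells was equal to 0.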
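As an illustration of the two indices described above, the following minimal sketch computes Response Bias and discriminability from the four response counts of one block. The function name, the argument names, and the use of a base-10 logarithm are assumptions made for this example; the text does not state the base of the logarithm.

```python
import math


def response_bias_and_discriminability(rich_correct, rich_incorrect,
                                       lean_correct, lean_incorrect):
    """Return (log_b, log_d) for one block of the task.

    0.5 is added to every cell so that the ratios remain defined
    when one of the raw counts is 0.
    """
    rc = rich_correct + 0.5
    ri = rich_incorrect + 0.5
    lc = lean_correct + 0.5
    li = lean_incorrect + 0.5
    # Assumption: base-10 logarithm (the base is not given in the text).
    log_b = 0.5 * math.log10((rc * li) / (ri * lc))
    log_d = 0.5 * math.log10((rc * lc) / (ri * li))
    return log_b, log_d
```

With identical accuracy for the two stimuli the Response Bias is 0; it becomes positive when the rich stimulus is identified correctly more often and the lean stimulus is more often mistaken for the rich one.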
Following previous studies, secondary analyses evaluating accuracy and reaction times (RT) were performed to assess overall task performance. Trials with RTs shorter than 150 ms or longer than 1500 ms were first excluded as outliers; then, remaining trials with RTs (following natural log transformation) falling outside the mean ± 3 SD were considered additional outliers and excluded.
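The two-step trial exclusion described above can be sketched as follows. This is an illustrative implementation only: the function name is hypothetical, and it assumes that the mean and SD of the log-transformed RTs are computed over the list of trials passed in (for example, one participant's trials), which the text does not specify.

```python
import math
import statistics


def trim_reaction_times(rts_ms, lower=150.0, upper=1500.0, n_sd=3.0):
    """Return the RTs (in ms) that survive both exclusion steps."""
    # Step 1: drop trials faster than `lower` or slower than `upper`.
    kept = [rt for rt in rts_ms if lower <= rt <= upper]
    if len(kept) < 2:
        return kept
    # Step 2: natural-log transform, then drop trials outside mean +/- n_sd SD.
    logs = [math.log(rt) for rt in kept]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)
    return [rt for rt, lg in zip(kept, logs) if abs(lg - mean) <= n_sd * sd]
```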

S5. Computational Modeling of the PRT
The 'Stimulus-Action' model adopts the standard Rescorla-Wagner premise. The two stimuli were assumed to be totally distinct, and rewards were associated with separate stimulus-action pairs. The 'Action' model proposed that participants only learned the value of their actions when forming expectations, independently of the stimuli. The third model, 'Belief', assumed that participants were uncertain about which stimulus was actually presented on each trial and, thus, the rewards were associated with a combination of two uncertainty-weighted stimulus-action associations. Finally, the 'Punishment' model examined whether trials in which no reward was delivered were treated as aversive losses.
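To make the roles of the two parameters concrete, the sketch below shows a generic Rescorla-Wagner style update for separate stimulus-action pairs. It is not the fitted model: the exact equations are not given in this text, so the update rule (the expectation moves toward the reward scaled by reward sensitivity, at a speed set by the learning rate), the softmax choice rule, and all function and variable names are assumptions made for illustration.

```python
import math


def update_q(q, stimulus, action, reward, learning_rate, reward_sensitivity):
    """Update and return the expectation for one stimulus-action pair.

    q is a dict mapping (stimulus, action) to the current expectation.
    Assumed rule: Q <- Q + learning_rate * (reward_sensitivity * reward - Q).
    """
    key = (stimulus, action)
    old = q.get(key, 0.0)
    prediction_error = reward_sensitivity * reward - old
    q[key] = old + learning_rate * prediction_error
    return q[key]


def choice_probability(q, stimulus, action, actions=("short", "long")):
    """Softmax probability of `action` given `stimulus` (assumed choice rule)."""
    weights = [q.get((stimulus, a), 0.0) for a in actions]
    top = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - top) for w in weights]
    return exps[actions.index(action)] / sum(exps)
```

Under this assumed rule, the learning rate controls how strongly a single outcome shifts the expectation, whereas reward sensitivity sets the value toward which the expectation of a consistently rewarded pair converges.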

S6. Computation of Learning Rate and Reward Sensitivity in the transformed space
In the transformed space, learning rate is not constrained to the range 0 to 1 and reward sensitivity is not constrained to the range 0 to +inf. Instead, both parameters are unconstrained from -inf to +inf, and larger values still indicate greater learning rate and reward sensitivity. Assuming $\alpha$ is the actual learning rate, constrained between 0 and 1, we have presented learning rate in the transformed space as $\log\frac{\alpha}{1-\alpha}$, which is unconstrained from -inf to +inf. Similarly, assuming $\rho$ is the actual reward sensitivity, constrained between 0 and +inf, we have presented reward sensitivity in the log-transformed space as $\log \rho$, which is unconstrained from -inf to +inf.
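The mapping between the constrained parameters and the transformed space can be sketched as below. The function names are hypothetical; the transformations themselves are the log-odds of the learning rate and the natural log of reward sensitivity, as described above, together with their inverses.

```python
import math


def to_transformed(learning_rate, reward_sensitivity):
    """Map a learning rate in (0, 1) and a reward sensitivity in (0, +inf)
    to the unconstrained space (-inf, +inf)."""
    lr_t = math.log(learning_rate / (1.0 - learning_rate))  # log-odds
    rs_t = math.log(reward_sensitivity)
    return lr_t, rs_t


def from_transformed(lr_t, rs_t):
    """Inverse mapping back to the constrained parameters."""
    learning_rate = 1.0 / (1.0 + math.exp(-lr_t))  # logistic function
    reward_sensitivity = math.exp(rs_t)
    return learning_rate, reward_sensitivity
```

Both transformations are strictly increasing, which is why larger transformed values still correspond to a greater learning rate and greater reward sensitivity. For example, a learning rate of 0.5 and a reward sensitivity of 1 both map to 0 in the transformed space.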
The role of reward sensitivity and learning rate in achieving optimal performance in the task was explored through simulation. Optimal performance was defined as a 3:1 response ratio of rich to lean choices, reflecting the reward probabilities in the task setup (rich trials rewarded at 60% and lean trials at 20%). This ratio is indicative of optimal decision-making, where choice allocation proportionally matches the reward probabilities, a principle derived from the probabilistic matching law.
For the simulations, we varied reward sensitivity (RS) and learning rate (LR) around the mean values observed in our sample data. Specifically, RS and LR were varied by ±1 and ±2 standard deviations from these means. This approach was chosen to reflect a range of behavior that spans from typical to extreme within our observed participant data, thus providing insights into how deviations from the mean affect task performance.
We generated 500 surrogate datasets for each of the 25 RS-LR combinations, with each dataset consisting of 300 trials. We analyzed these datasets using a sliding window of 50 trials to monitor the evolution of the rich:lean response ratio. The results highlighted a nuanced dynamic (Figure S3). For example, in the third row, where learning rate is held constant at the mean, excessively high RS led simulated participants to disproportionately favor the rich option, deviating from the optimal 3:1 ratio (indicated by the red horizontal line in the plots). Conversely, too low RS resulted in insufficient responsiveness to rewards, failing to leverage the richer option adequately.
Regarding LR, in the third column, where RS is held constant at the mean, high values of LR led to overfitting to recent outcomes, causing fluctuations and instability in choice patterns, with a rich:lean response ratio close to 1. On the other hand, low values of LR hindered timely adaptation to the task's reward contingencies.
These findings illustrate the importance of balancing RS and LR to achieve a response ratio that aligns with the optimal strategy defined by the reward probabilities of the task.
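The sliding-window summary used to track the rich:lean response ratio across a 300-trial dataset can be sketched as follows. This covers only the windowing step, not the generation of surrogate data. The function name, the coding of choices as the strings "rich" and "lean", the one-trial step between windows, and the handling of windows without lean choices are assumptions made for this example.

```python
def sliding_rich_lean_ratio(choices, window=50):
    """Return the rich:lean response ratio for every window of `window` trials.

    `choices` is a sequence of "rich" / "lean" labels, one per trial.
    Windows advance one trial at a time (an assumption of this sketch).
    """
    ratios = []
    for start in range(len(choices) - window + 1):
        chunk = list(choices[start:start + window])
        rich = chunk.count("rich")
        lean = chunk.count("lean")
        # A window without lean choices has no finite ratio.
        ratios.append(rich / lean if lean else float("inf"))
    return ratios
```

A simulated agent that allocates its choices in line with the 3:1 matching criterion would produce window ratios that hover around 3, the reference value described above.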

S7. Outlier management
A single outlier at the level of enduring individual differences (i.e., between-persons level) was identified as a potentially influential data point using standardized scores >|3| for univariate outliers.
Multivariate outliers were identified based on their multivariate Mahalanobis distance (for p < .01).
While no multivariate outliers were detected, a potentially influential subject was identified for Valence. Removing this subject from the analyses did not change the substantive results, and differences between the zero-order correlations calculated on the final sample and those excluding the univariate outlier did not exceed |.02|.
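The two screening rules described in this section can be illustrated with the sketch below. It is a simplified example under stated assumptions: the function names are hypothetical, the multivariate check is restricted to two variables so that the covariance matrix can be inverted by hand, and the squared Mahalanobis distance is compared with the chi-square critical value for 2 degrees of freedom (which equals -2 ln p). The variables actually entered in the multivariate check are not listed in this text.

```python
import math
import statistics


def univariate_outliers(values, threshold=3.0):
    """Indices of observations whose standardized score exceeds |threshold|."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs((v - mean) / sd) > threshold]


def bivariate_mahalanobis_outliers(xs, ys, p=0.01):
    """Indices whose squared Mahalanobis distance exceeds the chi-square
    cutoff for 2 degrees of freedom at significance level p."""
    n = len(xs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    cutoff = -2.0 * math.log(p)  # chi-square quantile for df = 2
    flagged = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        dx, dy = x - mx, y - my
        # d' S^-1 d with the 2x2 covariance matrix inverted analytically
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if d2 > cutoff:
            flagged.append(i)
    return flagged
```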

S8. Ecological Momentary Assessment data analysis
EMA observations (Level 1, or within-person level) were nested within individuals (Level 2, or between-person level). While loneliness and ΔResponse Bias (as well as its subcomponents, reward sensitivity and learning rate) represent between-person variables by design (i.e., they were measured once), EMA measures represent within-person variables. In both Model 1 and Model 2, the between-person (stable) components of valence and duration of daily social interaction were modeled at the between-person level (i.e., they were centered at their latent means), and their within-person components were specified as unstructured at Level 1 of analysis (i.e., the within-person level was fully saturated).
Participants with < 30% of valid assessments on the EMA measures were not retained for this analysis; however, findings are the same in terms of effect sizes, statistical significance, and practical significance when an alternative cutoff (< 25%) is adopted.
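The logic of the compliance filter and of separating between-person from within-person components can be sketched as follows. This is only an observed-mean approximation for illustration: the analyses described above center the EMA measures at their latent means within a multilevel model, which this sketch does not reproduce. The function names and the input layout are hypothetical.

```python
import statistics


def retain_participant(n_valid, n_scheduled, min_fraction=0.30):
    """Keep a participant only if at least `min_fraction` of the scheduled
    EMA assessments are valid."""
    return n_scheduled > 0 and (n_valid / n_scheduled) >= min_fraction


def decompose(observations):
    """Split momentary scores into between- and within-person components.

    `observations` maps a participant id to a list of momentary scores.
    The between-person component is approximated by the observed person
    mean (the source uses latent means instead); the within-person
    component is the deviation of each score from that mean.
    """
    between, within = {}, {}
    for pid, scores in observations.items():
        person_mean = statistics.mean(scores)
        between[pid] = person_mean
        within[pid] = [s - person_mean for s in scores]
    return between, within
```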

S9. Limitations of the study
The study relies on self-reported measures using a convenience sample, and data were collected in a single national context. Moreover, it is difficult to completely disentangle objective and perceived social isolation in an unselected sample in which the mismatch between actual and perceived social contacts is expected to be lower than what is usually seen in clinical populations (1).
Notwithstanding, thanks to the use of an EMA design to capture ongoing daily levels of interactions, distinctive patterns of results for actual and perceived social isolation were found. However, the EMA design did not include inquiries about participants' modes of communication, specifically, whether interactions occurred in person or online. This consideration should have been taken into account, as some research suggests that online interactions may detrimentally affect the sense of social bonding, particularly when employed to alleviate social disconnection (2,3). The existing literature is inconsistent on this topic; while several studies support the compensatory theory of technology use in terms of mental well-being and reduced loneliness during the pandemic (4,5), others demonstrate no correlation between the frequency of virtual social interaction and perceived isolation (6). A final limitation is that the present study was conducted during COVID-19 social restrictions (7,8). It should be noted, however, that studies investigating the role of social relationships during the pandemic showed that perceived social support and social network size may act as a buffer against COVID-related physical and mental health concerns, fatigue, and general psychological distress (9,10).

S3 (continued). Questions regarding the COVID-19 situation
2. If you do not live alone, with whom do you live?
3. Has the spread of the virus and the restrictions forced you to spend a lot of time alone?
4. Has the spread of the virus and the restrictions significantly impacted your lifestyle and led to substantial changes?
5. Have you ever been in isolation because you contracted COVID-19 or came into contact with someone who tested positive?
6. In the past 6 months, have you been in isolation because you contracted COVID-19 or came into contact with someone who tested positive?
7. Are you currently in isolation?

Figure S1. Illustration of the Probabilistic Reward Task (A) and graphical representation of the current results in terms of Response Bias (B), Accuracy (C), Discriminability (D), and Reaction Times (E).

Figure S2. Completely Standardized Estimates from the EMA Model 1.