The synaptic correlates of serial position effects in sequential working memory

Sequential working memory (SWM), referring to the temporary storage and manipulation of information in order, plays a fundamental role in brain cognitive functions. The serial position effect refers to the phenomenon that the recall accuracy of an item is associated with the order in which the item is presented. The neural mechanism underpinning the serial position effect remains unclear. The synaptic mechanism of working memory proposes that information is stored as hidden states in the form of facilitated synaptic connections between neurons. Here, we build a continuous attractor neural network with synaptic short-term plasticity (STP) to explore the neural mechanism of the serial position effect. Using a delay recall task, our model reproduces the experimental finding that as the maintenance period extends, the serial position effect transitions from the primacy to the recency effect. Using both numerical simulation and theoretical analysis, we show that the transition moment is determined by the parameters of STP and the interval between presented stimulus items. Our results highlight the pivotal role of STP in processing the order information in SWM.


Introduction
Sequential working memory (SWM), a function responsible for temporarily storing and manipulating information in a specific order (Stephan and MJB, 1989; Jensen and Lisman, 2005; Logan, 2021), plays a fundamental role in brain cognitive functions, such as reasoning, comprehension and learning (Alan, 2003; Curtis and Lee, 2010; Potagas et al., 2011; Calmels et al., 2012; Ru et al., 2022). SWM supports human mental processes by providing an interface between perception, long-term memory and actions (Tsetsos et al., 2012). The memory recall paradigm is widely utilized to investigate the storage of multiple items in SWM (Endel and CFI, 2000; Pantelis et al., 2008; Henson, 2013), where subjects are required to retrieve previously presented information. A large volume of psychological experiments has demonstrated that the retrieval performances of human subjects are associated with the order of presented items, displaying a serial position effect: subjects exhibit better performances for memory items appearing at the beginning or at the end of a sequence, called the primacy or recency effect, respectively (Simon, 1962; Postman and Phillips, 1965; Glanzer and Cunitz, 1966). This serial position effect is observed in various types of working memory systems, including visual (Kiani et al., 2008), auditory (Hurlstone et al., 2014; Borderie et al., 2024), and spatial working memories (Groeger et al., 2008). The serial position effect is a well-established phenomenon in memory research, yet its underlying neural mechanism, contextual variation, and functional implication remain largely unresolved.
A number of psychophysical experiments have indicated that the serial position effect of SWM is affected by factors related to attention, context and interference. For instance, the attentional gradient, i.e., a gradual decrease in attention level as different items are presented during the encoding period, affects the primacy effect. Context can also affect participants' retrieval performances, with the recency effect observed when the recall cue is the item order, and the primacy effect observed when the recall cue is the relative size of a specific feature of the item (Cowan et al., 2002; Li et al., 2021). Interference between items affects the recency effect (Gorgoraptis et al., 2011). Additionally, some variations in the experimental paradigm can affect the serial position effect. For example, increasing the inter-stimulus interval during encoding can weaken the primacy effect but not the recency effect (Glanzer and Cunitz, 1966); prolonging the maintenance period in a visual sequence working memory task can shift participants' performances from the recency to the primacy effect (Knoedler et al., 1999), and this recency-primacy shift was observed in experiments including auditory, verbal, and text materials (Knoedler et al., 1999; Storm and Bjork, 2016). Finally, distractions at different time points during the maintenance period can lead to fluctuations in participants' retrieval accuracy (Lui et al., 2023).
Up to now, the neural mechanism underlying the serial position effect in SWM remains largely unclear. The "limited resource" perspective proposed that the differential allocation of memory resources across multiple items governs their relative recall precision, thereby leading to the primacy and recency effects observed in SWM tasks (Gorgoraptis et al., 2011; Ma et al., 2014; Lee et al., 2020; Wang et al., 2021). Another study proposed an attractor network model with firing rate adaptation, which explains the power law of recall capacity (Naim et al., 2020), as well as the primacy and recency effects in human free recall (Boboeva et al., 2021). Nevertheless, these studies did not explain how the detailed dynamics of a neural circuit account for the serial position effect. Recent studies have suggested that working memory is mediated by rapid transitions in "activity-silent" neural states (Wolff et al., 2017; Barbosa et al., 2020), and that the strength of the hidden-state representation predicts the accuracy of working memory-guided behavior, including recall precision, i.e., the primacy and recency effects (Stokes, 2015; Wolff et al., 2015, 2017; Katkov et al., 2017; Naim et al., 2020). The synaptic mechanism of working memory posits that information is encoded in the facilitated synaptic connections between neurons, rather than in the persistent responses of neurons (Mongillo et al., 2008; Mi et al., 2017). These works mainly studied the neural mechanism for storing and manipulating a single memory item. However, the neural circuit dynamics underlying the serial position effect during the storage of multiple (≥2) memory items have not been investigated, which is the focus of the present study.
In this work, we adopt the view that information resides in hidden states of a neural circuit (Stokes, 2015) and is expressed by facilitated synapses between neurons (Mongillo et al., 2008). Specifically, we develop a model of a continuous attractor neural network (CANN) with short-term synaptic plasticity (STP). Utilizing a delay task paradigm, we investigate the serial position effect in SWM. In the delay task, the whole period is divided into stimulus encoding, maintenance, and retrieval/response phases. During the encoding phase, participants sequentially receive and encode multiple items into their working memory. After a maintenance period, they are prompted to recall the items, with each item's recall performance indicating the precision of the corresponding memory representation. Typically, participants are required to retain a specific attribute of each item in the sequence, such as visual orientation, direction, or spatial location. Our model shows that as the maintenance period is prolonged, the serial position effect gradually shifts from a significant primacy effect to a significant recency effect, with the recency effect diminishing in significance over time. This agrees well with the experimental finding. We further show that the transition moment of the serial position effect is predominantly determined by the STP dynamics and the inter-item interval of the presented stimuli. Our study highlights the important role that STP plays in processing the order information in SWM.

The model
To elucidate the neural mechanism underlying the temporal dynamics of the serial position effect in SWM, we adopted a continuous attractor neural network (CANN) with short-term synaptic plasticity (STP). CANNs are a canonical model for neural information storage and representation (Wu et al., 2013, 2016) (Figure 1A), which has been successfully applied to describe the encoding of continuous features in neural systems, such as orientation (Ben-Yishai et al., 1995), moving direction (Georgopoulos et al., 1986), head direction (Taube et al., 1990), and spatial location of objects (Bottomley, 1987). Additionally, CANNs have been extensively used to model the neural mechanism of working memory (Mi et al., 2017; Li et al., 2021). STP is a ubiquitous phenomenon in neural systems, referring to the property that synaptic efficacy between neurons dynamically changes over time in a way that reflects the firing history of the pre-synaptic neuron (Figure 1B). Based on the property of STP, Mongillo et al. proposed a synaptic mechanism of working memory, stating that a neural circuit need not maintain energy-intensive firing during the entire period of the task to memorize stimuli; rather, the neural circuit can utilize facilitated synaptic connections to retain information (Mongillo et al., 2008). The alteration in synaptic strength induced by STP is a relatively slow process that temporarily modifies the network's connectivity pattern, so that the network's computation depends on the history of external inputs. Combining CANNs with STP, computational models have elucidated the maintenance and manipulation of working memory (Mi et al., 2017), and the phase precession phenomenon in the hippocampus (Chu et al., 2022).

Continuous attractor neural networks
Consider a one-dimensional continuous stimulus θ, such as visual orientation, that is encoded by an ensemble of neurons. All excitatory neurons in the CANN (the red circles in Figure 1A) are aligned in a ring according to their preference under the periodic boundary condition, i.e., θ ∈ [−π/2, +π/2), and they are all reciprocally connected to a global inhibitory neuronal pool (the black circle in Figure 1A). Denote by h_E(θ, t) the synaptic input at time t to excitatory neurons with preference θ. The dynamics of h_E(θ, t) is determined by a decay term, the recurrent current from other neurons, the inhibitory input from the global inhibitory neuronal pool, the background input, and the external input (Equation 1). Here, τ denotes the time constant of neurons, ρ the neuronal density, and I_0 + σ_0 η_0(θ, t) the background input, with η_0(θ, t) Gaussian white noise of zero mean and unit variance and σ_0 the corresponding noise strength. I_ext(θ, t) refers to the external input, such as the visual stimulus during the encoding period and the cue during the recalling period. J_EI is the synaptic strength from the global inhibitory neuronal pool to excitatory neurons. r_E(θ, t) is the firing rate of neurons with preference θ, and its relationship with the synaptic current is given by r_E(θ, t) = α ln[1 + exp(h_E(θ, t)/α)], which is a smoothed threshold-linear function.
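The rate function r = α ln[1 + exp(h/α)] quoted above can be evaluated without overflow for large inputs. The following is a minimal sketch, assuming NumPy; the value α = 1 and the function name `firing_rate` are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

def firing_rate(h, alpha=1.0):
    """Smoothed threshold-linear rate: r = alpha * ln(1 + exp(h / alpha)).

    alpha = 1.0 is a placeholder; the paper's value is in its supplementary tables.
    """
    h = np.asarray(h, dtype=float)
    # logaddexp(0, z) = ln(exp(0) + exp(z)) = ln(1 + exp(z)), computed stably
    return alpha * np.logaddexp(0.0, h / alpha)

h = np.array([-50.0, 0.0, 50.0])
r = firing_rate(h)
```

For strongly negative input the rate is close to zero, and for strongly positive input it approaches h itself, which is why the function behaves like a smoothed threshold-linear unit.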
Due to this characteristic topological structure, a CANN can hold a continuous family of stationary states, metaphorically understood as a valley of local minima in the network's energy landscape.J and J 0 determine the synaptic connection strength between neurons, while B controls the synaptic interaction range.Neurons with similar preferences have stronger synaptic connections, while those with significantly different preferences have weaker connections.
The synaptic input to the global inhibitory neuronal pool is denoted as h_I, with r_I the corresponding firing rate, and r_I(t) = α ln[1 + exp(h_I(t)/α)]. The dynamics of the global inhibitory neuronal pool is governed by its time constant τ and by J_IE, the connection strength from excitatory neurons in the ring to the inhibitory neuronal pool. The global inhibitory neuronal pool plays a crucial role in maintaining a balanced state between excitation and inhibition in the network, thereby preventing excessive neuronal firing. Additionally, it fosters competition among different groups of excitatory neurons, ensuring that only one memory item is represented at a given moment.

Short-term synaptic plasticity
Two types of STP, known as short-term facilitation (STF) and short-term depression (STD), have been observed in various cortical areas. STF is caused by the influx of calcium into the synaptic terminal of the pre-synaptic neuron following spike generation, which increases the release probability of neurotransmitters. On the other hand, STD is caused by the depletion of neurotransmitters at the synaptic terminal of the pre-synaptic neuron after spike generation.
Zhou et al., Frontiers in Computational Neuroscience (frontiersin.org)
In the model proposed by Mongillo et al. (2008), the STF effect is modeled by u(θ, t) (u ∈ [0, 1]), which indicates the release probability of neurotransmitters from pre-synaptic neurons at θ, and STD is modeled by x(θ, t) (x ∈ [0, 1]), indicating the fraction of available neurotransmitters in pre-synaptic neurons. The dynamics of STF and STD are given in Equation 4, where τ_f and τ_d denote the time constants of STF and STD, respectively, and U_0 the increment of u caused by spiking of the pre-synaptic neuron. When a neuron at θ receives external inputs, its firing rate r_E(θ, t) increases. The increase in firing rate results in an increase in the release probability of neurotransmitter u(θ, t) (with the increment determined by U_0), leading to the STF effect, while the proportion of available neurotransmitter x(θ, t) decreases, leading to the STD effect. Subsequently, u(θ, t) decays to its baseline of 0 with a time constant τ_f, and x(θ, t) returns to its baseline of 1 with a time constant τ_d, as illustrated in Figure 1B. The product of u(θ, t) and x(θ, t) represents the instantaneous synaptic efficacy at time t, i.e., J u(θ, t) x(θ, t), which reflects the strength of memory representation in the network (Mi et al., 2017; Li et al., 2021).
To elucidate the neural mechanism of SWM, we selected parameters consistent with the synaptic connectivity between neurons in the prefrontal cortex (PFC), which is the primary cortical region involved in working memory (Wang et al., 2006). We adopted the STF-dominant parameters as in the model proposed by Mongillo et al. (2008), i.e., τ_d ≪ τ_f and a small U_0. This implies that after neuronal firing, the synaptic efficacy is maintained at a high value for an extended period to sustain memory information.
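The STF-dominant regime described above can be illustrated with a small simulation of a single synapse driven by a brief burst of firing. This is a minimal sketch under stated assumptions: the rate-based update rules du/dt = −u/τ_f + U_0(1 − u)r and dx/dt = (1 − x)/τ_d − u x r are a commonly used form chosen here to match the stated baselines (u → 0, x → 1) and are not copied from the paper's Equation 4; the values τ_f = 4 s, τ_d = 0.2 s, U_0 = 0.05, and the 50 Hz, 0.2 s burst are hypothetical.

```python
import numpy as np

def simulate_stp(rate, dt, tau_f, tau_d, U0):
    """Forward-Euler integration of an assumed rate-based STP model."""
    u, x = 0.0, 1.0                      # baselines stated in the text
    us = np.empty(len(rate))
    xs = np.empty(len(rate))
    for k, r in enumerate(rate):
        du = -u / tau_f + U0 * (1.0 - u) * r   # facilitation driven by firing
        dx = (1.0 - x) / tau_d - u * x * r     # depletion driven by release
        u += dt * du
        x += dt * dx
        us[k], xs[k] = u, x
    return us, xs

dt = 1e-3                                 # 1 ms step
t = np.arange(0.0, 3.0, dt)
rate = np.where((t >= 0.1) & (t < 0.3), 50.0, 0.0)  # hypothetical stimulus burst (Hz)
u, x = simulate_stp(rate, dt, tau_f=4.0, tau_d=0.2, U0=0.05)
efficacy = u * x                          # proportional to J*u*x
offset = np.searchsorted(t, 0.3)          # first sample after the burst ends
peak = offset + np.argmax(efficacy[offset:])
```

With these illustrative values the efficacy keeps rising for a few hundred milliseconds after the burst, while x recovers on the fast time scale τ_d, and then decays slowly on the time scale τ_f. This is the qualitative behavior the STF-dominant choice is meant to produce.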
In our numerical simulation, we model the CANN by considering N neurons uniformly distributed in the range of [−π/2, π/2) in the feature space. The integration in Equation (1) is computed as a discrete sum over these N neurons. The parameters are given in Supplementary Tables 1, 2.

The serial position effect in SWM
Based on the above model of CANN with STP, we investigated the serial position effect in SWM. We first studied the case of two memory items and later generalized the study to the case of multiple items. We utilized the same paradigm as in the psychophysical experiments for SWM (Li et al., 2021), and investigated the recall accuracy of items based on their visual orientations, as illustrated in Figure 2A.
In each trial, two stimuli with different orientations (referred to as θ_1 and θ_2) are sequentially presented in the encoding period. Following a delay period, a visual recall cue lasting for T_recall is presented, and the network retrieves either the first or the second visual item based on the recalling cue. Let T_encode denote the duration of presenting each item, T_gap the time interval between two items, and T_maintain the duration of the delay period. The orientation values of the two stimuli are set as follows: θ_1 is randomly selected from the range [−π/2, π/2), and θ_2 = θ_1 + Δθ, where Δθ is the difference between the two stimuli, randomly selected from a data set of orientation differences (Li et al., 2021). The visual stimulus in the encoding period and the cue in the recalling period are both denoted as I_ext(θ, t) and are parameterized by θ_type, a_type(t), and B_type (type = encode, recall). Here, θ_type represents the visual stimulus during the corresponding period, and a_type(t) and B_type regulate the strength and accuracy of external signals, respectively. Larger a_type(t) and B_type result in more precise encoding of orientation information from the stimulus. As the recalling signals are unrelated to the task, the parameters are set as follows: a_encode ≫ a_recall, B_encode > B_recall (Li et al., 2021).
When two visual stimuli are presented sequentially, the neural network generates successive bump-shaped neural activity patterns. The peaks of these bumps correspond to θ_1 and θ_2, respectively, as illustrated in Figure 2B. Because neurons with similar preferences interact strongly while those with significantly different preferences interact weakly, we define a neuronal group G_i (i = 1, 2) as the ensemble of neurons whose preferred values lie within a fixed angular distance of θ_i; this group of neurons primarily encodes the ith item. The corresponding neural activity r_i(t) and synaptic efficacy Jux_i(t) are calculated as averages over the group, i.e., the sums of r_E(θ, t) and of J u(θ, t) x(θ, t) over θ ∈ G_i divided by m_i, with m_i representing the number of neurons in G_i. After the stimuli are removed, the firing rates of both neuronal groups gradually decrease to zero (Stokes, 2015; Wolff et al., 2015). However, their synaptic strengths remain at high levels due to STF, which maintains the stimulus information (see Figure 2C). During the recalling period, the network generates a weak bump-shaped activity pattern in response to the recalling cue, and the retrieved orientation (denoted as θ_i^recalled, i = 1, 2) is decoded using the population vector method, with details given in Supplementary material 1.1.
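The population vector readout mentioned above can be sketched as follows. This is an illustration under assumptions rather than the procedure of Supplementary material 1.1: because orientation has period π, preferred angles are doubled before the vector sum and the result is halved. The bump profile, N = 180, and the function name `decode_orientation` are hypothetical.

```python
import numpy as np

def decode_orientation(prefs, rates):
    """Population-vector estimate for a pi-periodic feature (orientation)."""
    # Doubling the angles maps [-pi/2, pi/2) onto the full circle
    z = np.sum(rates * np.exp(2j * prefs))
    return 0.5 * np.angle(z)             # estimate in (-pi/2, pi/2]

N = 180                                   # hypothetical number of neurons
prefs = -np.pi / 2 + np.pi * np.arange(N) / N
theta_true = 0.4                          # hypothetical stimulus orientation (rad)
# Hypothetical bump-shaped activity centred on theta_true
rates = np.exp(3.0 * (np.cos(2.0 * (prefs - theta_true)) - 1.0))
theta_hat = decode_orientation(prefs, rates)
```

For a noiseless, symmetric bump the estimate coincides with the bump centre; in a noisy network the difference between `theta_hat` and the true orientation gives the recall error of a trial.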
We investigated how the maintenance duration T_maintain affects the recall performance. We set T_maintain as a variable ranging from 0 to 10 s and selected 11 values within this range. For each chosen value of T_maintain, we ran the network 50 times (corresponding to 50 different participants in a psychophysical experiment), each run consisting of 300 trials. We utilized the normalized target probability method (Bays et al., 2009; Schneegans and Bays, 2016) to calculate the recall performance of each item (i.e., θ_i^recalled, i = 1, 2). For more details, see Supplementary material 1.2. We found that (as shown in Figure 2D):
• When T_maintain is smaller than a critical value denoted as T_c, i.e., T_maintain < T_c, the recall performance exhibits the primacy effect, indicating that participants memorize the first item more accurately. Moreover, the primacy effect becomes more pronounced as the value of T_maintain decreases.
• When T_maintain > T_c, the recall performance shifts to the recency effect, indicating that participants memorize the second item more accurately. The significance of the recency effect gradually decreases as T_maintain increases.
We further utilized the methods of Circular Variance (CV) and Circular Kurtosis (CK; Berens, 2009) to calculate the accuracy of recall performance. The statistical results are consistent with those shown in Figure 2D (for more details, see Supplementary material 1.2 and Supplementary Figure 1). In conclusion, as the maintenance period increases, the serial position effect in SWM dynamically shifts from the primacy effect to the recency effect.
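As an illustration of how such circular statistics separate precise from imprecise recall, the sketch below computes a circular variance and a circular kurtosis from simulated orientation errors. The definitions used (one minus the mean resultant length, and the mean of cos 2(α − μ)) are common ones and are assumed here, not taken from the supplementary material; doubling the errors to account for the π-periodicity of orientation is likewise an assumption, and the two error samples are synthetic.

```python
import numpy as np

def circular_variance(errors):
    """1 - mean resultant length of the doubled errors (0 means no spread)."""
    z = np.mean(np.exp(2j * np.asarray(errors, dtype=float)))
    return 1.0 - np.abs(z)

def circular_kurtosis(errors):
    """Mean of cos(2 * (a - mu)), with a the doubled errors and mu their mean direction."""
    a = 2.0 * np.asarray(errors, dtype=float)
    mu = np.angle(np.mean(np.exp(1j * a)))
    return np.mean(np.cos(2.0 * (a - mu)))

rng = np.random.default_rng(0)
precise = rng.normal(0.0, 0.1, size=5000)   # synthetic errors, well-remembered item (rad)
noisy = rng.normal(0.0, 0.4, size=5000)     # synthetic errors, poorly remembered item (rad)
cv_precise, cv_noisy = circular_variance(precise), circular_variance(noisy)
ck_precise, ck_noisy = circular_kurtosis(precise), circular_kurtosis(noisy)
```

A better-remembered item yields a smaller circular variance and a larger (more peaked) kurtosis, so comparing these values between the first and second item gives an alternative readout of primacy versus recency.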
To further reveal the neural mechanism underlying the dynamical change of the serial position effect, we calculated the relative synaptic efficacy of the two neuronal groups encoding the two stimuli (θ_1 and θ_2) over time, denoted as ΔJux(t) = Jux_2(t) − Jux_1(t) hereafter. The synaptic mechanism of WM posits that information is maintained in the facilitated synaptic interactions between neurons, with the synaptic efficacy determining the accuracy of the memorized item (Teyler and Discenna, 1984; Henry and Misha, 1996; Mongillo et al., 2008). For example, when two items (θ_1 and θ_2) are presented sequentially in a trial (Figure 2C), the transient synchronous firing of a neuronal group (G_1 or G_2, respectively) leads to a rapid decrease in synaptic efficacy, due to the depletion of available neurotransmitters in neurons. After the visual stimulus disappears, the synaptic efficacy of the neuronal group recovers to its maximum value (Jux_i^max) and stays at a high level for an extended period. We calculated the relative synaptic efficacy ΔJux(t) between the two neuronal groups during the maintaining period (Figure 2E) and found that:
• Since the second item is presented later, its synaptic efficacy Jux_2(t) is smaller than that of the first one, Jux_1(t), before it recovers to the maximum value. Therefore, ΔJux(t) < 0 when t < T′_c(SIM), where T′_c(SIM) is the moment at which ΔJux(t) = 0.
• As the time t approaches T′_c(SIM), both ΔJux(t) and its variance approach zero. When t > T′_c(SIM), ΔJux(t) first increases and then gradually diminishes over time.
The critical moment T′_c(SIM) at which ΔJux(t) = 0 coincides with the value of T_c at which the recall performance changes from the primacy effect to the recency effect, as shown in Figures 2D, E. Specifically,
• When the maintenance period T_maintain is smaller than the critical value, i.e., T_maintain < T′_c(SIM) ≈ T_c, the recall performance exhibits the primacy effect; the greater the magnitude of ΔJux(T_maintain) is, the more significant the primacy effect becomes.
• As T_maintain approaches the critical value and ΔJux(T_maintain) approaches 0, the recall performance switches to the recency effect, but the effect is not significant (Figure 2D).
• When T_maintain is much larger than the critical value and ΔJux(T_maintain) ≫ 0, the recall performance displays the recency effect. The greater the value of ΔJux(T_maintain) is, the more pronounced the recency effect becomes.
It is worth noting that the temporal shift of the serial position effect is independent of the orientation difference between the two visual stimuli, which is consistent with the results in Figures 2D, E. For more details, see Supplementary material and Supplementary Figure 3.
We further investigated the detailed dependence of the transition from the primacy to the recency effect on the model and experimental parameters, including the time interval between two stimuli (T_gap) and the parameters of STP (i.e., τ_f and τ_d; Figure 3). For each set of parameters {τ_f, τ_d, T_gap}, we calculated the memory accuracy of participants when the recall cue was given at different times (i.e., T_maintain). For each given T_maintain, we simulated the network 50 times, each run consisting of 300 trials. We then calculated the transition moment from the primacy to the recency effect (i.e., T_c) and the critical moment T′_c(SIM) at which ΔJux(t) = 0. T_c is calculated using different statistical methods, such as CK, CV and P.
We found that both T_c and T′_c(SIM) increase with τ_f and with τ_d, as shown in Figures 3A, B, and that they both decrease with T_gap (Figure 3C). Furthermore, T_c is approximately equal to T′_c(SIM) for each given parameter set, suggesting that the relative synaptic efficacy between the two neuronal groups (G_1 and G_2) determines the recall performance. In conclusion, the shift of the serial position effect is determined by STP (i.e., τ_f, τ_d) and the inter-stimulus interval (T_gap). Notably, as depicted in Figure 3, the transition from the primacy to the recency effect occurs on the time scale of the STD time constant τ_d.

Theoretical analysis
Although we have utilized a simplified mean-field model to elucidate the neural mechanism of SWM, there are still many variables and parameters involved, including the time constants of STF and STD (τ_f, τ_d), the time constant of a single neuron (τ), the connection strengths between neurons (i.e., J, J_0, J_EI, J_IE, etc.), the duration of loading each stimulus (T_encode), and the time interval between adjacent stimuli (T_gap). Exploring how the recall performance depends on all these variables by numerical simulation alone would be very time-consuming. We therefore conducted theoretical analysis to elucidate how the critical moment T_c depends on the variables that lead to the shift of the serial position effect. The advantage of theoretical analysis lies in its power of prediction, which can be validated with experiments. Specifically, we focused on examining the dependence of T_c on the variables τ_f, τ_d, and T_gap.

Figure 2 (caption, partially recovered). (B) In response to each stimulus, the network generates successive bump-shaped neural activity patterns centered at θ_i, respectively. During the delay period, the neural activity gradually decays to a silent state. (C) The temporal course of synaptic efficacies of neural groups encoding two visual stimuli throughout the trial. When a stimulus is presented, the synaptic efficacy of the corresponding neural group rapidly decreases. After the stimulus is removed, the synaptic efficacy gradually recovers to a maximum value, denoted as Jux_i^max, within a certain period of time. Subsequently, it remains at a high level for an extended duration. (D) The recalling performance at varying T_maintain. T_c denotes the critical moment at which the recall performance shifts from the primacy to the recency effect, and T_maintain takes the 11 selected values between 0 and 10 s. (Top) The normalized target probability of the ith presented item, denoted as P_i for i = 1, 2, at varying T_maintain. (Bottom) The normalized target probability difference between the 1st and 2nd item (P_1 − P_2) at varying T_maintain. (E) The relative synaptic efficacy (ΔJux(t) ± SEM) of the neuronal groups encoding the two visual stimuli during the maintaining period. T′_c(SIM) denotes the critical moment at which ΔJux(t) = 0. For the parameter settings, see Supplementary Tables 1, 2. Significance levels are marked as n.s., *, **, and ***.
To carry out the theoretical analysis, we simplified the model of CANN with STP (i.e., Equations 1, 3, 5) into a model composed of multiple neuronal groups. In this simplified model, the ith neuronal group (G_i, i = 1, ..., M, with M the number of items in the SWM task) encodes the ith item, and the interaction and overlap between different neuronal groups are ignored. This is because the orientation difference between two stimuli has little impact on the recall performance, as shown in Supplementary Figure 3. Note that the value of the visual orientation difference Δθ between memory items reflects the extent of interaction between the neural groups encoding them. Using the experimental paradigm with two items as an example, we showed that the recall performances (Supplementary Figure 3) for Δθ being a random number in the range of [−π, +π] are consistent with those obtained when Δθ takes a large value (Figures 2, 3), indicating that ignoring the interactions between neural groups is a proper approximation for theoretical analysis. Admittedly, representing too many items will introduce non-negligible overlaps between neural groups. However, considering the limited capacity of working memory (∼4 items, Zhang and Luck, 2008), ignoring the interactions between neuronal groups is feasible in the theoretical analysis. The STP effect is considered in each neuronal group. Moreover, Figures 2, 3 have shown that the firing rates of different neuronal groups during the maintaining period approach zero (i.e., r_i(t) → 0 Hz), and this allows us to disregard the change in firing rates over time. On the other hand, the relative memory accuracy of participants for multiple items is mainly determined by the relative synaptic efficacy of neuronal groups during the maintaining period. Therefore, we only need to consider the dynamics of STP of different neuronal groups (G_i) during the maintaining period (Equation 6), where u_i(t) and x_i(t) denote the STF and STD effects of the ith neuronal group at time t, respectively, and t is aligned to the presentation of the second item.

Figure 3 (caption, partially recovered). For each given parameter set {τ_d, τ_f, T_gap}, we computed the memory accuracy for different maintenance times T_maintain when the recall cue was presented. For every selected T_maintain, the network was simulated 50 times, with each simulation comprising 300 trials. For more details, see Supplementary material.

We first examined the case of two items (i.e., M = 2). In accordance with Equation (6), the synaptic efficacy Jux_i(t) of neuronal group G_i is calculated from u_0 and x_0, the values of STF and STD of the ith neuronal group when the stimulus is removed. Thus, the relative synaptic efficacy among neuronal groups during the maintaining period is expressed as Equation (7), where t* = T_encode + T_gap. According to Equation (7), the critical moment T′_c(THEO) at which ΔJux(t) = 0 is derived as Equation (8). For details, see Supplementary material 3. Since it takes some time for the cue in the recalling period to trigger the activity of the corresponding neuronal group, a constant bias (denoted as t_b, t_b ∈ (0, T_recall]) is added to Equation (8). Meanwhile, since τ_f ≫ τ_d, and t* = T_encode + T_gap has the same time scale as τ_f, we further simplify Equation (8) to Equation (9). Based on the theoretical predictions in Equation (9), we found that the theoretical results for T′_c(THEO) (represented by the blue colored line in Figure 3) are consistent with both the critical moment T′_c(SIM) (depicted by the purple triangle in Figure 3) and T_c (shown by the blue line in Figure 3). It is important to highlight that our theoretical analysis qualitatively captures the dynamic patterns of storage for multiple items during the maintenance period. Therefore, the transition moment of the serial position effect is positively correlated with the time constants of STD and STF, which implies that a larger time constant of STD enables the working memory system to maintain the primacy effect for a longer duration. The transition moment is inversely correlated with the sum of the presentation duration of an item and the inter-stimulus time interval (referred to as t*), which indicates that the larger the value of t*, the more difficult it is for the working memory system to keep the primacy effect.
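To make the stated dependence on τ_f, τ_d, and t* concrete, the following is a sketch of how a closed form for the zero crossing can be obtained from the definitions in the text. It rests on several assumptions: firing rates vanish during maintenance, u relaxes to 0 and x to 1, both groups leave the encoding phase with the same values u_0 and x_0, t is measured from the removal of the second item, and the response bias t_b is omitted. The helper function f(t) is introduced here only for illustration, and the result is not guaranteed to match the published Equations 6 to 9 term by term.

```latex
\begin{align}
\tau_f \frac{du_i}{dt} &= -u_i, \qquad \tau_d \frac{dx_i}{dt} = 1 - x_i, \\
f(t) &= J u_0\, e^{-t/\tau_f}\left[1 - (1 - x_0)\, e^{-t/\tau_d}\right], \\
\Delta Jux(t) &= Jux_2(t) - Jux_1(t) = f(t) - f(t + t^*), \\
\Delta Jux(T) = 0 \;&\Rightarrow\;
T = \tau_d \ln \frac{(1 - x_0)\left[1 - e^{-t^*\left(1/\tau_f + 1/\tau_d\right)}\right]}{1 - e^{-t^*/\tau_f}}, \\
t^* \gg \tau_d \;&\Rightarrow\;
T \approx \tau_d \ln \frac{1 - x_0}{1 - e^{-t^*/\tau_f}}.
\end{align}
```

Under these assumptions T grows with τ_d and τ_f and shrinks with t*, in line with the trends described above. A positive T exists only when the argument of the logarithm exceeds 1, which in the approximate form requires x_0 < e^{−t*/τ_f}.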
To further validate the theoretical results, we conducted three different tasks to investigate the temporal dynamics of relative accuracy for multiple memory items (i.e., T′_c(THEO) and T_c). We explored the dependence of the serial position effect on various parameters of the network and the design of the experiment, as illustrated in Figure 4, where T′_c(THEO) is computed based on Equation (9). In the first task, we calculated T_c using the same experimental parameters as depicted in Figure 2, but altered the values of τ_f and τ_d of the network, as shown in Figure 4A. We selected 21 different values of τ_f uniformly from the range of [2, 8] s and 21 different values of τ_d from the range of [0.1, 0.4] s. For each parameter set {τ_f, τ_d}, we simulated the network and assessed recall performances of participants at different moments (i.e., T_maintain), and obtained the critical moment T_c at which the recall performance shifted from the primacy to the recency effect (left panel). The selection of T_maintain followed the same procedure as depicted in Figure 3.
In the second task, we fixed τ_d and varied the variable T_gap in the psychophysical experiment, as well as τ_f, as shown in Figure 4B. We selected 21 different values of τ_f uniformly from the range of [2, 8] s and 21 different values of T_gap from the range of [0, 2] s. The calculation method for T_c is the same as that in Figure 4A.
In the third task, we fixed τ_f and varied the variable T_gap in the psychophysical experiment, along with τ_d, as shown in Figure 4C. We uniformly selected 21 different values of τ_d from the range of [0.1, 0.4] s and 21 different values of T_gap from the range of [0, 2] s. The calculation method for T_c is the same as in Figure 4A.
In summary, we found that: (1) the theoretical analysis of the critical moment (T′_c(THEO)) is qualitatively consistent with the results obtained by numerical simulations (T_c); (2) T_c increases with τ_f and τ_d, as shown in Figure 4A; and (3) T_c decreases with T_gap, as shown in Figures 4B, C.
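The three trends summarized above can be checked numerically on a toy version of the simplified model. The sketch below assumes that each group's efficacy after stimulus removal follows J u_0 e^{−t/τ_f}[1 − (1 − x_0) e^{−t/τ_d}], that both groups share the same u_0 and x_0, and that the response bias t_b is neglected; the function names, T_encode = 0.2 s, and x_0 = 0.2 are hypothetical, while the parameter ranges follow those quoted in the text.

```python
import numpy as np

def efficacy(t, tau_f, tau_d, x0, u0=0.5, J=1.0):
    """Assumed efficacy of a group, t seconds after its stimulus is removed."""
    return J * u0 * np.exp(-t / tau_f) * (1.0 - (1.0 - x0) * np.exp(-t / tau_d))

def critical_moment(tau_f, tau_d, t_star, x0):
    """Closed-form zero of efficacy(t) - efficacy(t + t_star), without the bias t_b."""
    num = (1.0 - x0) * (1.0 - np.exp(-t_star * (1.0 / tau_f + 1.0 / tau_d)))
    den = 1.0 - np.exp(-t_star / tau_f)
    return tau_d * np.log(num / den)

def critical_moment_bisect(tau_f, tau_d, t_star, x0, lo=0.0, hi=10.0):
    """The same zero located numerically by bisection on [lo, hi]."""
    def gap(t):
        return efficacy(t, tau_f, tau_d, x0) - efficacy(t + t_star, tau_f, tau_d, x0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:     # second item still weaker: crossing lies later
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

T_ENCODE, X0 = 0.2, 0.2                    # hypothetical values
tau_f_grid = np.linspace(2.0, 8.0, 21)     # ranges quoted in the text
tau_d_grid = np.linspace(0.1, 0.4, 21)
t_gap_grid = np.linspace(0.0, 2.0, 21)

tc_vs_tau_f = critical_moment(tau_f_grid, 0.2, T_ENCODE + 0.5, X0)
tc_vs_tau_d = critical_moment(4.0, tau_d_grid, T_ENCODE + 0.5, X0)
tc_vs_t_gap = critical_moment(4.0, 0.2, T_ENCODE + t_gap_grid, X0)
```

In this toy setting the closed form and the bisection agree, and the crossing time rises along the τ_f and τ_d grids and falls along the T_gap grid. The absolute values depend on the assumed u_0, x_0, and T_encode, so only the direction of each trend should be read from it.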
Model prediction: the serial position effect in SWM for multiple items
In this section, we demonstrate that the results obtained in SWM with two items can be extended to cases involving multiple items. We continued using the experimental paradigm depicted in Figure 2A for simulations, and studied the case of loading three items successively into the working memory system during the encoding period.
In the WM task, we considered that three items with different orientations are presented sequentially in each trial. Following a maintenance period lasting T maintain , a recall signal with duration T recall is presented, which triggers the network to retrieve the task-related feature of the corresponding item. Let T encode denote the loading duration of each item, T gap the time interval between two adjacent items, and θ i (i = 1, 2, 3) the orientation of each item. The value of θ i (i = 1, 2, 3) is determined as follows. θ 1 is randomly selected from the range (−π/2, π/2], and θ i = θ 1 + Δθ (i = 2, 3), where Δθ denotes the orientation difference between the ith (i = 2, 3) and the first items. The value of Δθ is chosen according to the relevant psychophysical experiments (Huang et al., 2023), with Δθ being randomly selected in each trial. For each T maintain , we conducted 50 runs, each consisting of 500 trials, and used three measures (normalized target probability, circular variance, and circular kurtosis) to quantify the recall performance. More details are given in Supplementary material 1.2 and Supplementary Figure 2.
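Two of the performance measures named above can be illustrated with a short sketch. The exact definitions used in the study are given in its Supplementary material and are not reproduced here, so the following relies on assumptions: recall errors are treated as orientations with period π (hence mapped onto the full circle before averaging), circular variance is taken as one minus the mean resultant length, and circular kurtosis follows the second central cosine moment (Pewsey's definition). The function names and the synthetic error samples are hypothetical.

```python
import numpy as np


def _to_full_circle(errors, period=np.pi):
    """Map period-`period` angular errors onto a full 2*pi circle."""
    return 2.0 * np.pi * np.asarray(errors, dtype=float) / period


def circular_variance(errors, period=np.pi):
    """Circular variance: 1 - mean resultant length (0 = perfect recall)."""
    phi = _to_full_circle(errors, period)
    return 1.0 - np.abs(np.mean(np.exp(1j * phi)))


def circular_kurtosis(errors, period=np.pi):
    """Circular kurtosis as the mean of cos(2 * (phi - mean direction))."""
    phi = _to_full_circle(errors, period)
    mean_dir = np.angle(np.mean(np.exp(1j * phi)))
    return np.mean(np.cos(2.0 * (phi - mean_dir)))


# Synthetic recall errors (radians): one precise and one random "observer".
rng = np.random.default_rng(0)
precise_errors = rng.normal(0.0, 0.05, size=5000)
random_errors = rng.uniform(-np.pi / 2, np.pi / 2, size=5000)

cv_precise = circular_variance(precise_errors)
cv_random = circular_variance(random_errors)
ck_precise = circular_kurtosis(precise_errors)
```

Under these definitions, tightly clustered errors give a circular variance near 0 and a kurtosis near 1, whereas uniformly distributed errors give a circular variance near 1.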
In each experimental trial, three items are sequentially loaded into the network, and the network generates three bump-shaped population firing patterns to represent the corresponding items (Figure 5A). While an item is presented, the synaptic efficacy of the corresponding neuronal group (Jux i (t), i = 1, 2, 3) decreases rapidly due to neurotransmitter depletion. After all items are removed, Jux i (t) (i = 1, 2, 3) recovers to its maximum value (Figure 5B) and then remains at a relatively high level to preserve the item information. We calculated the memory performance of participants at different T maintain , as shown in Figure 5C.
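The time course described above (depletion during the stimulus, recovery after its removal, and a slow decay afterwards) can be reproduced qualitatively with a minimal sketch of short-term plasticity. The sketch assumes rate-based facilitation and depression equations of the kind used in the synaptic theory of working memory (Mongillo et al., 2008) and drives a single synaptic population with a prescribed firing rate instead of simulating the full CANN. The baseline release probability `U`, the firing rate, the stimulus timing, and the function name `simulate_stp` are all hypothetical choices for illustration.

```python
import numpy as np


def simulate_stp(tau_f=4.0, tau_d=0.2, U=0.2, rate_hz=30.0,
                 t_on=0.5, t_off=1.0, t_end=5.0, dt=1e-3):
    """Euler integration of facilitation u(t) and depression x(t).

    Assumed equations (illustrative, not taken from the paper):
        du/dt = (U - u) / tau_f + U * (1 - u) * r(t)
        dx/dt = (1 - x) / tau_d - u * x * r(t)
    where r(t) equals `rate_hz` during the stimulus and 0 otherwise.
    """
    n = int(round(t_end / dt))
    t = np.arange(n) * dt
    u = np.empty(n)
    x = np.empty(n)
    u[0], x[0] = U, 1.0                      # resting state
    for k in range(n - 1):
        r = rate_hz if t_on <= t[k] < t_off else 0.0
        du = (U - u[k]) / tau_f + U * (1.0 - u[k]) * r
        dx = (1.0 - x[k]) / tau_d - u[k] * x[k] * r
        u[k + 1] = u[k] + dt * du
        x[k + 1] = x[k] + dt * dx
    return t, u, x


t, u, x = simulate_stp()
ux = u * x                                   # efficacy up to the constant J
k_off = int(round(1.0 / 1e-3))               # index of stimulus offset
efficacy_at_offset = ux[k_off]
peak_after_offset = ux[k_off:].max()
```

With these illustrative values, the product u·x at stimulus offset lies below its resting value because x is depleted; it then rises as x recovers on the time scale of τ d and decays slowly as u relaxes on the time scale of τ f .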
We extended the theoretical analysis in Section 4 to the SWM task involving three items. By comparing the relative efficacy of the neuronal groups encoding different items, we can deduce the network's recall performance. Neglecting the connections and overlaps between different neuronal groups, and assuming that all neuronal groups receive external signals of the same intensity and duration, the synaptic efficacies of the neuronal groups encoding the ith and jth loaded items (i < j) are related during the maintenance period by Jux i (t) = Jux j (t + (j − i)t * ), with t * = T encode + T gap . Thus, the relative synaptic efficacy between the ith and jth neuronal groups, ΔJux ij (t), is given by

$$\Delta Jux_{ij}(t) = Jux_i(t) - Jux_j(t) = Jux_j\big(t + (j - i)\,t^{*}\big) - Jux_j(t).$$

The critical moments T 12 ′ c and T 23 ′ c (Equation 10) can be resolved theoretically from ΔJux 12 (t) ≡ 0 and ΔJux 23 (t) ≡ 0. Their expressions involve t 12 b and t 23 b (t 12 b , t 23 b ∈ (0, T recall ]), which denote the response times of the neuronal groups to the recall signals and are approximately on the order of τ. According to the above theoretical analysis, the network exhibits the primacy effect if T maintain < T 12 ′ c and the recency effect if T maintain > T 23 ′ c . This theoretical analysis agrees with the numerical simulation results, as shown in Equation 11 and Figure 5C.
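The time-shift relation Jux i (t) = Jux j (t + (j − i)t * ) can be turned into a small numerical sketch of the two critical moments. The sketch is a toy calculation under stated assumptions, not the paper's closed-form result: it assumes no firing during maintenance, so that u relaxes exponentially to a baseline `U` with time constant τ f and x recovers exponentially to 1 with time constant τ d ; it assumes that every neuronal group leaves its stimulus in the same state (`u0`, `x0`); and it ignores the response times to the recall signal. The values of `U`, `u0`, `x0`, and t * , and all function names, are hypothetical.

```python
import numpy as np


def efficacy(t, tau_f, tau_d, U=0.2, u0=0.8, x0=0.1):
    """u(t) * x(t) at time t after a group's stimulus offset (toy model)."""
    u = U + (u0 - U) * np.exp(-t / tau_f)      # facilitation decays to U
    x = 1.0 - (1.0 - x0) * np.exp(-t / tau_d)  # resources recover to 1
    return u * x


def relative_efficacy(t, i, j, t_star, tau_f, tau_d, n_items=3):
    """Efficacy of item i minus item j at maintenance time t (i < j).

    t is measured from the offset of the last item, so item k has already
    recovered for an extra (n_items - k) * t_star.
    """
    f_i = efficacy(t + (n_items - i) * t_star, tau_f, tau_d)
    f_j = efficacy(t + (n_items - j) * t_star, tau_f, tau_d)
    return f_i - f_j


def critical_moment(i, j, t_star, tau_f, tau_d, t_max=50.0, tol=1e-8):
    """Bisection for the sign change of the relative efficacy, or None."""
    lo, hi = 0.0, t_max
    if (relative_efficacy(lo, i, j, t_star, tau_f, tau_d) <= 0
            or relative_efficacy(hi, i, j, t_star, tau_f, tau_d) >= 0):
        return None                            # no primacy-to-recency switch
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if relative_efficacy(mid, i, j, t_star, tau_f, tau_d) > 0:
            lo = mid                           # earlier item still stronger
        else:
            hi = mid
    return 0.5 * (lo + hi)


T12 = critical_moment(1, 2, t_star=0.3, tau_f=8.0, tau_d=0.4)
T23 = critical_moment(2, 3, t_star=0.3, tau_f=8.0, tau_d=0.4)
```

In this toy setting the first item has the largest efficacy before `T12` and the third item has the largest efficacy after `T23`, which mirrors the primacy and recency regimes described in the text. Because all groups share one recovery curve here, the two moments differ by exactly t * , a property of the simplification and not a claim about the full model.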
Firstly, when T maintain < T 12 ′ c , the network exhibits the primacy effect (see Figure 5C middle panel). This is because, after the removal of the items, Jux 1 is the first to recover to its maximum value, while the synaptic efficacies of the other neuronal groups remain low due to neurotransmitter depletion (Figure 5B). The larger the relative synaptic efficacy ΔJux 12 (t), the more significant the primacy effect. Meanwhile, both the value of ΔJux 12 (t) and the significance of the primacy effect decrease as T maintain increases. Thus, the recall performance indeed shifts around the critical moment T 12 ′ c (Figure 5C). Secondly, when T maintain > T 23 ′ c , the network exhibits the recency effect (see Figure 5C bottom panel). This is because the synaptic efficacy of the third item (Jux 3 (t)) recovers to its maximum value, which is larger than that of the other neuronal groups. Furthermore, the synaptic efficacy of all neuronal groups gradually decreases over time, and the recency effect becomes insignificant (Figure 5C, bottom panel).
Thirdly, the critical moments (T 12 ′ c and T 23 ′ c ) are primarily influenced by the time constants of STF and STD (τ f and τ d ), as well as the time interval between adjacent items (T gap ).Specifically, the values of T 12 ′ c and T 23 ′ c increase with τ f and τ d , and decrease with T gap .

Discussion
In this study, we built a CANN incorporating STP to investigate the neural mechanism underlying the shift of the serial position effect in SWM. We found that as the delay period lengthens, the participants' recall performance shifts from a pronounced primacy effect to a significant recency effect. Additionally, the prominence of the recency effect gradually wanes as the delay period is extended further. Furthermore, we showed that the transition moment of the serial position effect is predominantly determined by STP, the time interval between adjacent stimuli, and the duration of stimulus presentation. We carried out a theoretical analysis to confirm the simulation results and made predictions to be validated by future experiments. Overall, our study indicates that STP offers insight into how ordinal information is processed in working memory.
Utilizing a CANN to depict the encoding, maintenance, and retrieval processes in SWM is biologically plausible. CANNs have been extensively applied to elucidate the neural mechanisms of information processing in working memory. In these tasks, the stimuli are typically represented by continuous variables, including visual orientation (Ben-Yishai et al., 1995), spatial location (Bottomley, 1987), and motion direction (Georgopoulos et al., 1986) in visual working memory, as well as auditory frequency in auditory working memory (Borderie et al., 2024). Additionally, CANNs have been widely used to describe the neural representation and storage of continuous variables, such as head orientation (Stringer et al., 2002; Wang and Kang, 2022), visual orientation (Li et al., 2021), motion direction (Fung et al., 2010; Mi et al., 2014), and spatial location (Samsonovich and McNaughton, 1997; Yoon et al., 2013), and they align with experimental data. A core feature of CANNs is that the synaptic connection between two neurons depends solely on the difference between their preferred values of the continuous variable (Wu et al., 2013, 2016), implying that the synaptic connections are translation-invariant in the feature space. Moreover, the network employs population coding (forming bump-shaped neural activity patterns) to represent external stimuli, where the position of the bump's peak corresponds to the value of the external stimulus. These structural characteristics of CANNs and their mode of representing external stimuli have been empirically validated.
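The two properties highlighted above, translation-invariant connectivity and population coding by a bump, can be illustrated with a short sketch. It assumes a Gaussian connection profile on a ring of preferred orientations in [−π, π) and a population-vector readout; the kernel shape, the width `a`, the network size `N`, and the function names are assumptions for illustration and are not taken from the model specification.

```python
import numpy as np

N = 128
theta = np.linspace(-np.pi, np.pi, N, endpoint=False)  # preferred values


def wrap(d):
    """Wrap angular differences into [-pi, pi)."""
    return (d + np.pi) % (2.0 * np.pi) - np.pi


def connectivity(theta, J0=1.0, a=0.5):
    """J(theta, theta') depends only on the wrapped difference theta - theta'."""
    d = wrap(theta[:, None] - theta[None, :])
    return J0 * np.exp(-d ** 2 / (2.0 * a ** 2))


def decode(rates, theta):
    """Population-vector estimate of the bump position."""
    return np.angle(np.sum(rates * np.exp(1j * theta)))


J = connectivity(theta)

# A bump of activity centred on the preferred value of neuron 40.
stimulus = theta[40]
bump = np.exp(-wrap(theta - stimulus) ** 2 / (4.0 * 0.5 ** 2))
decoded = decode(bump, theta)
```

Shifting both neuron indices by the same amount leaves `J` unchanged, which is the discrete counterpart of translation invariance, and the decoded bump position coincides with the stimulus value.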
In this study, we adopted the synaptic view of working memory, which holds that information is stored in facilitated synaptic connections (Mongillo et al., 2008). The prefrontal cortex (PFC) is a crucial cortical region for the execution of working memory. A significant body of empirical evidence demonstrates that the connections between neurons in the PFC exhibit STP and are dominated by STF (Wang et al., 2006; Masse et al., 2020; Bocincova et al., 2022). Consequently, after the removal of external stimuli, information can be stored in the temporarily enhanced synaptic connections between neurons, rather than in the sustained firing of neurons (Rainer and Miller, 2002; Shafi et al., 2007), i.e., the memory resides in an activity-silent hidden state.
The synaptic mechanism of working memory postulates that memory information is primarily stored in the facilitated synaptic interactions between neurons, and that the synaptic efficacy determines the accuracy of the memorized item (Stokes, 2015; Wolff et al., 2015). However, other studies suggest that STD also plays a critical role in the storage and manipulation of memory information. Regarding storage, the limited capacity of working memory is directly proportional to the time constant of STD (Mi et al., 2017). In terms of manipulation, external dynamics that are irrelevant to the task but weakly related to the attributes of the remembered items can, through STD, alter the relative synaptic connection strengths between the neuronal groups encoding different items. This changes the relative accuracy of participants' recollection of different items in visual working memory in real time, producing a transition from the recency effect to the primacy effect (Knoedler et al., 1999; Li et al., 2021). Moreover, the time constant of STD determines the effective time window for dynamically manipulating working memory. Compared to previous works (particularly Li et al., 2021), the contributions of our study include: (1) We theoretically calculated the critical moment (T c ) at which the relative memory accuracy of two or more items in a SWM task undergoes a transition. We showed that T c depends on the time constants of short-term facilitation (τ f ) and short-term depression (τ d ), and on the sum of the item presentation duration and the inter-item interval (t * ). (2) Previous studies focused on manipulating two memory items. In this study, we investigated the changes in relative memory accuracy for multiple items and derived the critical moments at which these changes occur. Furthermore, our study indicates that the efficacy of the synaptic connections (Jux) within the neuronal groups encoding memory items not only determines the accuracy of item retrieval but is also related to the ordinal information of the items, suggesting the importance of STP in processing order information in SWM.
In our model, the efficacy of synaptic connections within neuronal groups, denoted as Jux, determines the accuracy of memory storage. On the one hand, u represents the short-term synaptic facilitation effect. According to the synaptic theory of working memory, information is retained in the facilitated synaptic connections between neurons within each neuronal group. Therefore, the accuracy of information storage gradually diminishes as u decreases. On the other hand, x represents the short-term synaptic depression effect, which provides the network with slow negative feedback. This negative feedback can make the neuronal activity bumps mobile (Fung et al., 2012), so that the bumps drift in the feature space of the network during the maintenance period [i.e., the memory drift phenomenon reported by Funahashi et al. (1989)]. Such drifting of activity bumps weakens the accuracy of memory representation (Seeholzer et al., 2019). Therefore, during the maintenance period, a decrease in u within the neuronal groups leads to a decay of memory accuracy; at the same time, if the STD effect in the network is sufficiently strong, it can also induce drifting of the weak neuronal activities, further reducing memory accuracy.
The design of the psychophysical experimental paradigm also affects the shift of the serial position effect. Our theoretical analysis and simulation results indicate that increasing the stimulus duration and the inter-stimulus interval changes the relative synaptic efficacy between the neuronal groups encoding different items. Consequently, this reduces the prominence and significance of the primacy effect and induces the shift of the serial position effect. These theoretical insights into how the psychophysical experimental design affects the serial position phenomena pave the way for further investigation into the control of SWM with multiple items.

FrontiersFIGURE
FIGURE The serial position effect in the SWM task for two items. (A) Schematic diagram of the psychophysical experiment paradigm. In the encoding phase, two distinct visual stimuli are presented sequentially. Each stimulus is presented for a duration T encode , and the two stimuli are separated by a time interval T gap . Following a delay period of T maintain , a recall cue lasting for T recall is presented. Participants are then asked to recall the orientation of either the first or the second visual stimulus, as indicated by the cue. (B, C) The temporal dynamics of the neural activity pattern in an example single-trial simulation. (B) Two visual stimuli, denoted as θ i (i = 1, 2), are presented sequentially during the encoding phase. In response to each stimulus, the network generates a bump-shaped neural activity pattern centered at the corresponding θ i . During the delay period, the neural activity gradually decays to a silent state. (C) The time course of the synaptic efficacies of the neural groups encoding the two visual stimuli throughout the trial. When a stimulus is presented, the synaptic efficacy of the corresponding neural group rapidly decreases. After the stimulus is removed, the synaptic efficacy gradually recovers to a maximum value, denoted as Jux max .

FIGURE The dependence of T c and T ′ c (THEO&SIM) on the variables τ d , τ f , and T gap . T c is calculated with several statistical measures: normalized target probability (orange asterisks), circular variance (green crosses), and circular kurtosis (red hollow circles). The theoretical analysis, T ′ c (THEO), is represented by the blue solid line, and the numerical simulations, T ′ c (SIM), are denoted by purple triangles. (A) T c and T ′ c (THEO&SIM) increase with increasing τ f ; (B) T c and T ′ c (THEO&SIM) increase with increasing τ d ; (C) T c and T ′ c (THEO&SIM) decrease with increasing T gap . T ′ c (THEO&SIM) ≈ T c in (A-C). For each given parameter set { τ d , τ f , T gap }, we computed the memory accuracy for different maintenance times T maintain at which the recall cue was presented. For every selected T maintain , the network was simulated repeatedly, with each simulation comprising multiple trials. For more details, see the Supplementary material.

FIGURE The dependence of the temporal dynamics of memory accuracy on different variables and parameters, particularly τ f , τ d , and T gap . (Left panel) The numerical simulation results for T c . (Right panel) The theoretical predictions (T ′ c (THEO)). (A) Both T ′ c (THEO) and T c increase with increasing τ f and τ d . (B) Both T ′ c (THEO) and T c increase with increasing τ f and decrease with increasing T gap . (C) Both T ′ c (THEO) and T c increase with increasing τ d and decrease with increasing T gap .

FIGURE The temporal pattern of memory accuracy in a SWM task for three items. (A, B) The temporal dynamics of the neural activity pattern in an example single-trial simulation. (A) Three items are sequentially loaded into the network, and the network generates three bump-shaped population firing patterns to represent the corresponding items. After the removal of the items, the neural activity decays to a silent state. (B) The synaptic efficacy of each neuronal group (Jux i (t) for i = 1, 2, 3) rapidly decays when the corresponding item is presented, then recovers to a maximum value, denoted as Jux max i , within a certain time, and remains at a high level for an extended duration. (C) The recall performance at varying T maintain . T 12 c and T 23 c mark the critical moments at which the recall performance shifts from the primacy to the recency effect. (Top) The normalized target probability of the ith presented item, denoted as P i , i = 1, 2, 3, at varying T maintain . (Middle) The normalized target probability difference (denoted as P 1 − P 2 ) and the relative synaptic efficacy (Jux 1 − Jux 2 , the red line) between the 1st and 2nd items. (Bottom) The normalized target probability difference (denoted as P 2 − P 3 ) and the relative synaptic efficacy (Jux 2 − Jux 3 , the red line) between the 2nd and 3rd items. (n.s., not significant; asterisks denote increasing levels of statistical significance.)
FIGURE A continuous attractor neural network (CANN) model with short-term plasticity (STP). (A) The schematic diagram of the CANN. Excitatory neurons are arranged in a ring based on their preferred visual orientations θ (θ ∈ [−π , π )). The connection strength between two excitatory neurons at θ and θ ′ is denoted as J(θ, θ ′ ), which depends only on |θ − θ ′ | (the varying shades of gray lines represent the connection strength) and is translation-invariant in the feature space. All excitatory neurons in the network are connected to a global inhibitory neuronal pool (the gray node). decays to within the time of τ f , and x(θ, t) returns to within τ d . Due to the dominance of the synaptic short-term facilitation effect,