Brain signatures predict communicative function of speech production in interaction

People normally know what they want to communicate before they start speaking. However, brain indicators of communication are typically observed only after speech act onset, and it is unclear when any anticipatory brain activity prior to speaking might first emerge, along with the communicative intentions it possibly reflects. Here, we investigated brain activity prior to the production of different speech act types, requests and naming actions, performed by uttering single words embedded into language games with a partner, similar to natural communication. Starting ca. 600 msec before speech onset, an event-related potential maximal at fronto-central electrodes, which resembled the Readiness Potential, was larger when preparing requests compared with naming actions. Analysis of the cortical sources of this anticipatory brain potential suggests a relatively stronger involvement of fronto-central motor regions for requests, which may reflect the speaker's expectation of the partner's actions typically following requests, e.g., the handing over of a requested object. Our results indicate that the different neuronal circuits underlying the processing of different speech act types become active even before speaking. Results are discussed in light of previous work addressing the neural basis of speech act understanding and predictive brain indices of language comprehension.


Introduction
When humans speak, they are driven by a wide variety of communicative goals and intentions. Words and sentences are used as tools for making recommendations, expressing invitations, for asking someone for help or support, for requesting an object or for naming it (Austin, 1975; Searle, 1969; Wittgenstein, 1953). Experimental linguistic research addressing speech production has so far mainly focused on the picture naming paradigm (or variations thereof), where participants have to name the object depicted in a line drawing shown on a computer screen (Miozzo, Pulvermüller, & Hauk, 2015; Strijkers, Costa, & Pulvermüller, 2017; Strijkers, Costa, & Thierry, 2010). This somewhat artificial paradigm was probably chosen because it is simple and easily administered. However, it is clear that the variability of communicative functions relevant in everyday language use is not appropriately covered when exclusively addressing naming, in particular in a computer-controlled experimental paradigm. Here, we focus on this variability of communicative function as it occurs in more natural ways of using language, by comparing two different speech act functions, naming and requesting an object in interaction with a partner. We ask specifically whether neurophysiological indicators recorded from the human brain discriminate between these speech acts during real social interactions and at which point in time any difference can first be recorded.
Previous neurocognitive research has reported brain correlates of communicative functions in speech and language comprehension (Bašnáková, van Berkum, Weber, & Hagoort, 2015; Bašnáková, Weber, Petersson, van Berkum, & Hagoort, 2014; Egorova, Pulvermüller, & Shtyrov, 2014; Egorova, Shtyrov, & Pulvermüller, 2013; Gisladottir, Bögels, & Levinson, 2018; Gisladottir, Chwilla, & Levinson, 2015; Tomasello, Kim, Dreyer, Grisoni, & Pulvermüller, 2019; van Ackeren, Casasanto, Bekkering, Hagoort, & Rueschemeyer, 2012; van Ackeren, Smaragdi, & Rueschemeyer, 2016). Results indicate that the comprehension of different speech act types, such as direct or indirect speech acts, or naming and requesting, has distinct brain correlates, even if the acts are conveyed by exactly the same words. These pre-existing results raise the question of whether the brain signatures distinguishing between speech act types are specific to the comprehension modality or rather general, thus persisting equally during production and comprehension of speech and language. Brain language theories, along with psycholinguistic models, make different predictions here. Modular and so-called stream models claim at least partly separate processing components for speech production and perception/comprehension, interlinked by way of interfaces (Hickok & Poeppel, 2007; Indefrey, 2011; Indefrey & Levelt, 2004; Levelt, Roelofs, & Meyer, 1999), thus implying that different brain areas contribute to speech production and understanding. In contrast, integration models claim that the same cognitive and brain mechanisms are at work when people use and understand language, although these models cannot, of course, be reduced to this claim, but include many more specific claims about the spatio-temporal activation patterns at work in different ways during speaking and understanding (Pickering & Garrod, 2013; Pulvermüller, 1999, 2018; Pulvermüller & Fadiga, 2010; Strijkers & Costa, 2016).
This leads to the prediction that speech production and understanding activate similar and strongly overlapping brain regions. Furthermore, at present, only a small minority of models explicitly take into account the mind and brain basis of communicative processes, thus allowing for strong predictions about pragmatic brain processes and, in particular, differences between speech act types (e.g., Pulvermüller, Moseley, Egorova, Shebani, & Boulenger, 2014; Pickering & Garrod, 2013). Here, we give a brief overview of recent work on communicative function processing and then outline an experiment addressing the mechanisms of communicative function processing in speech production.
Recent neurophysiological studies documented rapid or near-instantaneous brain correlates of the understanding of pragmatic information about communicative functions within the first 200 msec after presentation of critical communicative stimuli (Egorova et al., 2013; Tomasello et al., 2019). Multiple brain areas were involved in distinguishing the processing of different speech act types, for example, direct versus indirect speech acts (van Ackeren et al., 2012, 2016) or assertive (e.g., statements) versus directive ones (e.g., requests; Egorova et al., 2016; Tomasello et al., 2019). In particular, when comparing the speech acts of naming and requesting objects from a partner, a consistent finding is the involvement of the hand motor cortex in request understanding (Egorova et al., 2016; Tomasello et al., 2019; see also van Ackeren et al., 2016). This specific motor area activation can best be explained by the action-related nature of the request function. According to linguistic pragmatic theories (Alston, 1964; Fritz & Hundsnurscher, 1994; Hamblin, 1970; Kasher, 1988; Fritz, 2013), a specific speech act is embedded in an action sequence tree comprising the sequence of other (speech) acts that typically and regularly precede and follow it. In this framework, requesting-an-object, via its specific action sequence structure, is firmly associated with the action of the partner grasping an object and handing it over to the speaker and, as alternative follower actions, with the denial or rejection of the request. In contrast, naming-an-object would normally not be followed by overt object-oriented actions, nor by their denial or rejection. Other unspecific responses to speech acts, such as asking back, correcting the utterance or approving it, are shared by naming and requesting and by most other communicative actions.
Hence, the stronger activation of motor areas for requests may reflect the additional processing of the relatively richer knowledge about possible partner actions, which does not come into play in understanding naming actions. Against this background, integration models predict that the same differences between the brain signatures of requests and naming actions will also emerge in the production of speech acts, with motor areas being relatively more strongly active for requests. In contrast, brain language models positing largely separate production and understanding mechanisms do not predict such a commonality.
To appropriately address the brain signatures of speech act types in speech production, it is essential to choose an experimental paradigm where the same linguistic forms are used to perform different communicative acts. Otherwise, any differences between utterance forms might confound any distinctions in brain activation between speech acts. To facilitate the task of finding identical forms for different communicative functions, single words were chosen, as they are the most elementary means to perform speech acts (Dore, 1975; Wittgenstein, 1953). A noun such as 'coffee' can be used to request a cup of coffee, to ask for information (e.g., whether there is coffee on the menu), to name the content of a mug, or to inform about it. What these communicative functions have in common is the propositional content, namely the things and issues they are about, in this case that they are used to speak about coffee rather than, for instance, tea. Crucially, however, they differ in their communicative or speech act function (illocutionary force), namely whether they are used to request or to name (Austin, 1975; Searle, 1969). We measured event-related brain potentials as the same real objects were named and requested by the participants using the same single words. The communication partner, the objects available and the general arrangement of the communicative context were also the same between the two conditions. Note also that a paradigm was chosen that approximates natural social communicative interaction by establishing a dialogic language game context involving the speaker and a partner. To situate both speech act conditions in a context as natural as possible, participants were instructed to imagine the requests to take place between a merchant and a customer and the naming between an examiner and a testee.
We recorded event-related brain potentials continuously during social-communicative request and naming interactions and, in the evaluation, focused on a time interval starting as early as 2000 msec before the onset of speech production. This focus on anticipatory activity had the following reasons. First, overt speech production is accompanied by muscle movements which cause muscle artifacts in the recordings of brain responses (see, e.g., Miozzo et al., 2015; Strijkers et al., 2017). Before speech onset, these artifacts are absent or much reduced. Second, it is well known that, before overt movement onset, there are brain indicators of planning, decision and motor control processes, which are most pronounced in prefrontal, premotor and motor areas. Such anticipatory motor activity, known under the label of the 'Readiness Potential' or 'Bereitschaftspotential', has been documented for movements of different parts of the body (e.g., Di Russo et al., 2017; Kornhuber & Deecke, 1965) and for speech, too (Deecke, Engel, Lang, & Kornhuber, 1986; Galgano & Froud, 2008; Gunji, Hoshiyama, & Kakigi, 2000). Third, and most importantly, anticipatory slow waves have also been recorded during visual or acoustic perception of actions (Kilner, Vargas, Duval, Blakemore, & Sirigu, 2004) and in word and sentence comprehension (Grisoni, Miller, & Pulvermüller, 2017; Grisoni, Mohr, & Pulvermüller, 2019). Intriguingly, these anticipatory potentials reflect aspects of the meaning of the upcoming sounds or words. Therefore, they can be considered predictive brain activity providing semantic information and have been called the 'semantic prediction potential'.
We hypothesized that in contexts where the pragmatic function of upcoming speech acts is predictable, (i) an anticipatory potential similar to that documented for semantic prediction in sentence comprehension would appear prior to speech act production, (ii) different predictive brain indices would appear for different speech act functions, (iii) stronger anticipatory neural sources would emerge for requests compared with naming, and (iv) this additional activity for request relative to naming production would resemble that previously seen during the comprehension of requests relative to naming actions, including activation in sensorimotor brain regions (see, e.g., Egorova et al., 2016; Tomasello et al., 2019).

Materials and methods

Participants
Twenty-five healthy volunteers (12 females) were paid for their participation in the experiment. Our sample size was determined on the basis of a power analysis performed in G*Power 3.1.9.7 (Faul, Erdfelder, Lang, & Buchner, 2007). To this end, we took the effect size from Tomasello et al. (2019), who investigated the same two speech acts (naming and request) with the same method (EEG) as the current study, but in the comprehension modality. To detect an effect size of ηp² = .29 (Tomasello et al., 2019) with α = .05 and power = .8, a minimum sample size of 23 subjects was required; we recorded two additional subjects to compensate for potential subject exclusions. All subjects were monolingual German native speakers between 18 and 35 years old (mean age 24.7 years ± 3.9 SD). Subjects reported no neurological or reading/writing disorders and had normal or corrected-to-normal vision. They were all right-handed, as assessed with the Edinburgh Handedness Inventory (mean laterality quotient = 83.5 ± 15.3 SD) (Oldfield, 1971). The above-mentioned inclusion and exclusion criteria were defined prior to the study. All procedures were approved by the Ethics Committee of the Charité Universitätsmedizin, Campus Benjamin Franklin (Berlin, Germany) and were in agreement with the Declaration of Helsinki. All participants signed an informed consent form prior to the beginning of the experiment.
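For illustration, the conversion of the reported effect size into the metric G*Power works with, together with a simplified power computation, can be sketched as follows. This is only a sketch: G*Power's repeated-measures module additionally uses the correlation among repeated measures and a nonsphericity correction, which are not reported here, so the resulting N need not reproduce the paper's 23 exactly.

```python
import numpy as np
from scipy import stats

# Convert the partial eta-squared reported by Tomasello et al. (2019)
# into Cohen's f, the effect-size metric G*Power works with.
eta_p2 = 0.29
f = np.sqrt(eta_p2 / (1.0 - eta_p2))  # about .64, a large effect

def power_within_contrast(n, f, alpha=0.05):
    """Approximate power of a two-level within-subject contrast,
    treated as an F(1, n - 1) test with noncentrality f**2 * n."""
    df1, df2 = 1, n - 1
    crit = stats.f.ppf(1.0 - alpha, df1, df2)
    return stats.ncf.sf(crit, df1, df2, f**2 * n)

# smallest n reaching 80% power under these simplifying assumptions
n = 2
while power_within_contrast(n, f) < 0.80:
    n += 1
```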

Stimuli and procedure
The experimental material consisted of 96 real physical objects that were selected to be small and graspable. We specifically took care that all items were familiar and typical of everyday life situations. Naming consistency was confirmed by the tested population (see 'Audio processing' section below). The object stimuli were split into two different lists, such that their verbal labels, which were one or two syllables long, were matched for the following lexical and sub-lexical psycholinguistic variables: normalized lemma frequency, number of syllables, number of sounds, number of consonants at word onset, and normalized bigram and trigram frequency. The normalized lemma frequency was taken from two different databases of the German language: dlexDB (Heister et al., 2011) and SUBTLEX-DE (Brysbaert et al., 2011). The dlexDB database is based exclusively on written German material and is therefore representative of written language only. For this reason, we also included a measure of normalized lemma frequency from SUBTLEX-DE, which is based on German movie subtitles and thus reflects spoken German. Independent-samples t-tests failed to indicate any significant differences between the two stimulus lists on any of the aforementioned variables (for details, see Table 1).
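The matching procedure can be illustrated with one independent-samples t-test per variable; the frequency values below are hypothetical stand-ins, not the actual stimulus norms:

```python
from scipy import stats

# hypothetical lemma-frequency values for two matched lists; the real
# stimulus norms are not reproduced here
list_a = [12.4, 5.1, 33.0, 8.7, 19.9, 4.2]
list_b = [11.8, 5.6, 31.5, 9.3, 20.4, 4.0]

# one independent-samples t-test per psycholinguistic variable; matching
# is supported when the test fails to reject equality of the list means
t_val, p_val = stats.ttest_ind(list_a, list_b)
```

In the actual study, such a test was run for each of the matched variables listed above, with all p-values above the significance threshold.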

Familiarization phase
In order to increase the naming consistency of the objects used for the main experimental task (i.e., naming and requesting a desired object), subjects were familiarized with the object stimuli and their labels prior to the main experimental task.
To this end, a color photograph of each object, with the appropriate label written in white capital letters on a light grey background, was shown for two seconds in the middle of the screen (LCD U2412Mb, Dell Inc., Round Rock, TX). The order of the items was randomized. The stimulus presentation was controlled by PsychoPy2 software (Peirce, 2007). The subjects were instructed to pay attention to the images and labels, but not specifically to try to remember them.

Main experimental task
The main experimental task was divided into two blocks, corresponding to the naming and to the requesting condition, respectively, each including 48 trials. The participants' task was to name or request a self-selected item presented on the table by interacting with a confederate who, as the participant knew, was a member of the research team. We attempted to match the two conditions in several respects, including the social-communicative context, the actual setting comprising the persons, objects and basic actions relevant in it, and the linguistic tools used. In the naming condition, participants were instructed to imagine that they were taking part in a language test by interacting with an examiner, who assessed whether the real objects lying on the table were correctly named by the testee (with overt feedback being given only at the end of the task). In contrast, in the requesting condition, participants were asked to imagine that they were interacting with a salesman and, as customers, had to request the items for purchase. Both scenarios were kept as simple as possible to avoid any confound of situational complexity between naming and request contexts. In both conditions, participant and confederate were sitting on opposite sides of a table. At the beginning of the first trial of each block, two objects were placed on the table. After an auditory signal was given (trial onset signal), the participant had to mentally select one of the two items present on the table, then fixate their gaze on a red dot located at the center of the table so that neurophysiological responses (e.g., related to object selection) could return to baseline, and finally, after an additional self-determined interval of a few seconds (still while gazing at the red dot), request or name the object using a one-word utterance, for example 'Schere' (= scissors).
Subjects were specifically instructed to avoid the use of any other verbal material, including articles or politeness expressions such as 'Please' or 'Thank you'. Notice that no time constraint was given to the participants regarding the onset of their speaking. The subsequent reaction of the confederate differed between the two conditions. In the naming condition, the named object was removed from the table, placed in a basket not visible to the subject and replaced by another object. In the request condition, the requested object was also removed from the table and placed into a basket that was not visible to the subject, but in this case the basket had previously been designated as 'the subject's basket'. Just as in the naming condition, the requested object was subsequently replaced by a new one. Crucially, the trial structure was precisely the same for both conditions prior to and during the subject's utterance. The only difference was the location where the object was placed by the confederate after it had been named or requested (Fig. 1). The order of conditions, as well as the assignment of the object stimulus lists to one or the other condition, was counterbalanced across participants. Thus, each object stimulus was presented only once to each subject to avoid potential repetition effects known to reduce the cortical responses to repeated stimuli (Grill-Spector, Henson, & Martin, 2006; Nagy & Rugg, 1989). The presentation order of the objects within a block was managed as follows: all the items in one block were divided into four sets or 'bags' that were counterbalanced for the aforementioned psycholinguistic variables. The order in which the bags were taken was randomized and the presentation order of the objects within each bag varied across subjects. The side of the body (namely the left or right hand) with which the confederate performed her object manipulation responses was counterbalanced across subjects.
Namely, for the first half of the subjects, the items were always replaced by the confederate with her right hand, whereas for the second half, they were replaced with her left hand. The subject's basket in the request condition and the location where the objects were put in the naming condition were at opposite sides of the subject and were inverted for half of the subjects. Notice that at every trial, two objects were always present on the table, and one of them could remain on the table for several trials if not named/requested immediately by the subject. Moreover, as one out of two objects always had to be chosen, a single last item was left over on the final trial of every block; this last item was not subject to naming/request and was subsequently removed. In total, each subject performed 94 trials, one with each of the 2 × 48 objects minus the two left-over final items. One block had an approximate duration of 20 min, resulting in a total experiment duration of about 45 min, including a ca. 5 min break between blocks. In order to determine the voice onset of the produced utterances, the voice of the participants was continuously recorded during the entire experiment via a high-resolution microphone (SM58, Shure, Stuttgart, Germany) placed at a distance of approximately 70 cm from the subject's mouth.

Table 1 – Matching of the two lists of experimental words for psycholinguistic variables. Average values (Mean) as well as the standard error of the mean (SEM) are shown for each measure and for both lists, together with the results of independent-samples t-tests, including t-value, degrees of freedom (df) and error probability (p).

Cortex 135 (2021) 127–145

Audio processing
The time of voice onset was determined off-line by visually inspecting the waveforms of the audio recording using Audacity 2.1.1 software (https://sourceforge.net/projects/audacity/). The obtained markers were then temporally aligned with each subject's EEG signal. The trials were ascribed to five different categories based on the subject's responses and coded as follows:
- Correct: trials where the subject uttered exactly the target word, i.e., the exact same name that was associated with the specific object in the familiarization task.
- Synonymous: trials where subjects uttered synonymous words (e.g., 'Schüssel' instead of 'Schale' (= bowl)) or a compound word derived from the correct word (e.g., 'Briefumschlag' (= postal envelope) instead of simply 'Umschlag' (= envelope)).
- Incorrect: this category comprised all labels that were non-synonymous relative to the correct label (e.g., 'Schachtel' (= box) instead of 'Seife' (= soap)), even when they were semantically related (e.g., 'Blume' (= flower) instead of 'Pflanze' (= plant)).
- Mispronounced: trials where the subject seemingly intended to utter the correct word, but made a pronunciation error (e.g., 'Harken' instead of 'Haken' (= hook)).
- Invalid: trials that could not be performed at all because of technical problems in the experimental procedure.
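Voice onsets were marked by visual inspection in Audacity; an automated equivalent, sketched here with a hypothetical amplitude threshold, would flag the first point at which the short-term signal energy exceeds that threshold:

```python
import numpy as np

def detect_voice_onset(audio, sr, threshold, win_ms=10):
    """Return the time (s) at which the short-term RMS of `audio`
    first exceeds `threshold`, or None if it never does."""
    win = max(1, int(sr * win_ms / 1000))
    # moving RMS over a short analysis window
    rms = np.sqrt(np.convolve(audio ** 2, np.ones(win) / win, mode="same"))
    above = np.flatnonzero(rms > threshold)
    return above[0] / sr if above.size else None

# synthetic check: half a second of silence followed by a vowel-like tone
sr = 16000
t = np.arange(sr) / sr
sig = np.where(t < 0.5, 0.0, np.sin(2 * np.pi * 150 * t))
onset = detect_voice_onset(sig, sr, threshold=0.1)  # close to 0.5 s
```

Manual inspection, as used in the study, remains more robust against breathing noises and lip smacks than such a fixed threshold.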
Trials entered the EEG analysis only if they were classified as correct, based on the above-mentioned criteria. The average correct rate per subject in the two conditions

Fig. 1 – Schematic representation of the structure of experimental blocks in the naming and request conditions. In both conditions, confederate (C) and participant (P) are sitting at opposite sides of a table. Before the beginning of the naming block (top panel, in blue), participants were instructed to imagine that they were in a testing room partaking in a language test, interacting with an examiner. The beginning of each trial was signaled by an acoustic tone (a pure tone of 500 Hz). From that moment on, the participant could choose one of the two real objects on the table, fixate the red dot on the table for a few seconds, and then name the item of their choice. Finally, the named object was removed by the confederate and replaced by a new one. Before the beginning of the request block (bottom panel, in red), participants were instructed to imagine that they were in a shop, interacting with a salesman. Precisely as in the naming condition, the beginning of each trial was signaled by the acoustic tone. From that moment on, the participant chose one of the two objects on the table, fixated the red dot on the table for a few seconds, and then requested the item of their choice. Finally, the requested object was put into the participant's basket, and a new object was placed on the table.

EEG recording
The EEG was recorded in an electrically and acoustically shielded chamber through 64 active electrodes embedded in a fabric cap (the green and yellow subsets of electrodes from the actiCAP 128Ch Standard-2; Brain Products GmbH, Munich, Germany). These were arranged according to the conventional 10–10 layout with the following modifications: the reference was moved from the FCz position to the tip of the nose, and the electrode occupying the PO10 position was moved to the now empty FCz position. The PO9 and FT9 electrodes were reassigned as EOG channels placed below and above the left eye, respectively, and the FT10 electrode was placed at the right outer canthus, so as to measure the vertical and horizontal electro-oculograms. All electrodes were referenced to an electrode placed on the tip of the nose. Data were amplified and recorded using the Brain Vision Recorder (version 1.20.0003; Brain Products GmbH) with a passband of .1–250 Hz, sampled at 500 Hz and stored on disk. Impedances of all active electrodes were kept below 10 kΩ.

EEG data preprocessing
The EEG data were processed with the EEGLAB 14.10b toolbox (Delorme & Makeig, 2004). Data were down-sampled to 250 Hz and band-pass filtered at .1–30 Hz. The signal from the electrodes above and below the left eye was used to derive a bipolar vertical EOG channel, and the horizontal EOG was computed as the average of these two electrodes minus the potential at the right outer canthus. Noisy EEG channels were removed from each dataset after visual inspection. Independent component analysis (ICA) was carried out with the standard algorithm included in the EEGLAB toolbox and was set to generate 35 independent components from the EEG data. The resulting independent components were identified as artifactual using two procedures. First, we identified components capturing eye movements as those correlating (|R| > .3) with either of the previously generated horizontal and vertical EOG channels and removed them from the data to minimize eye-related artifacts (Groppe, Makeig, & Kutas, 2009; Tomasello et al., 2019). Second, we identified components capturing articulatory activity as those correlating (|R| > .3) with the signal of any of the channels FT7, FT8 or the lower EOG channel; these three electrodes were the ones most likely to be affected by articulatory artifacts, due to their location on top of the temporal muscles and relatively close to the mouth, respectively. The components marked as artifactual were subtracted from the EEG data. On average, 3.2 (range: 2–6) out of 35 components were removed from each participant's dataset because of ocular activity and 3.3 (range: 2–6) because of articulatory activity. The previously removed EEG channels were then interpolated with the standard EEGLAB toolbox method. Subsequently, the data were segmented into epochs starting 2000 msec prior to voice onset (VO) and ending 500 msec after it (Fig. 2A). Thus, data were epoched in a response-locked fashion.
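The component-flagging logic, with the |R| > .3 criterion described above, can be sketched on synthetic data (an illustration of the procedure, not the EEGLAB code actually used):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5000
veog = rng.standard_normal(n_samples)  # stand-in vertical EOG trace

# hypothetical IC time courses: IC 0 leaks ocular activity, IC 1 does not
components = np.vstack([
    0.8 * veog + 0.2 * rng.standard_normal(n_samples),
    rng.standard_normal(n_samples),
])

def flag_artifact_ics(ics, reference, r_thresh=0.3):
    """Indices of components whose absolute Pearson correlation with
    `reference` exceeds `r_thresh` (.3, as in the procedure above)."""
    return [i for i, ic in enumerate(ics)
            if abs(np.corrcoef(ic, reference)[0, 1]) > r_thresh]

bad_ics = flag_artifact_ics(components, veog)
```

The same call, run against the FT7, FT8 and lower-EOG signals, would flag the articulatory components; flagged components are then subtracted from the data.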
Baseline correction was applied by subtracting from the data the average voltage of a 200 msec time window between −2000 and −1800 msec relative to VO. This was done because we expected the anticipatory activity to resemble the Readiness Potential (RP), which typically starts less than 1 sec before movement onset (Di Russo et al., 2017; Kornhuber & Deecke, 1965). An artifact rejection procedure was applied only in the time window from −2000 msec to VO. We focused on the time range before VO, as this is where any relatively uncontaminated anticipatory activity may occur (Grisoni et al., 2017; Kornhuber & Deecke, 1965; Shibasaki & Hallett, 2006). Trials were rejected if their potential exceeded ±150 µV, a threshold chosen based on previous speech production studies (Aristei, Melinger, & Abdel Rahman, 2011; Rose, Aristei, Melinger, & Abdel Rahman, 2019; Strijkers et al., 2010).
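The baseline-correction and rejection steps can be sketched as follows (a minimal illustration with made-up array shapes, not the EEGLAB routines actually used):

```python
import numpy as np

def preprocess_epochs(epochs, times, bl_window=(-2.0, -1.8), reject_uv=150.0):
    """Baseline-correct epochs (trials x channels x samples) against the
    mean of `bl_window` (in s relative to voice onset) and drop trials
    whose absolute amplitude exceeds `reject_uv` microvolts before VO."""
    bl = (times >= bl_window[0]) & (times < bl_window[1])
    # subtract the per-trial, per-channel baseline mean
    corrected = epochs - epochs[:, :, bl].mean(axis=2, keepdims=True)
    # reject only on the pre-voice-onset interval
    pre_vo = times < 0.0
    keep = np.all(np.abs(corrected[:, :, pre_vo]) <= reject_uv, axis=(1, 2))
    return corrected[keep], keep
```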
In the current dataset, the average trial rejection rate per subject across collapsed conditions was 3.6%. The trial rejection rate was comparable between conditions, as assessed by a paired t-test (t(24) = −.21, p = .84). Only subjects with a trial rejection rate <20% in both conditions were included in the analysis. Following this criterion, one subject was excluded from the final analysis. Additionally, one subject was excluded because, when collapsing the two conditions, their average ERP measured between −2 s and VO was beyond ±2.5 SD from the grand-average for at least 10% of the time points. A third subject was excluded because of the two above-mentioned criteria combined with self-reported illness on the testing day.
Overall, twenty-two subjects out of twenty-five entered the final EEG analysis. An additional set of analyses was also computed with a more conservative artifact rejection criterion of ±100 µV to further ensure that any significant differences between brain responses were not affected by artifacts (for more detail, see Supplementary Material).
With the aim of estimating the muscular activity recorded during the task, we performed an additional, separate analysis of the neurophysiological data. As the spectral power of muscular activity (e.g., from the articulators) increases with frequency (especially above 20 Hz), whereas that of the EEG signal decreases with frequency and is relatively low above the beta range (Cacioppo, Tassinary, & Fridlund, 1990; Goncharova, McFarland, Vaughan, & Wolpaw, 2003; Pulvermüller, Birbaumer, Lutzenberger, & Mohr, 1997), the raw neurophysiological data, after down-sampling to 250 Hz, were high-pass filtered at 20 Hz. Subsequently, the data were epoched and baseline-corrected with the same parameters as in the main analysis, followed by full-wave rectification and the calculation of the upper envelope on an individual-trial basis. Finally, grand-averages of the pooling of the same EEG channels used for the cluster-based permutation test (Fig. 2D) were calculated separately for the naming and request conditions. The resulting grand-average is an estimate of the strength and temporal unfolding of (mainly) muscular activity prior to word production (see Fig. 2B).

Fig. 2 – (A) Recordings from the mid-fronto-central electrode FCz. The X axis represents time in seconds before and after speaking onset (voice onset, VO) and the Y axis represents the ERP amplitude in microvolts (µV). The grayed areas indicate the time windows where the differences between naming and request were significant (after Bonferroni-corrected post-hoc t-tests), as well as their respective significance levels. (B) EMG activity measured by pooling the same channels that were used for the cluster-based permutation test. (C) ERP topographies for naming and request trials from −2000 msec to VO, given as maps each displaying average potentials in time windows of 200 msec. Each map shows the head and recording array from above, with the nose pointing upward.
(D) Electrodes used in the ANOVA (poolings indicated in bright and dark green) and in the cluster-based permutation test (electrodes indicated in bright and dark green, plus the purple electrodes). (E) Source analysis results for request (in red) and for naming (in blue), computed in the time window from −600 msec to voice onset, where significant differences between conditions were found. Notice the additional fronto-central activation, highlighted in yellow, for the request function. The box shows the resulting difference source maps for request minus naming (in magenta) and naming minus request (in cyan). Source strength was thresholded at .02 a.u.

ERP data analysis
To determine any differences in amplitude and peak latencies between the two conditions (naming and requesting), and to avoid the problem of multiple comparisons, a first statistical analysis was performed using a (non-parametric) cluster-based permutation test (Maris & Oostenveld, 2007; Sassenhagen & Draschkow, 2019) as implemented in the FieldTrip toolbox. As the readiness potential (RP) typically occurs about one second prior to speech onset and is largest over fronto-central-parietal electrodes (Kornhuber & Deecke, 1965; Shibasaki & Hallett, 2006), we centered our analysis on the time period between −1000 msec and VO and restricted it to 45 frontal, central and posterior electrodes (frontal: F7, F5, F3, F1, Fz, F2, F4, F6, F8, FC5, FC3, FC1, FCz, FC2, FC4, FC6; central: T7, C5, C3, C1, Cz, C2, C4, C6, T8, CP5, CP3, CP1, CPz, CP2, CP4, CP6; posterior: P7, P5, P3, P1, Pz, P2, P4, P6, P8, PO7, POz, PO8) (see Fig. 2D). The cluster-based permutation test was computed by randomly exchanging data between the two stimulus conditions and extracting the maximal positive and negative cluster of each permutation (5000 permutations). In addition, we repeated the same test in a smaller time window from −1000 to −200 msec before VO, thus excluding the last 200 msec of the previous analysis, which may have been contaminated by articulatory artifacts, as previous studies reported articulatory movements preceding voice onset by ca. 200 msec (Fargier, Bürki, Pinet, Alario, & Laganaro, 2018; Salmelin, 2010, p. 143). Furthermore, we ran the same permutation test on the vertical and horizontal EOG channels to ensure that no significant differences were present between the two conditions in the signals recorded from the ocular electrodes. All cluster-based permutation tests were considered significant only if clusters with p < .025 two-tailed were found, resulting in a critical α = .025 corresponding to a false alarm rate of .05.
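A minimal one-dimensional analogue of the cluster-based permutation logic is sketched below. FieldTrip's implementation additionally clusters over neighboring electrodes and extracts both positive and negative clusters; the threshold and synthetic data here are illustrative only:

```python
import numpy as np

def cluster_perm_test(diff, n_perm=1000, t_thresh=2.0, rng=None):
    """Minimal 1-D cluster-based permutation test on per-subject
    condition differences (subjects x timepoints), using sign flips."""
    if rng is None:
        rng = np.random.default_rng()
    n_sub = diff.shape[0]

    def max_cluster_mass(d):
        # pointwise one-sample t-values across subjects
        t = d.mean(0) / (d.std(0, ddof=1) / np.sqrt(n_sub))
        mass, best = 0.0, 0.0
        for ti in t:  # sum t-values within supra-threshold runs
            mass = mass + ti if ti > t_thresh else 0.0
            best = max(best, mass)
        return best

    obs = max_cluster_mass(diff)
    null = np.array([
        max_cluster_mass(diff * rng.choice([-1.0, 1.0], size=(n_sub, 1)))
        for _ in range(n_perm)
    ])
    # Monte Carlo p-value of the largest observed cluster
    p = (np.sum(null >= obs) + 1) / (n_perm + 1)
    return obs, p

# synthetic data: 20 subjects, 100 timepoints, a real effect in samples 40-60
rng = np.random.default_rng(42)
diff = rng.standard_normal((20, 100))
diff[:, 40:60] += 1.0
obs, p = cluster_perm_test(diff, rng=rng)
```

Because the largest cluster mass under permutation forms the null distribution, the test controls the family-wise error rate over all timepoints (and, in the full spatio-temporal version, over all electrodes) at once.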
The aforementioned cluster-based permutation tests were complemented with a repeated-measures analysis of variance (ANOVA), which allowed a more fine-grained analysis of the temporal and spatial extent of the effects. Hence, 36 channels (see Fig. 2D) were grouped according to the topographical factors Laterality and Gradient. Additionally, the mean voltage before voice onset was averaged within five 200 msec time windows (TW1: −1000 to −800 msec, TW2: −800 to −600 msec, TW3: −600 to −400 msec, TW4: −400 to −200 msec and TW5: −200 to 0 msec). These values entered the ANOVA with the factors Communicative act [two levels: naming and request], Laterality, Gradient, Time window [five levels: TW1, TW2, TW3, TW4, TW5] and Exposure time [two levels: first vs second experimental block]. This analysis also addressed the issue of whether there might have been any differences between speech act conditions across the experiment, for example greater fatigue effects in one condition than in the other. In addition, to test whether the side of the confederate's response action (who used either the left or the right hand when manipulating the objects) was reflected in the topographies of the brain responses, we repeated the main ANOVA with the additional between-subject factor Confederate response hand [two levels: left and right hand]. Finally, a 2-way repeated-measures ANOVA with the factors Communicative act [two levels: naming and request] and Time window [five levels: TW1, TW2, TW3, TW4 and TW5] was run on the horizontal and vertical EOG channels to explore whether any statistically significant EEG differences between conditions found at scalp channels could be due to differences in ocular activity. Greenhouse-Geisser correction (Geisser & Greenhouse, 1959) was applied to the degrees of freedom whenever the sphericity assumption was violated. Corrected p-values, along with epsilon (ε) values, are reported throughout. Partial eta-squared (ηp²) values are also stated as an index of effect size (.01–.06: small; .06–.14: medium; > .14: large; Cohen, 1988).
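The reported effect-size measure can be recovered directly from an F statistic and its degrees of freedom via ηp² = F·df_effect / (F·df_effect + df_error). The helpers below are a minimal sketch of that computation together with Cohen's (1988) benchmarks as stated above; the function names are ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def effect_size_label(eta_p2):
    """Cohen's (1988) conventions: .01-.06 small, .06-.14 medium, > .14 large."""
    if eta_p2 > 0.14:
        return "large"
    return "medium" if eta_p2 >= 0.06 else "small"
```

For example, the Laterality × Gradient interaction reported in the Results, F(4, 84) = 3.30, yields ηp² = 13.2 / 97.2 ≈ .14, matching the value stated there.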

Source level analysis
To localize the cortical origin of the neurophysiological responses of the naming and request functions before speech onset, we performed a distributed cortical source analysis. Source solutions were calculated on the grand-averaged responses, which benefit from a higher signal-to-noise ratio (SNR) (Egorova et al., 2013; Hauk et al., 2006; Shtyrov, 2011). They were restricted to the latencies where significant effects between speech act conditions were found in the statistical analysis of the event-related potentials (i.e., −600 msec to voice onset), calculated relative to a baseline from −2200 to −2000 msec before voice onset. In addition, to further examine whether the two conditions differed in terms of the involved sources, we obtained difference source maps by subtracting the resulting brain sources of naming and request from each other. We used the structural MRI included in SPM12 to create a cortical mesh of 8196 vertices. The volume conductors were constructed with an EEG (3-shell) boundary element model. The method used for source estimation was the multiple sparse priors (MSP) technique, specifically the 'greedy search' algorithm (Litvak et al., 2011), which had previously been used in our laboratory (e.g., in Grisoni et al., 2017; Tomasello et al., 2019). Activation maps were then smoothed using a Gaussian kernel of FWHM 12 mm. Each region emerging from the source maps is reported with its respective cortical label.

Acoustic analysis
Cortex 135 (2021) 127–145

To test whether differences in the RP component between naming and requesting were driven by differences in the articulatory preparation of speech execution, we performed additional analyses on the produced utterances. The acoustic profile of the vocalizations was quantified in terms of duration (msec), loudness (RMS, dB), pitch (F0, Hz), jitter (msec), shimmer (dB) and harmonics-to-noise ratio (HNR, dB). To this end, the software PRAAT 6.0.49 (http://www.praat.org) was used to compute the mean of the acoustic properties mentioned above. The resulting values were then averaged across all vocalizations produced in the naming and request contexts, yielding two values for each participant. Finally, Wilcoxon signed-rank tests were used to statistically compare the acoustic properties of the produced words between the naming and requesting functions across subjects.
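The per-participant comparison can be illustrated with a minimal Wilcoxon signed-rank test (normal approximation, zero differences dropped, midranks for ties). This is only a sketch of the procedure in pure Python with hypothetical function names; the study itself would have used standard statistics software.

```python
import statistics

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test with normal approximation.
    x, y: paired per-participant means (e.g., mean pitch in naming vs request).
    Assumes at least one non-zero difference. Returns (W+, p)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    # rank |d|, assigning midranks to tied magnitudes
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mu = n * (n + 1) / 4
    sigma = (n * (n + 1) * (2 * n + 1) / 24) ** 0.5
    z = (w_plus - mu) / sigma
    p = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return w_plus, p
```

In the study's design, one such test would be run per acoustic measure, comparing the two condition means of each participant.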

Cluster-based permutation tests on ERP data

Fig. 2A illustrates the grand average recorded at the mid-frontocentral electrode FCz in the naming (in blue) and request (in red) conditions. Visual inspection of the ERPs shows an overall more pronounced negativity for requests compared to naming. Fig. 2C illustrates the scalp distribution of the ERPs for naming and requesting, time-locked to the VO.
Visual inspection of these topographies shows that both conditions are characterized by a progressive negativity building up at central electrode locations. To test for significant differences between the naming and request conditions, we performed cluster-based permutation tests on the large time window from −1000 msec to VO and across the previously defined pool of frontal, central and posterior electrodes (see section 2.8 'ERP Data Analysis' and Fig. 2D for more details). The test detected a statistically significant difference between the naming and requesting conditions (p = .003, with the significance threshold adjusted for two-tailed comparisons to p < .025). The difference between conditions was most pronounced in the time window from about −430 msec to about −130 msec relative to VO. The same analysis performed on the smaller time window from −1000 to −200 msec confirmed the previous results (p = .003, with the significance threshold adjusted for two-tailed comparisons to p < .025), revealing differences most pronounced in the time window from −430 msec to the end of the tested time window (i.e., −200 msec). To ensure that the significant differences between the two conditions could not be due to differences in ocular activity, we performed two additional permutation tests on the horizontal and vertical EOG channels, respectively. These did not reveal any significant differences between the naming and requesting conditions in the hEOG (no clusters found) or vEOG responses (all clusters with p > .129, with the significance threshold adjusted for two-tailed comparisons to p < .025).

ANOVA on ERP data
The cluster-based permutation tests were complemented by a 4-way repeated-measures ANOVA (Communicative act x Laterality x Gradient x Time window) performed on the neurophysiological brain responses of the naming and request functions during speech preparation. This analysis revealed a significant main effect of Communicative act (F (1, 21)) and a significant interaction of Communicative act and Time window. The significant interaction was confirmed by post-hoc t-tests (Bonferroni-corrected for 5 comparisons) showing that the differences between the naming and requesting conditions were specific to the last three tested time windows (TW3: −600 to −400 msec, p = .002; TW4: −400 to −200 msec, p < .001; and TW5: −200 to 0 msec, p < .001) (see Fig. 2A). Furthermore, the repeated-measures ANOVA revealed a significant interaction between the topographical factors Laterality and Gradient (F (4, 84) = 3.30, ε = .49, p = .048, ηp² = .14), which was due to the fronto-central maximum of the negativity and its polarity reversal at posterior electrodes (see Fig. 2C). Post-hoc t-tests (Bonferroni-corrected for 9 comparisons) showed that along the midline, the negativity was greatest at anterior (p < .001) and central electrodes. Similar to the cluster permutation tests, we ran two additional 2-way repeated-measures ANOVAs (Communicative act x Time window) on the horizontal and vertical EOG channels, respectively, to test for any significant differences in ocular EOG activity between the two conditions. However, no significant differences were found between the naming and request functions either in the horizontal (F (1, 21) = .08, p = .784) or in the vertical EOG (F (1, 21) = 1.22, p = .281). Finally, visual inspection of the EMG activity (Fig. 2B) revealed that muscular activity might have been manifest starting around 200 msec before voice onset. However, the time course of the measured EMG was virtually identical in the two conditions over the entire time window.

Source analysis
To identify the cortical sources underlying the significantly different neurophysiological responses recorded prior to naming and request actions, we conducted a distributed source localization on those time windows where the interaction between the factors Communicative act and Time window was significant (−600 to 0 msec relative to voice onset). Sources of the EEG responses for naming and request revealed activation of temporo-frontal regions (for more detail, see Table 2), with the request function additionally activating bilateral motor cortex (BA3/4, with peak coordinates x = −26, y = −27, z = 56 and x = 26, y = −27, z = 58; see Fig. 2E), which was not activated in naming. The proportion of unexplained variance was ca. 8% for both source estimates, which is comparable to that reported in previous studies and indicates successful source estimation (e.g., Miozzo et al., 2015). In addition, we computed the difference source maps of request − naming and of naming − request to further scrutinize the specific difference in cortical locus between these two speech acts. The results confirmed that requests produced stronger bilateral motor cortex activations as compared with naming. Conversely, mid-dorsal prefrontal and anterior-inferior temporal activation foci tended to be relatively stronger for naming (see Fig. 2E). Note that no significant differences in source statistics were found, likely due to variability in single-subject ERPs; the source maps for the significant ERP time window were therefore computed to take advantage of the large signal-to-noise ratio of the grand average (see, for instance, Egorova et al., 2013; Hauk et al., 2006; Shtyrov, 2011).

Acoustic analysis
It is possible that the neurophysiological differences relate to differences in the physical effort subjects spent during articulation. Although our EMG data speak against this possibility (see Fig. 2B), it is important to also assess possible differences in the acoustic makeup of the speech produced during naming and requesting. To this end, we performed an acoustic analysis of the produced utterances. Wilcoxon signed-rank tests performed on these data did not show any significant differences in utterance duration (msec), loudness (RMS, dB), pitch (F0, Hz), jitter (msec), shimmer (dB) or harmonics-to-noise ratio (HNR, dB; see Fig. 3 and Table 3).

Discussion
Before naming an object or requesting it from a partner, neurophysiological activations indicate the speaker's communicative intention, that is, the communicative function of the intended speech act. Specifically, a negative-going anticipatory potential resembling the Readiness Potential (RP) preceding the onset of motor acts appeared already ca. 600 msec before voice onset and, interestingly, distinguished between speech acts. Because the RP predicts upcoming movements and their muscular origin, whereas the anticipatory wave reported here indicates linguistic-communicative function, we prefer a different name for it and follow Grisoni et al. (2019) in calling it 'prediction potential' and, to be more specific to the present context, 'pragmatic prediction potential' (PPP). PPP amplitudes were larger when the same verbal materials were used in interactions with a partner for requesting objects in a role play of shopping than when they were uttered for naming objects in a role play of language testing. Calculation of the cortical currents underlying the predictive pragmatic potential before naming/requesting suggested differences between the underlying source constellations. In the motor system, more precisely in the motor cortex controlling the hand, sources were stronger prior to requesting than prior to naming. Even though the sources computed from grand averages must be interpreted cautiously, this result is compatible with the proposition that the brain correlates of request production reflect aspects of pragmatic information relevant for this speech act type. Speech acts are characterized by the set of partner actions they regularly entail, and for requests for physical objects, one of these typical partner reactions is the handing over of the requested item.
It is possible that the motor system activation in anticipation of request production reflects the prediction, immanent to requests, that the partner will perform the hand movement required to comply with the request. This interpretation draws on the idea that motor system activity can index the actions of a partner, a finding well established by the body of research on mirror neurons (e.g., Rizzolatti & Craighero, 2004; Rizzolatti, Cattaneo, Fabbri-Destro, & Rozzi, 2014). Importantly, additional motor system activation for requests as compared with naming has frequently been reported in previous studies of speech act understanding using spoken, written and gestural utterances (see Section 4.1).

Brain activity anticipating upcoming speech act production
It is important to point out that previous studies have already investigated the neural basis of communicative function processing. These studies had participants observe or listen to recorded social interactions from a third-person perspective (computer-based experiments) (Bašnáková et al., 2014, 2015; Egorova et al., 2013, 2014, 2016; van Ackeren et al., 2012, 2016). One very recent study used a second-person perspective by presenting word–gesture combinations directly addressing the experimental subject, who occasionally had to respond to the perceived communicative acts by pointing to or handing over an object (Tomasello et al., 2019). Here, we complemented this previous research by looking at the first-person perspective, the case where the experimental subject herself performs the critical speech act in the context of language games, that is, simulated social interactions with a real confederate. Pragmatics, as the study of language use in social context, requires such and related attempts to place language in communicative interaction contexts. Our study is thus in the spirit of recent developments in cognitive neuroscience examining the neural underpinnings of cognitive processes in general (Czeszumski et al., 2020; Hasson & Honey, 2012; Kasai, Fukuda, Yahata, Morita, & Fujii, 2015) and of linguistic processes more specifically (Goregliad Fjaellingsdal et al., 2020; Hasson, Ghazanfar, Galantucci, Garrod, & Keysers, 2012; Kuhlen, Allefeld, Anders, & Haynes, 2015) in more naturalistic and social settings.
In the present experiment, we show that the readiness-potential-like anticipatory brain activity called 'prediction potential', which indicates semantic expectancy (for a review, see …), can also index linguistic-pragmatic information about upcoming communicative actions. Information about the speech act function of utterances was manifest 600 msec before voice onset. Previous studies reported similar anticipatory brain activity prior to perceiving predictable action sounds, specific actions and visually related written or spoken words (Grisoni et al., 2017; Grisoni, Tomasello, & Pulvermüller, 2020), and predictable words in sentence context more generally (León-Cabrera, Flores, Rodríguez-Fornells, & Morís, 2019; León-Cabrera, Rodríguez-Fornells, & Morís, 2017). Several of these studies also reported that differences in meaning between the predictable words were reflected in the topographies of the PP, thus providing a neural estimate of semantic prediction. A relevant critical question is to what extent the PPP found in the present study relates to the one found during semantic expectation in work on language comprehension (Grisoni et al., 2017). Although the two components, the semantic and the pragmatic PP, emerged in different modalities (comprehension and production) and reflected in one case a difference in the meaning of the predicted target words and in the other a difference in speech act function, they are similar in at least four ways. First, they appear before a predictable meaningful symbol-in-context (a word in semantic or speech act context). Second, they emerge slowly with a negative-going polarity and a maximum at fronto-central recording sites. Third, their sources lie in part in motor systems. Fourth, unlike the RP, which indexes basic motor movement, they reflect higher cognitive information about action-related meaning aspects or predictable partner actions at abstract semantic-pragmatic levels.
Importantly, as reported in the present study, these prediction potentials are modulated by the linguistic semantic or pragmatic information attached to the upcoming utterances, i.e., by their meaning and communicative function. Thus, there appears to be a new family of brain responses, superficially similar to the RP, but with a much broader and more far-reaching cognitive scope, which may be of relevance for future investigations into the brain's prediction mechanisms.
To further examine the consistency of the findings across neuropragmatic studies, Fig. 4 presents results of previous experiments revealing brain indexes of speech act function. When calculating the distributed cortical sources of the PPP obtained in the present study for naming and request contexts (see Fig. 2E), bilateral precentral cortex activity tended to be stronger in preparation of request production than before naming actions. This was confirmed when calculating the difference source maps between the request and naming functions (i.e., request − naming and naming − request; see box in Fig. 2E). The peak voxel of the left precentral motor focus for requests had the MNI coordinates x = −26, y = −27, z = 56, which is only 11 mm away from the one found in the recent neuropragmatic study by Tomasello et al. (2019) investigating the comprehension of requests compared to naming from a second-person perspective (x = −28, y = −38, z = 58). As shown in Fig. 4, the study by Egorova et al. (2016), which addressed speech act comprehension from a third-person perspective, and the work on indirect requests by van Ackeren et al. (2012, 2016) also yielded comparable precentral activations. In spite of the degree of uncertainty immanent to neurophysiological source localisation (Hämäläinen & Ilmoniemi, 1984; Ilmoniemi, 1993), these neuropragmatic activation foci are localised close to each other and also close to an index of semantic processing of hand-related action verbs and finger movement reported by Hauk, Johnsrude, and Pulvermüller (2004). This convergence of neuropragmatic results with pre-existing semantic brain indexes indicates that close-by, overlapping or shared neuronal sources underlie the processing of requests both in anticipation of speech act production and in speech act comprehension.
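The quoted 11 mm figure is simply the Euclidean distance between the two reported MNI peak coordinates, which can be verified in a couple of lines (the helper name is ours):

```python
import math

def mni_distance(p, q):
    """Euclidean distance in mm between two MNI peak coordinates."""
    return math.dist(p, q)

# present study's left precentral request peak vs Tomasello et al. (2019)
d = mni_distance((-26, -27, 56), (-28, -38, 58))  # sqrt(129) ≈ 11.4 mm
```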
This finding is consistent with, and therefore provides support for, neurocognitive psychological and linguistic theories claiming that shared neuronal mechanisms are engaged in comprehending and producing social communicative actions (Pickering & Garrod, 2004, 2013; Pulvermüller, 1999, 2018; Strijkers & Costa, 2016). These results are not easily accounted for by brain language models that postulate a separation between the brain mechanisms of speech production and comprehension, or by models not acknowledging a role of the motor system in semantic-pragmatic understanding and production. The evidence for a role of motor activity in indexing the pragmatic function of language sits nicely with a broad range of recent studies supporting the relevance of motor areas for the processing of other types of linguistic information, in particular at the semantic (Dreyer et al., 2015; Dreyer, Picht, Frey, Vajkoczy, & Pulvermüller, 2020; Grisoni et al., 2016; Hauk et al., 2004; Pecher, Zeelenberg, & Barsalou, 2004; Pulvermüller, Hauk, Nikulin, & Ilmoniemi, 2005; Shtyrov, Butorina, Nikolaeva, & Stroganova, 2014; Tomasello, Garagnani, Wennekers, & Pulvermüller, 2017; Tomasello et al., 2018; Vukovic & Shtyrov, 2014) and phonological levels (D'Ausilio et al., 2009; Pulvermüller et al., 2006).

Fig. 4 – Motor cortex activation during request production and comprehension and during understanding of single action words. 6 mm spheres centered on the peak activation coordinates of request speech acts reported in the current production study (shown in cyan) and in previous speech act comprehension studies (other colors). For comparison, peak activation coordinates are also given (in red) for the overlap activation between action verb comprehension and finger motor localizer tasks (Hauk et al., 2004). Only peaks in the left motor cortex system are reported.

Methodological considerations
In contrast to much work on speech production focusing on object naming in computer-controlled experiments, we here set out to move towards a novel paradigm that closely approximates real-life social communicative interaction. It is clear that such an endeavor cannot result in real communication but can only approximate this goal to a degree, as it is necessary to control aspects of the experiment so as to allow for conclusions about the critical variable, in the present case speech act function. However, it is also clear that settings approximating real life are in greater danger than standard ones of being confounded by poorly controlled factors. We did our best to exclude some putative confounding factors, which we summarize below. Crucially, the differences in neural response between communicative actions prior to speech cannot be attributed to the verbal and object materials used, because across subjects the same materials were used in preparing and performing naming and request functions; nor can they be due to mere anticipation of the motor response produced by the communication partner, as this response was very closely matched between both communicative actions. To this end, we avoided differences in the subjects' own actions following the critical (naming/request) speech act. In particular, subjects were not expected to perform hand actions during the experiment or to respond otherwise to the confederate's activities. The only difference between conditions was the location where the object was placed: during requests, the object was placed in the 'subject's basket', whereas in the naming condition, the object was removed from the table and placed in a 'non-specified basket', a minor difference which, we believe, is unlikely to explain the profound neurophysiological differences.
We performed a further analysis to explore whether the side of the confederate's response actions might have caused differences in the topographies of the brain responses. This analysis did not show statistically significant differences, indicating that the predictive brain responses abstracted to some degree from the actual subsequent partner actions, as they did not encode information about which hand (left or right) was used by the confederate to respond to the object request.
Despite the fact that the same single words were uttered in both speech act conditions under investigation, certain physical, acoustic and articulatory aspects of the produced utterances might have systematically differed between them. As the RP associated with overt body movements is known to vary in amplitude in a way that matches the physical properties of the prepared movement (Shibasaki & Hallett, 2006), differences in the PPP between naming and request might have reflected subtle differences in the activity of the articulatory system, subsequently reflected in the acoustic properties of the produced utterances. To explore this possibility, we performed an acoustic analysis of the produced utterances. None of the examined acoustic measures, which included duration, loudness, pitch, jitter, shimmer and harmonics-to-noise ratio (HNR), differed significantly between conditions (see Fig. 3 and Table 3). These results argue against a possible confound by acoustic features of the produced utterances, although we cannot exclude with certainty that the utterances might have varied in ways not captured by the parameters observed here. The absence of any differences in the acoustic properties of the naming and request communicative actions produced in the current experiment appears to contrast with findings by Hellbernd and Sammler (2016), who demonstrated that the acoustic profiles of identical utterances produced with different communicative intentions differ and can be used reliably by listeners to infer those intentions. In particular, it is well known from the linguistic literature that many speech act types differ in prosodic features. For instance, questions and statements expressed by the same sentence form have markedly different prosody in many languages, including, for example, English (Horn & Ward, 2005; Srinivasan & Massaro, 2003) and German (Schneider, Lintfert, Dogil, & Möbius, 2012).
For single-word utterances such as the ones used here, however, we are not aware of any studies showing consistent prosodic differences between requesting and naming actions. Therefore, the absence of acoustic or prosodic differences suggested by our analyses should not be taken as evidence against different communicative roles of the utterances used in the naming and request contexts of our study. Furthermore, the fact that specific acoustic profiles may help to convey speech act type to the listener does not mean that this type of information is necessary for the listener's correct understanding, or for the speaker's appropriate production. Instead, other types of information, such as many aspects of context, including action sequence and common ground, are available and disambiguate communicative function. In the present experiment, the lack of difference between the acoustic profiles of the naming and request functions could have been a consequence of the block design applied, in which the imaginary context was kept constant across trials, so that subjects might have de-emphasized prosodic cues for indicating speech act function to the confederate. We also wish to stress that previous studies on communicative action processing that documented analogous neurophysiological and source responses used written words on a screen, which lack prosodic information (Egorova et al., 2013; Tomasello et al., 2019). This demonstrates that the neural activation patterns revealed here also appear when prosody plays no role in conveying communicative intentions.
For the same reasons as elaborated above for prosody, we consider it improbable that the different neural responses seen between naming and request were caused by differences in articulatory movements. As previous studies have shown a time delay of 100–200 msec between articulatory movement and voice onset (see Fargier et al., 2018; Salmelin, 2010), we repeated the cluster-based permutation tests in a reduced time window excluding these very last 200 msec preceding voice onset (where articulatory confounds appear likely), which revealed significant differences between naming and request similar to those found across the entire time window. It should also be noted again that the computed EMG activity indicated equal articulatory contributions in both conditions across the entire time window of interest (see Fig. 2B), which further argues against the possibility that differences in articulatory artefacts underlie the differences identified in the EEG signals.
As a further possible caveat, due to the embedding in a language game context approximating everyday communication, certain body movements, and in particular ocular movements, could not be avoided entirely. As we anticipated this issue, our subjects were instructed to fixate a dot located in the center of the table before and while producing utterances, with the goal of reducing ocular activity. Also, analysis of the vertical and horizontal EOG signals in the same time window of interest used for the EEG data analysis did not reveal any significant differences between the two conditions. However, we cannot completely exclude the possibility that object preference had other effects. For example, slightly different object selections might have been made in the naming and request contexts. We did not take note of the exact object choices made from trial to trial, but we remind the reader that the picture set of objects was predefined for each block and exactly counterbalanced and matched across conditions, so that only the last picture remaining on the table at the end of a block might have systematically differed between naming and requesting, a difference unlikely to affect brain responses recorded across a large stimulus set.
For the reasons summarized above, we consider it unlikely that articulatory or ocular artefacts or selection preferences produced differences between the experimental conditions that confounded the reported results. But again, there is no ultimate certainty here either; it is still possible that artefact-induced variability in the EEG and EMG signals might have made subtle differences between conditions (e.g., topographical differences) more difficult to detect statistically. Altogether, the larger pragmatic prediction potential (PPP) prior to requesting as compared to naming cannot be related to differences in the linguistic properties of the verbal material applied and is unlikely to be due to the way the words were articulated or to their acoustic features. It can only be related to the distinct linguistic-pragmatic information intrinsic to the communicative context in which the critical words were articulated. Therefore, our study shows that, apart from any general RP-like function in indexing motor preparation, the PPP preceding speech acts reveals and predicts cognitive features of the upcoming speech acts, and in particular aspects of their illocutionary force.

Limitations and outlook
Here, we take a closer look at the limitations of the present work and at issues still left open for future study. Whereas differences in the size of the predictive potentials could be solidly documented, our source localization showed different patterns of activated cortical areas in anticipation of naming and requests. However, these source estimations were based on grand-averaged event-related potential data. As already mentioned, data obtained from single subjects were too noisy, and thus too variable, to allow for meaningful statistical analysis across conditions. Still, the grand average potentials led to different activation landscapes, which we interpret here. We tried our best to avoid and reduce noise in the electrocortical responses (ICA analysis, interpolation of bad electrodes, exclusion of 3 subjects due to bad data). However, the only way to obtain source estimates with good signal-to-noise ratios was to calculate them from grand average data. This limits the conclusions from the source estimates, as no statistics on the obtained sources were possible. Even though the precentral activation focus seen in request preparation but not in the naming context fits very well with previous neuropragmatic studies (see Fig. 4), and even though it emerged both in the source map of requests and again in the difference source map of request − naming, it is necessary to reconfirm this result in future studies applying source statistics and, crucially, direct statistical comparisons between conditions. However, the noisiness of neurophysiological recordings before and during speech production appears to be a major obstacle here. We found a difference in brain indicators of speech act processing consistent with the idea that the predictable partner actions are to a degree reflected in the brain response characterizing a speech act.
However, it is important to note that such a difference in the sequence structure of speech acts is only one of many aspects that may be relevant, and that may in principle be reflected at the level of the brain. In fact, request actions differ from naming not only in terms of their sequence structure (i.e., typical follow-up actions), but also in terms of attention (directed to the object in the case of naming, and to both object and partner in the case of requests), memory (later checking whether the right object has been selected by the confederate), and the complexity of the social situation (lower for naming, higher for requests). Furthermore, requests and naming actions, even when both are performed in a communicative setting such as shopping or an examination, differ with regard to motivational, affective and emotional factors and to mental states including theory of mind: consider the desire of the requesting party that characterizes a true request, or the examinee's belief that the tester knows the correct answer in the test. One may argue that such affective-emotional-mental differences confound our study and make it impossible to attribute the results to speech act function. However, we strongly argue against such a position. All of the above-mentioned differences are intrinsic to the speech act types targeted, and each of them may be relevant for the neurocognitive differences observable in the current paradigm and in similar neuropragmatic studies. Speech acts come as a package of knowledge, beliefs, intentions, emotional states and utterances to be produced, and it is a relevant topic of current research to examine any differences in brain indexes between them. In addition, other aspects of the actions predictably following the to-be-performed speech acts might become manifest in brain activity.
Disentangling which aspects of the investigated speech acts are critical for the observed neurophysiological differences remains a matter for future research. At present, we can only offer hypotheses about which specific factor(s) was/were reflected.
Disentangling some aspects of the investigated speech acts and their specific brain indexes may be possible by comparing similar studies. In Egorova and colleagues' fMRI study, a range of brain areas became more strongly active in request as compared with naming contexts (Egorova et al., 2016). Over and above premotor cortex, these areas included temporo-occipital cortex, which may point to a difference in specific attentional loads. A possible reason why these latter neural differences were not replicated in the present study could relate to the matching of general communicative embedding between the conditions of the present experimental design. The parallel instructions motivating subjects to imagine real-life interactions (between a salesman and a customer, and between an examiner and a testee) may have contributed to similarly focused visual attention being directed toward the objects and may therefore have cancelled out any differences in posterior temporo-occipital cortical activation. Needless to say, this is hypothetical and requires future follow-up in controlled experiments. Although we have highlighted the possible role of the premotor activation enhancement during requests as a brain index of action sequence structure processing, we do not wish to exclude the relevance of other features that distinguish speech acts at the cognitive and neural levels. Hence, further studies with more precise localization tools should investigate more closely the specific cortical loci of subtypes of speech acts in social interaction.
A further clear limitation of our work relates to the role-play settings implemented. As the simulated shopping and test scenarios had the character of role plays, they were markedly different from actions in real communicative situations. One may thus argue that the subjects might simply not have followed the instructions and refrained from joining the game. In this case, however, some type of naming would have been performed in both conditions and the neurophysiological differences, which in part match previous results, would be left unexplained; we therefore consider this possibility implausible. Additionally, the absence of a main effect of Exposure time (first vs second experimental block), or of an interaction between the factors Communicative act and Exposure time, fails to support any general or differential fatigue or disengagement effects in the request and naming conditions. It may still be, however, that aspects of the brain responses reflect the artificial role-play scenarios and may not be generally present during speech act processing under real-life conditions. Furthermore, because of the block design, the type of speech act performed by the subject, as well as the response produced by the confederate, was constant across several trials. This rigidity in the dynamics of the dialogues, along with their elementary character (including only the target speech act and its most likely successor), does not fully reflect natural communication, where each speaker's contribution can only be predicted probabilistically and may vary across multiple plausible options (see Gisladottir et al., 2018), while unexpected response actions cannot be excluded. Therefore, future work towards more natural communicative settings should attempt to integrate event-related designs, where speech act type is randomized across trials (e.g., as in Tomasello et al., 2019), as well as a diversification of the interlocutor's responses (e.g., as in Egorova et al., 2014).
There is good reason to strive for an even closer approximation of real-life situations, although there are certainly limits to this endeavor due to the need for experimental control.
One more fruitful direction for future study is the investigation of directive speech acts of different types, including requests that do not refer to concrete objects. In the current study, requests were operationalized as asking for an object with the intention of having the listener hand it over. However, although this type of request is extremely common in everyday situations, it does not exhaust the range of requests and directive speech acts. In fact, one might just as well request things that are non-material (e.g., request money to be transferred electronically to a bank account) or abstract (e.g., request attention, or request more time for completing a task). Likewise, speech acts such as requesting, commanding and asking questions, which are all grouped together in Searle's category of 'directive speech acts' (Searle, 1979), may be characterized by shared and distinctive neurocognitive features, thus providing ample motivation for additional study. One might hypothesize that such subtypes of requests and directives share their neural signatures with those found for object-related requests in the current study. Whether this is really the case, however, remains an open question.
From a theoretical perspective, the current experiment contributes to the body of literature exploring the mechanisms of speech production. However, because of the characteristics of the slowly rising predictive potential and the absence of additional experimental factors, this study cannot relate the processing of speech act type to other aspects of linguistic processing, such as semantics or phonology, and in particular it cannot establish the temporal relationship between them. Note that recent work on speech production has suggested near-simultaneous access to semantic, lexical and phonological information in the standard naming paradigm (Miozzo et al., 2015; Strijkers et al., 2017). The time course of pragmatic information access in speech production still needs to be investigated in such contexts, similar to earlier work in the domain of language comprehension (see, for example, Egorova et al., 2013; Tomasello et al., 2019).
In summary, this study of brain signatures of speech act production revealed (i) a predictive index of speech act function starting ca. 600 msec before articulation begins. It also provided strong evidence that (ii) different predictive brain indexes emerge for different speech act functions, in the present case naming and requesting performed with identical linguistic forms. Finally, (iii) the estimated cortical sources differentiating the prediction potentials of naming and requesting resembled those found earlier during the understanding of communicative function. Even though some of these conclusions call for confirmation, as pointed out above, the results support the claim that not only linguistic form but also speech act function is neurally manifest in specific, definable cortical activation patterns, and that cortical-mechanistic resources are (at least partly) shared between speech act production and comprehension.

Conclusion
The current study investigated neural activity prior to speech production in a setting where subjects used the same words to do different things, that is, to perform speech acts with different functions. In one case, subjects named objects; in the other, they requested them. We found larger predictive brain potentials when subjects were preparing a request as opposed to naming an object. The brain activity patterns underlying the predictive potential also differed, insofar as sources in the hand motor cortex could only be found prior to requests, but not in preparation of naming actions. We conclude that, in contrast to the readiness potential, which indexes motor preparation, the predictive potential reported here reflects linguistic-pragmatic information about specific action-related communicative functions. Against the background of earlier neuropragmatic work (Egorova et al., 2013, 2016; Tomasello et al., 2019), our present results indicate that shared neuronal mechanisms contribute to the planning and production as well as the perception and understanding of speech acts.

Open practices
We report how we determined our sample size, all data exclusions, all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study. No part of the study procedures or analyses was pre-registered prior to the research being conducted. The study reported in this article earned an Open Materials badge for transparent practices. Materials and data for the study are available at https://osf.io/gc7ny/.
Raw data supporting our study cannot be made public, as this is not permitted by the conditions of our ethics approval, which prevent sharing of individual subject data with anyone outside the author team under any circumstances.

Declaration of competing interest
The authors declare no conflicts of interest.