Sounding others’ sensations in interaction

ABSTRACT This study investigates the practice of “sounding for others,” wherein one person vocalizes to enact someone else’s putatively ongoing bodily sensation. We argue that it constitutes a collaborative way of performing sensorial experiences. Examples include producing cries with others’ strain or pain and parents sounding an mmm of gustatory pleasure on their infant’s behalf. Vocal sounds, their loudness, and duration are specifically deployed for instructing bodily experiences during novices’ real-time performance of various activities, such as tasting food for the first time or straining during a Pilates exercise. Vocalizations that are indexically tied to the body provide immediate displays of understanding and empathy that may be explicated further through lexicon. The existence of this practice challenges the conceptualization of communication as a transfer of information from an individual agent – even regarding assumedly individual body sensations – instead providing evidence of the joint nature of action and supporting dialogic theories of communication, including when language-marginal vocalizations are used.

reenactments (Sidnell, 2006), or animations (Cantarutti, 2020), terms for presenting an event as if it is happening in the current moment, have furthermore focused on representing other people holistically (i.e. when vocally depicting a musical performance, the body is also done by the depicter), rather than distributing different modalities among copresent participants, which will be shown here. It has hardly been examined how bodily sensations may be communicated by other participants who are not undergoing the sensation. Past research has looked at gestural mimicking of what the others may just have gone through (Bavelas et al., 1986) while the current study targets sounding.

Sounding practices
Due to a broader emphasis on verbal resources across linguistic and interactional research, very little is known about how non-lexical vocalizations are used to portray experiences, let alone those beyond the self (Wiggins & Keevallik, 2021a, 2021b). A recent surge of interest in so-called "marginally linguistic" tokens (Dingemanse, 2020), such as onomatopoeia, ideophones, and non-lexical vocalizations, has demonstrated that these sounds, though phonetically diverse, have systematic uses in interaction (Keevallik & Ogden, 2020). Some of these tokens are particularly useful for displays of bodily events within specific contexts, such as expressing pain at doctor's appointments (La & Weatherall, 2020), exhaustion during running (Pehkonen, 2020), and emotional reactions during board games (Hofstetter, 2020), or instructing movement in dance classes (Keevallik, 2021). They are particularly suited for framing responses to events as spontaneous and genuine (Wiggins, 2002, 2013) or potentially accountable or delicate (Hofstetter, 2020; Ogden, 2020). Much of this work is a continuation of the argument by Goffman (1978, p. 800) that these are presented as if they were "natural overflowings" of the body. Unsurprisingly, then, these kinds of marginal tokens are extensively used in the sounding for others practice, which involves the delicate accountability of representing another person as well as vocalizing a bodily event. By analyzing how participants attribute experiential phenomena to each other, and how coparticipants involve themselves in putatively inaccessible events in others' bodies, we can examine how the body and language are intertwined, and how sensorial experiences emerge through joint participation in interaction.

Temporality of interaction
Temporality is a foundational feature of interaction (Deppermann & Günthner, 2015). Overwhelmingly, interaction research has focused on the order of talk and the one-at-a-time organization of verbal interaction (Sacks et al., 1974; Sidnell & Stivers, 2013). While it was one of the earliest findings that affiliative conversational moves are performed early or in overlap with current talk (Jefferson, 1986; Pomerantz, 1984), interest in choral speech (Lerner, 2002), nodding during others' storytelling (Stivers, 2008), early onset (Goodwin, 1986; Vatanen, 2018), and other overlap issues has appeared only sporadically, and these phenomena have only recently been theorized into a coherent account of "early responses" that feature specific social meaning (Deppermann et al., 2021). In instructional settings, for example, repetitive imperatives have been shown to occur in parallel with or even after student actions (Mondada, 2017; Okada, 2018), and thereby affirm the student's action, thus synchronizing instruction and compliance. This study will provide further illustration of how participants coproduce sensoriality in a near-synchronous manner.

Joint action and entitlement
Temporally tight collaboration acutely raises issues of agency: who is (accountably) a speaker and contributor to a turn or utterance. Interaction analytic work provides evidence that grammatical resources may be distributed across copresent individuals, such as in coconstruction of linguistic units (Lerner, 1996, 2002). The notion of "speaker" as relying on a single physical body was prominently problematized by Goffman (1981). These analyses demonstrate that utterances are not only metaphorically constructed by multiple participants, via being embedded in long-term relations and histories (as argued in theories of polyphony, Bakhtin, 1981), but moreover that utterances are formed of components contributed by multiple participants in situ, forming coherent actions jointly.
Past studies in interaction have shown that speaking on others' behalf can enact empathy but simultaneously risks inhibition of agency (Goodwin, 2004; Norén et al., 2013; Robillard, 1996; Throop, 2010). Participants in interaction must manage the accountability of how they enact and distribute agency – both their own and that of others – as the interaction unfolds (Enfield, 2017). Studies of different activities and interactional settings show substantial variability in speakers' rights to be the sole expressors of their thoughts, emotions, and bodily experiences (Cekaite, 2016, p. 20; Heritage & Raymond, 2005; Jenkins, 2015; Stevanovic & Peräkylä, 2012; Wiggins, 2013). At the same time, speaking on others' behalf occurs frequently in instructional settings. Earlier research has established how, during instruction, grammatical structures emerge in the form of incrementally produced clauses (Lindström et al., 2020) or locally established formulas to accompany repetitive moves (Keevallik, 2020a). One can also instruct someone else to play an instrument by "singing" the sounds in nonsense syllables (Clark, 2016; Haviland, 2007; Tolins, 2013; Weeks, 1996, 2002), but in that case it is the sound itself that is the objective of the activity rather than it being in the service of bodily concerns (concerning the paradigmatic difference of sounding for bodies vs. sounding for music, see Keevallik, 2021). In this paper, we show a number of ways to vocally represent the current experience of others' bodies, both for instructing and empathizing purposes, in particular indexing it through sounds that ostensibly emanate from a particular bodily event, thereby jointly performing the experience through distributed vocal and bodily resources.

Sensing in interaction
The jointness of bodily experience that sounding for others both indexes and achieves can be seen as an instantiation of intercorporeality, wherein bodies are taken to be constituted by their relations with other bodies and socialities, and thus not separate (Merleau-Ponty, 1964b, p. 168). Particularly relevant to this study is Merleau-Ponty's idea of "compresence," which refers to the way participants embody each other, and can use their own bodily experiences as a guide to ongoing experiences in others' bodies. Sounding for others would seem to rely on such an ability. However, as we show below, compresence is not only a precondition, but is also achieved through this practice. Within studies of interaction, the use of these concepts has only recently begun to be developed (see Meyer et al., 2017). However, the interactional organization of sensation more broadly, not just through the perspective of intercorporeality, has been observed since the 1980s: pain is displayed in an institutionally relevant manner at specific moments in doctor-patient interaction (Heath, 1989), seeing relevant details can be accomplished through professional training (Goodwin, 1994; Mondada, 2004; Nishizaka, 2017), and tasting and smelling are collaborative actions (Mondada, 2018, 2020). While we have also found similar examples of senses being expressed with fully lexicalized items, the current study is delimited to non-lexical, or marginally linguistic, vocalizations.
Considering the variable entitlements involved in different embodied activities, we illustrate the locally contingent details of the practice in professional, everyday, as well as pedagogical environments, and then discuss the features of the practice as a whole, dissecting its theoretical consequences for the conceptualization of joint sensing, coproduction of action, and dialogicity.

The method and data
The study uses multimodal interaction analysis for documenting systematicity and accountability in participants' behaviors, based on video-recordings of interactional events, which enables scrutiny of the mutual coordination of otherwise fleeting, real-time practices (Broth & Keevallik, 2020; Goodwin, 2018; Mondada, 2019a). Crucially, the analysis involves the observation of participants' displayed sense-making of each other's behavior, which reveals the participant's role in organizing ongoing events, while avoiding the memory bias inherent when people are asked to account for their vocalizations in interviews as well as the ecological unnaturalness of experimentation with sounds performed by actors (Anikin & Lima, 2018). In order to document the practice, the data for this study come from a variety of activities: doctor-patient interaction, infant feeding, Pilates classes and dance lessons. The sensory phenomena found in these sources include pain, taste, and proprioception. We remain agnostic regarding which other senses or activities can be involved in the practice, as this is ultimately an empirical question of working through a broad variety of data. Our collection of instances is built opportunistically, through several years of data analysis within different projects.
Sounding the other's pain was detected in medical contexts. The doctor-patient interactions were drawn from The Applied Research on Communication in Health (ARCH) Corpus of Health Interactions held at Wellington School of Medicine, Otago University, New Zealand. It currently consists of English language data from nine separate studies with a total of 478 audio-video recorded health interactions involving 533 participants, all of whom provided their consent for their data to be used for future research. In this corpus, 30 sequences were identified in which patients sounded their pain rather than merely describing it (from six different visits; see Weatherall et al., 2021) and, surprisingly, in two of the visits the doctor also did so.
Numerous instances of sounding for another person's proprioception were found in instructive settings, where the focus was on transferring bodily skills. The dance class data come from 38 h of video recorded group classes in Estonia and Sweden. Three of the 17 teachers speak Estonian (9 h), six Swedish (13 h), and 10 English (15 h). All the teachers signed their consent for research purposes; the students were informed orally about the study and of their right to opt out at the beginning of every class. In these data 20 strain sounds were identified. The Pilates data come from video-recordings of four classes given by one instructor in Estonian (total duration 4 h), where strained voice occurs at least five times per class (see an example in Hofstetter & Keevallik, 2023, pp. 65-67). All the participants agreed to the use of the recordings for research purposes and the teacher agreed to be non-anonymized in publications.
Sounding someone else's taste experience has to date only been found with infants and their caregivers. The infant mealtimes data comprise 66 video-recorded mealtimes recorded in Scotland (around 19 h of data). The parents, who were all white and spoke English as their first language, were given two cameras to record the meals with their 5-to-8-month old infants, and consented for the data to be used. The study was granted ethical approval by the University of Strathclyde. All families gave consent for disguised images to be used in research publications. 66 meals in five families resulted in 273 instances of gustatory mmms (Wiggins, 2019) and 391 lip-smack chains (Wiggins & Keevallik, 2021b).
Since sounding for others is a relatively rare occurrence in most activities, our collection from these particular settings, where it is reasonably common, allows us to point to its existence and its main defining features.

Analysis: collaborating in the expression of others
In this section we examine instances where one participant sounds for copresent others through vocal resources tied to the others' accountable sensory experiences, including pain, strain, and taste. Through vocalizing during the coparticipant's bodily events, the speaker claims access to their body and experience, and displays themselves as co-experiencing. To be clear, this is not to suggest that the speaker is claiming to be undergoing the same experiential or sensational events, but rather that the speaker uses their vocalizing to enact the event as one to which the speaker has access, thus displaying embodied understanding. It is one way to publicly accomplish Merleau-Ponty's (1964a) "compresence." In particular, we aim to demonstrate that the participants connect the vocalizations to others' bodies through temporal coproduction, that this connection enacts a distributed speakership concerning the ostensible experience, and that in doing this sounding for the other's experience the speakers undertake certain actions, such as empathizing and instructing.

Sounding other's pain
Let us first return to our above case where the display of pain is jointly accomplished during a visit to a doctor's office, occasioned by a sore arm, which eventually gets diagnosed as a tennis elbow. As the extract (represented below as 2) begins, the doctor and the patient are sitting facing each other. The patient's arm is outstretched, and the doctor is holding her hand in his. He has been testing a few positions of the underarm that have not yet turned out to be painful. The doctor launches a new so-prefaced diagnostic question (lines 01-02) which marks it as being generated through inference (Bolden, 2006) and simultaneously moves his other hand to the patient's elbow and palpates it (see Figure 1). His turn assumes a no pain response, which turns out not to be the case, as the patient displays pain (lines 02-03).

Extract 2. Tennis Elbow; TS-GP03-17 (Extract 1, expanded)
As the doctor palpates the patient's elbow, the patient's upper torso makes a small jerking movement backward, during which she takes a sharp in-breath (lines 02-03), which amounts to the beginning of an embodied pain display (Weatherall et al., 2021). Immediately latched to this, the doctor's uuuw (line 04) responds to the patient's display. The patient and doctor each further verbalize that the palpated spot is painful (lines 03-09), treating the pain as belonging to the patient's body. In-breaths have been associated with pain responses in experimental research (Jafari et al., 2017). Although not a typical response cry for pain under Goffman's rubric (Goffman, 1978), the patient's sharp intake of breath has similar relevant features: it demonstrates the pain with immediacy, which is helpful for pinpointing a diagnostically relevant spot on the body; and it "floods out," or is produced in a particular moment of the interaction that enacts spontaneity and genuineness. The doctor's vocalization similarly does this immediacy, albeit just after the patient. The uuuw documents the doctor observing (Figure 2) and possibly feeling the other's pain through the recoil, ratifying the precise timing (and thus bodily location) of the pain expression. It also puts the doctor's empathetic stance on record right away by prioritizing a conventionally emotion-relevant vocalization over further diagnostic inquiry (indeed, this kind of sound has been noted for its rarity in prior literature on diagnosis, which normally finds doctors prioritizing an analytic, noninvolved stance; Heath, 1989, 2002; Heritage & Stivers, 1999). These sounds not only note the presence of the pain and its bodily trigger location, but they also transform the investigation of the elbow into a jointly achieved sensory occurrence. The doctor does more than inquire about or observe the pain: he participates in its expression.
The uuuw sounds on the patient's behalf (and we have another instance with a different doctor producing a similar sound). Together, the doctor and patient organize a joint expression of the putative pain experience, even though their sensorial access to it is clearly different and treated as different, since the pain is located by both parties as in the patient's body. A pain experience is constructed jointly through the precise temporal placement of the sound, while the sounder implicitly claims entitlement to coparticipate in its enactment.

Sounding other's strain
Sounding for others can be especially useful in the instruction of proprioception in synchronously moving bodies. In dance classes, for example, teachers regularly vocalize strain at the precise moments when strain is due in students' ongoing performance, such as when needing to create tension or sharp moves. Extract 3 is taken from a class where a Charleston combination is being taught. The lead dancers in the dancing couples need to bring their partners from their side to the front and back again, which requires some well-timed energy twice during the pattern (during beats 1, 2, and 5). Specifically, the lead dancers must create sufficient tension between the partners to reverse the follow dancers' movement trajectory. During the extract the teacher is not dancing herself but observing the students and providing the Charleston rhythm with vocalizations (Keevallik, 2021). The vocalizations sound not only for the timing, but also the quality of the movement, and specifically enact the strain that ought to occur in the lead dancers' bodies on beats 1, 2, and 5 in line 02. In line 03 the teacher then praises one of the couples for their improved performance. The transcript features beat numbers as danced by the students, aligned with teacher's vocalized syllables.
Extract 3. Tension between dance partners (in Swedish); Höst 2, klipp 01 16:00
In line 01 the students are supposed to dance a basic Charleston step, side-by-side, in a rhythmical but relaxed manner, which is also reflected in the teacher's relatively immobile body (Figure 3; the teacher in yellow, unfortunately behind her partner) as well as in the simple open syllables that often provide the baseline rhythm at dance classes. In line 02, however, the first syllable KRRhahh is qualitatively different: it features a lengthened trill that extends across the two first beats of the step pattern, and a heavy outbreath at the end. It is also louder than the previous syllables. During that time the leads are supposed to redirect the follows' energy so that they move forward, and to do that they need to build tension between them. During the next two beats the forward movement is again relaxed, which is reflected in the quieter open syllables chaga. Then, on beat 5 (line 02), the follows should end up in the front of the leads and their movement energy needs to be reversed again. This is accompanied by a markedly louder voice and a long syllable featuring higher vowels in the form of a diphthong uo (as opposed to the open back vowels used for the rest of the sequence). The vocalization QUOO is furthermore uttered with tense glottis, with the air barely seeping through, and with a clear glottal stop syllable onset (marked with "Q"). In addition, or perhaps in order to produce that tense sound, the teacher is also herself straining her body in parallel to the dancing leads (see Figure 4). This is thus an instance where the teacher is not actually performing the strenuous dance move (she is not moving a partner) but chooses to accompany the motion in real time by vocalizing and embodying extreme strain at the very moment when the dancers' bodies should be experiencing it (see, for example, the rightmost person in Figure 4).
Thus, the teacher vocalizes not only to provide rhythmic beats and coordinate the class, but uses the opportunity given by the vocalizations to additionally perform the embodied phenomena the students should undergo – in this case, enacting peaks of strain in real time. The sounding both empathizes with and instructs the students regarding the proprioceptive aspects of the synchronous move, as well as displays a co-experiencing of the strain through the embodied enactment. The teacher thereby accomplishes compresence in the dance and entitlement to sound the strain that is synchronously supposed to be present in the lead students' bodies.
In another instructive context, Pilates classes, we can likewise find displays of strain where the strain does not originate in the speaker's own body, but is or should be present in the bodies of others (this happens repeatedly during every class, see another example of the teacher sounding for a "boat pose" in Hofstetter & Keevallik, 2023). Extract 4 is taken from a Pilates class where the teacher has just demonstrated a new exercise called helicopter that implies moving legs and hands in a circular motion but to opposite sides while balancing on the buttocks. This is the very first time that the students try it out and the teacher uses a large gesture and a strain sound to accompany the exercise.

Extract 4. Helicopter (in Estonian); 2018 c2
The teacher times the beginning of the exercise by uttering a slightly lengthened ja "and" (which is characteristic of this activity, Keevallik, 2020b). She simultaneously launches a two-hand gesture that iconically shows the required circular movement of the legs (Figures 5, 6). As she dips into the shape, she produces a strain sound, a nasal vocalization with a glottal onset and narrow glottis throughout (on a similar use of voice for incitement of others in various sports, see Reynolds, 2021). The vocalization ends in a very high pitch, iconically marking the legs' arrival back at the top position together with the arms arriving in the upward position (Figure 7). The teacher's bodily-vocal performance makes the exercise visually and audibly available, highlighting the shape, trajectory, and most relevantly for this paper, the ostensible proprioceptive experience of the exercise and its inherent strain throughout the trajectory. The strain sound further extends over a reasonable time frame of the complete move, and the gesture and sound are instructive, indicating how long the exercise should extend. The multimodal display is also empathetic with the struggling students, thus achieving compresence in the exercise as well as entitlement to vocalize the experience even while not under strain herself. However, at this particular instance the vocally performed strain is not quite synchronous with what all the other bodies are undergoing, as the students are each toiling with their own tempo (see the variable leg heights in Figure 6), even though everybody is coordinated to try out the same move. The bodily-vocal production is thus also prospectively instructive: the experience is displayed to ease not only ongoing, but also upcoming attempts at performing the exercise.
Instructors across activities sound when others need scaffolding, marking matters that need attention. While this makes the sounding practice useful for demonstration (Keevallik, 2021; Tolins, 2013), in the above examples, the referents emerge near-synchronously and in a separate body from the instructor. The instructor relies on, and enacts, rights to sound on behalf of student bodies, apparently drawing on her own embodied experiences to highlight qualities and sensations useful for proper performance. Real-time scaffolding and correcting of bodily moves thus emerges as one major affordance of the sounding for others practice.

Sounding other's tasting
We now move on to a different sense, that of taste, and explore how sounding for others can display coparticipation in a sensing event while maintaining ambiguity as to whether the sounding is an instruction or an (empathetic) acknowledgment. In our chosen setting, parents are sounding for their infants during a meal. We specifically selected moments when the parents are not eating and hence their vocalizations are apparent as sounding for the infant who is eating. In Extract 5, Mum uses various sounds when their 5-month-old infant appears distracted while eating a new food. Earlier in this episode, Mum demonstrated concern that the infant might be disinterested in eating (asking "you not sure?," "do you not like it?," see also line 01 in Extract 5). The mother sounds with lip-smacks (transcribed as (.)m(p)t (ḁ) to render the exact phonetic qualities) and gustatory mmms that work to enact the infant's experience of eating and tasting (see detailed arguments in Wiggins & Keevallik, 2021a, 2021b), as well as encourage the relevant eating practices: chewing and swallowing the food. They involve distinct sequential, temporal, articulatory, and, crucially, embodied features that accomplish the sounds as for another.

Extract 5. Yummy; McD003_0424
The sequence begins with a question about whether the infant likes the food (line 01) and Mum's attention moves soon after (line 03) to check as to whether the infant still has food in their mouth. The request to "see" inside the infant's mouth is followed immediately by a series of lip-smacks (line 04), which model the opening-and-closing movement that would both facilitate chewing and enable visibility of food in the mouth. The focus here is thus not only on tasting (and "liking"), but also on chewing food. The infant then opens their mouth while also turning their eye gaze to Mum. Perhaps ironically, this movement then results in some food falling out, though this at least provides evidence that there was actually food in the mouth.
The mutual eye gaze that is achieved in line 04 (and continues to line 10) provides the right conditions for a gustatory mmm (Wiggins & Keevallik, 2021a) soon after on line 06, which reemphasizes the tasting event that is ongoing with the chewing. This is produced with Mum's lips tightly pressed together and head slightly raised back (Figure 8) and thus features visual as well as auditory elements. The close physical proximity of the participants and the exaggerated closing and opening of the mouth – both with the gustatory mmms and the lip-smacks (see also line 08) – may encourage the infant to mirror the mouth movements and certainly there is evidence of this on lines 04 to 09. Regardless of whether these displays actually encourage the infant, the sounding practices of Mum, combined with facial gestures, leaning in, and positioning her face close to the infant, work to enact tasting and eating on behalf of the infant. The mmm in line 10 is prosodically upgraded through a pitch change from prior mmms (e.g., line 06) and the following yummy lexically formulates and makes explicit the pleasurable orientation of the mmm; combined with the preceding sequence of mouth movements and sound objects, the ongoing sensory experience is thereby constructed as assessment-relevant for the infant (see Wiggins & Keevallik, 2021a). This assessment is only relevant when some eating and tasting has occurred, processes which required the infant to have followed along with Mum's enactments. The two coordinate together in this respect, as Mum's sounds have been precisely coordinated with the events of the infant's mouth. Through continual bodily mirroring and sounding, there is abundant evidence of the two participants being highly attuned to each other's bodies.
With this extract, we have demonstrated how parental sounding practices can both chime in with the infant's sensory experience while it is supposed to be ongoing, and also be tailored to instruct the relevant moves required for eating solid foods. The highly salient vocal and embodied production of lip-smack particles and mmms, featuring extended sounds, rhythmicity, and extreme pitch movements, allows the parent to draw attention to specific, currently-relevant sensations and movements in the infant's own body. Through smiles, Mum's laughter, raised eyebrows and subsequent lexical items such as yummy, the eating experience is furthermore framed as a pleasant one. The observing, empathetic and instructive aspects of sounding for others are here intertwined in the real-time achievement of the sensoriality of taste and the proprioception of jaw movements. Concerning sounding for others more generally, this analysis exemplifies how participants can remain ambivalent about whether they are empathizing, instructing, or both; most critical is that the sounding for others practice permits co-participation in sensory-relevant activities.

Discussion
In this paper, we have shown examples of one person sounding for others, using the voice, in particular non-lexical vocalizations, to capture some aspect of the others' bodily experience. The practice is used across activities for tasks including coproducing sensations, expressing empathy, coordinating multiple bodies together, and instructing. Through the temporal positioning in immediate relation to the experiencing and moving bodies – in other words, by temporally fitting the vocalization to another's bodily event – these vocalizations achieve their meaning and legitimacy as "expressing that very experience." Likewise, the qualitative fitting of the vocalizations with others' sensory events binds the participants' experiences and vocalizations into one whole. In the following, we will discuss the contribution of this study with respect to the broader array of non-lexical vocalizations, the temporal coproduction of the practice, the joint accomplishment of sensorial events, as well as the actions carried out through its use.

Non-lexical vocalizations used for coproduction of sensation
While research on non-lexical vocalizations is scarce (Dingemanse, 2018, 2020; Keevallik & Ogden, 2020; Reber, 2012), most of the existing work targets onomatopoeia and recently ideophones – words that depict sensory experience, used in regular conversation (Dingemanse, 2017). The sounding for others practice, however, does not depict, since the event being sounded for is occurring almost at the same moment; the referent is not removed by time or place as in Clark's (2016) framework for depictions. This study instead broadens the horizon of vocalization studies by targeting sounds emanating from current sensory experiences: the pain sounds enact ongoing pain, the strained voice vocalizes right now the strain happening in the students' bodies, and the gustatory mmms and lip-smacks enact current tasting. This is a specific body-based subset of all non-lexical vocalizations, characterized by the immediate representation of the bodily sensation and reflective of the bodily contingencies of production, such as the mmm being producible with a closed mouth full of food. In the case of strain, this representation is furthermore impossible to perform without employing muscular strain in the vocal apparatus as well as the abdomen, thus creating the necessary bodily bases that the sound indexes (Hagins & Lamberg, 2006; Massery et al., 2013; Tammany et al., 2021; Welch & Tschampl, 2012). It is precisely this bodily immediacy that makes these kinds of vocalizations useful in the sounding for others practice. While adjacent talk may provide lexical specification of the sounds' local meaning (e.g., "it is sore," "yummy"), the non-lexical elements allow for highly context-specific modifications in production. Prosody alterations can suggest suitable intensity, sharpness or severity of the current bodily experience while length can be modified, respective to the event sounded for, to suggest intimate connections between the sensation and the vocalization.
The sounds are accountable and recognizable indices of bodily events. They may feature a degree of conventionalization but are most probably less language specific than, e.g., ideophones (Dingemanse, 2018) or surprise tokens.

Close temporal coproduction of (individual) sensations
The mechanisms that connect one person's voice to another's body are the temporal placement of the sound at the exact moments when the other is assumedly sensing pain, taste, or strain, and the fitting of the sound choice to the kind of embodied experience underway. Broadly speaking, the sounding is synchronous with those experiences, which is in contrast to most vocal behavior (talk) being sequentially organized into one-at-a-time speaker turns. However, there are also fine differences in the temporal organization of the beginning and the end of the sounds identified: while the doctor's pain cry responds with immediacy to the patient's recoil and in-breath (initial phases of a pain display, Weatherall et al., 2021), the parent's mmm is timed to be simultaneous with the putative taste emerging in the infant's mouth, and the instructor's strained qnnnnnnnnnnmmmuhh is launched slightly before most students have started the strenuous move. We thus documented three slightly different organizations of the sounding practice: instances of responsive sounding by the doctor (even though the sound is uttered at the exact moment when it could also have been said by the patient), synchronous sounding with the taste and the precise strained moment in the dance, as well as anticipatory sounding that features micro-sequentiality (Mondada, 2021a) and emerges as instructive (such as in the lip-smacks and the Pilates move). The sounding makes both empathizing and instructing potentially relevant, and thus the vocalizer is vulnerable to being seen as doing either or both. The relative anticipatory timing in instructional contexts may lean into claiming stronger rights to know and access the sensation, and thereby do instruction more obviously. In contrast, the relative delay in the doctor's pain cry for the patient backs away from such a claim, which may be diagnostically relevant. Further investigations can elucidate the different contingencies and results of these timings.
Intersubjectivity is achieved near-synchronously, by one person sounding the other's current sensory experience, as opposed to producing consecutive actions in interactional sequences, such as instructing in advance of the event what the sensation is going to be (e.g., in classes of professional tasting skills, Mondada, 2021b).

Joint performance of a sensation
Through the temporally precise alignment of a suitable vocalization with the ostensible experience of the other, the emerging action as well as compresence is achieved across different participants: one undergoing an experience (such as the infant or the lead dancer in the couple) and the other sounding that very experience, thus distributing the resources to perform the action. The practice thereby fundamentally undermines the conceptualization of action and speakership as tied to an individual body and voice deployed by one person at a time. The role of a sensing and expressing agent in interaction is here assembled across participants, further problematizing the conduit-metaphor conceptualization of what it means to communicate, and to be a speaker or a receiver. An agent is not an individual sender of information but an assembly of bodily and vocal resources that may be variably distributed across participants.

Functions of the sounding for others practice
The two main functions emerging from the above collection of cases are the performance of empathy and instruction. While empathy has been explored as a sequential phenomenon (Heritage, 2011; Ruusuvuori, 2005), it has also been shown that affiliative responses are regularly launched early in relation to the prior turn (Pomerantz, 1984; Vatanen et al., 2021). The sounding for others practice is extreme in that it vocalizes the ongoing experience in another, not as an onlooker, but as a coproducer of that very experience, thus emerging as highly empathetic. Sounding for others permits a "being with" that is attuned (both in the lay sense and in the phenomenological sense; Merleau-Ponty, 2002) to coparticipants. Along similar lines, Bavelas et al. (1986) showed that the mimicked gestures on behalf of another were distinctly interactional, in that their frequency increased when the other was visually available.
The practice also involves the participants instructing various bodily skills in real time, such as tasting and performing strenuous moves. Instruction and compliance have been shown to occur in overlap in a range of human activities (Deppermann et al., 2021; Mondada, 2014a), especially in series of instructions emerging across time. Time-critical instruction of strain, however, can be provided entirely synchronously with, e.g., dancing bodies, if only because its precise timing ultimately needs to be acquired on consecutive occasions of practice. Instructors, doctors, and other experts (such as grown-up eaters) furthermore exert their entitlement to sound those experiences for non-experts in a manner that is strikingly intimate, being an immediate representation of their bodies, as opposed to most verbal instruction and formulations of others.

Conclusion
Occurring in a close temporal fit with others' bodily engagements, the sounding for others practice illustrates how we give voice to each other's sensorial experiences, including proprioception, and thereby jointly produce sensation as social. Instead of formulating those activities purely through words, non-lexical vocalization practices seem especially fitted to join in others' bodily lives in real time. The connection between specific sounding practices and bodies was first impressionistically described by Goffman (1978) as being "response cries" to ostensible bodily events, often unexpected ones. Several of these "cries" have since been described as semi-conventionalized items used for specific communicative purposes, such as sharing the experience of being out-of-breath with your jogging partner (Pehkonen, 2020) or voicing strain to coordinate a lift (Keevallik & Ogden, 2020). As the other-oriented vocalizations discussed in the current paper are something that the other could have uttered themselves at those very moments (and sometimes actually do so), they illustrate the social and distributed nature of not only actions and resources, but also sensoriality. In other words, they demonstrably achieve compresence: one participant uses their own previous bodily experiences as a guide to current sensations in others' bodies and claims entitlement to those by producing relevant sounds.
By establishing how vocal practices, including language-marginal vocalizations, are lodged in the minute coordination of bodies in real time, we can ultimately arrive at a more ecologically valid understanding of human vocal behavior as inherently dialogic and distributed, rather than monologic and individual. A spotlight on sounding practices enables us to target central issues in linguistics, such as what the limits of language are and what kinds of tasks can be accomplished through vocally joining in others' moves or experiences. By making use of the specific affordances of voice, intonation, loudness, and sound quality, vocalizations constitute a specific means of accomplishing sociality that has so far flown below the radar of studies in communication, having been treated as perhaps too subtle and evasive, or too chaotic. This paper provided examples of the systematic deployment and shared understanding of sounds for specific tasks, thereby also demonstrating that the boundary between language and non-language is fuzzy: minimally conventionalized vocalizations still provide a means of immediate and even synchronous displays of intersubjectivity. The practice of sounding for others thus constitutes yet another empirical finding in support of the dialogic theories of language and communication, where utterances are understood to be intersubjective accomplishments (Du Bois, 2014; Linell, 2009), and human sociality to be inherently dialogic, with every single move reflexively anchored in the ongoing interpretation by the other.

Endnotes
1. By empathizing we mean providing "an affective response that stems from the apprehension or comprehension of another's emotional state or condition, and that is similar to what the other person is feeling or would be expected to feel" (Eisenberg & Fabes, 1990).
2. In Sacks' terminology, this would be a form of tying procedure; see Sacks (1992, p. 150) et passim, and Küttner (2020) for discussion of tying procedures beyond sequentiality.