Informing language training with multimodal analysis: insights from the use of gesture in tandem interactions



Introduction
In the 21st century, increased mobility, internationalization, and technical innovations define our professional world. Learning and training have become lifelong processes, and skills that were once considered 'soft' are now a must. This chapter focuses on the multimodal, interactional, and intercultural aspects of communicative competence, which have yet to gain institutional recognition. For instance, the Common European Framework of Reference for languages (CEFR) tends to marginalize the role of gesture in the acquisition and mastery of a language (Council of Europe, 2001). CEFR descriptors consider gestures to be a mere paralinguistic, compensatory strategy used by beginners: a beginner 'below A1' can "make simple purchases where pointing or other gesture can support the verbal reference" (Council of Europe, 2001, p. 32); an A2 speaker "can use an inadequate word from his/her repertoire and use gesture to clarify what he/she wants to say" (Council of Europe, 2001, p. 64). In curriculum scenarios, "attention paid to body language and gestures" (Council of Europe, 2001, p. 172) is restricted to primary school. Conversely, I draw from linguistics research on the multimodality of tandem interactions so as to provide evidence for the crucial communicative and linguistic functions of gesture during exolingual interactions (i.e. between native and nonnative speakers of a language). This chapter addresses the following questions: how can future professionals learn to

Language learning in tandem
Originally developed in the 1960s to complement formal classroom language teaching, tandem learning is "an arrangement in which two native speakers of different languages communicate regularly with one another, each with the purpose of learning the other's language" (O'Rourke, 2005, p. 434). Language tandems provide a unique collaborative learning environment based on solidarity and reciprocity (Brammerts & Calvert, 2003); while aiming to learn a target foreign language, participants also engage in helping partners learn their own mother tongue (Helming, 2002).
Linguistic tandems provide a favorable socioaffective context for L2 learning (Horgues & Scheuer, 2015). Tandem learning is based on role reversibility and peer symmetry, in which the asymmetry of language expertise is contextual and temporary. Peer empathy and mutual commitment are central, and each participant has something to learn. Tandems breed trust and motivation because participants often perceive native speakers as trustworthy representatives of the target language community, and participants become interested in getting to know their partners as "individuals and not just as sources of language input" (O'Rourke, 2005, p. 434).
During tandem interactions, learners are exposed to spoken and contextualized L2 input that is extensive and authentic. Tandems provide learners with 'positive' and 'negative' evidence of the target foreign language (Gass, 2003; Mackey, 2006), respectively defined as "the set of well-formed sentences to which learners are exposed" and "the type of information that is provided to learners concerning the incorrectness of an utterance" (Gass, 2003, p. 225), across all dimensions of a language, from pronunciation to syntax. Crucially, face-to-face tandems allow participants to share the same interaction space and to rely on nonverbal cues like gestures, facial expressions, and the articulation of pronunciation (jaw, mouth, and lip placements).

A multimodal approach to language tandems, with a focus on gesture
The construction of meaning in interaction is in essence multimodal, intersubjective, and sequential. Participants mobilize a variety of semiotic modes and resources, one of which is language. Their actions are inscribed in time, one after the other or simultaneously; they alternately react to and project others' actions (Sacks, Schegloff, & Jefferson, 1974), providing new communication material or reusing material provided by others (Goodwin, 2013). Face-to-face communication is multimodal by nature (Argyle, 1972; Norris, 2004); multiple semiotic modes are used by participants to communicate (Bezemer & Jewitt, 2010; Kress & van Leeuwen, 2001), including speech and gestures. Gestures are bodily actions that "[belong] to the 'story line' of the interaction" (Kendon, 1986, p. 6) and that are inscribed in its sequentiality; they coincide with other actions in the construction of meaning, rather than being there by mere coincidence (Schegloff, 1984). Gesture is here understood as a large category for communicative kinesic resources, including hand gestures, head and shoulder movements, facial expressions, and body posture (Allwood et al., 2007).
Speech and gesture are tightly coupled in interactional communication (Kendon, 2000). Gesture can fulfill a variety of communicative functions, including linguistic ones (Cienki, 2017; Müller, Ladewig, & Bressem, 2013). Gesture is a resource that plays a crucial role in second language acquisition, for instance in classroom settings (McCafferty & Stam, 2008). In face-to-face tandems, gesture is a shared resource that participants can mobilize to bridge the L1/L2 language and culture gap. The four studies presented here provide insights into how tandem participants use gesture to collaborate, adjust to their interlocutor, negotiate and secure meaning, and make communicative progress.
Study 1 (Debras et al., 2020) focuses on Corrective Feedback (CF, after Lyster & Ranta, 1997; Sheen & Ellis, 2011), understood as an equivalent of negative evidence, that is, "the type of information that is provided to learners concerning the incorrectness of an utterance" (Gass, 2003, p. 225). It shows how tandem participants mobilize gesture during CF sequences (Graziano & Gullberg, 2013) and what cultural differences can be observed in the way native speakers of French and of English position themselves as experts or learners.
Study 2 spells out the functions of gestural alignment (i.e. the cross-speaker repetition of a gesture form; Atkinson, Churchill, Nishino, & Okada, 2007; Kimbara, 2006) during metalinguistic sequences. During communication breakdowns, tandem participants engage in metalinguistic sequences in which gestural alignment plays a key role.
Study 3 (Debras & Beaupoil-Hourdel, 2019) documents the key contribution of gesture to anaphora, or reference tracking (Gullberg, 2006). When participants mention a given referent multiple times as the interaction unfolds, the form of the referent's successive mentions (which form a reference chain) can change. Since tandem interactions are characterized by referential instability and linguistic insecurity, gesture is crucial as a shared resource used to secure reference chains.
Study 4 documents how native speakers adapt their gestures when addressing L2 learners as part of foreigner talk in the tandem data. Indeed, foreigner talk (i.e. the linguistic and conversational adjustments made by native speakers who address nonnative speakers, after Ferguson, 1975) affects gestures as well (Adams, 1998).

Corpus and methodology
The studies reviewed in this chapter are based on data from the SITAF project (Spécificités des interactions verbales dans le cadre de tandems linguistiques Anglais-Français) coordinated by Céline Horgues and Sylwia Scheuer at Sorbonne Nouvelle University (Horgues & Scheuer, 2015). This project was created with two main goals: first, to provide students with the opportunity to improve their language, communication, and intercultural skills by participating in language tandems, and second, to collect language tandem data to create a learner corpus (Gilquin, Granger, & Paquot, 2007; Granger, Gilquin, & Meunier, 2015) in order to analyze the participants' practices and measure their progress. The project paired up undergraduate students who were native speakers of French and English and were learning each other's language as an L2. The French participants were undergraduate students from Sorbonne Nouvelle University, and the English-speaking participants were exchange students at this university who came from the United States, United Kingdom, Australia, Ireland, and Canada. Participants were paired up on the suggestion of the coordination team after filling out an online questionnaire about their linguistic background and level of L2 proficiency as well as their interests and preferred conversation topics. The participants' spoken language proficiency was not formally evaluated other than by self-assessment for L2 oral comprehension and expression as part of the online questionnaire. Their language proficiency varies from level B1 to C1 according to the CEFR (Council of Europe, 2001).
The corpus collected as part of the project is a 25-hour collection of videotaped interactions in 21 tandems (42 participants). Two sessions of each tandem were video-recorded in the university's recording studio at a three-month interval in February and May 2013. Partners were encouraged to hold unsupervised meetings once a week between the two recording sessions, and each pair met an average of 12 times. In each recording session, participants engaged in three tasks: a reading task and two game-like communicative activities aiming at eliciting storytelling and argumentation, respectively. The four studies presented in the chapter are based on the second recording session of the storytelling game Liar, Liar, in which the nonnative speaker tells a personal story to the native speaker and hides three lies in it. A discussion ensues during which the native speaker has to guess the lies. To allow comparisons of the ways a speaker communicates during exolingual (L1/L2) and intralingual (L1/L1) interaction, an extra round of recording was done during the May session, in which all the participants performed the three tasks addressing a native speaker. Study 4 is based on a comparison between the L1/L2 and the L1/L1 data.
The technical setup was well suited for the study of nonverbal cues: three cameras were used (one aimed at each participant and one capturing the whole set), allowing a rich capture of the various dimensions of gestural output (Mondada, 2006). Although the interactions were constrained in some respects (i.e. participating in a task, sitting on stools in the university recording studio, the presence of recording devices), they remained spontaneous in character. Sitting on stools did not prevent participants from moving freely from the waist up, where most gesture articulators are located.
The method used in the four studies is rooted in multimodal interaction analysis (Ferré, 2019; Norris, 2004) and the linguistic analysis of gesture (Müller et al., 2013). The discourse content of the tandem interactions was transcribed, and the transcriptions were aligned with the video recordings in the software ELAN (Wittenburg et al., 2006). The analyses relied on one or both of the following:
• the systematic annotation of gesture forms and functions (Bressem, Ladewig, & Müller, 2014; Kendon, 2004) in ELAN (Wittenburg et al., 2006) to provide a quantitative overview of gesture use in the data; and
• the moment-to-moment, qualitative analysis of interaction sequences (Goodwin, 2013) to provide insight into the variety of observable processes at play.
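To illustrate how time-aligned annotations of this kind can be turned into a quantitative overview, the following Python sketch reads one tier from an ELAN annotation file (.eaf, an XML format). It is not the procedure used in the four studies: the tier name "Gesture", the labels, and the time values are hypothetical, and the embedded document is a pared-down stand-in for a real ELAN file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, pared-down EAF content. The tier name "Gesture" and the
# labels are illustrative assumptions, not taken from the SITAF corpus.
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="4000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="4600"/>
  </TIME_ORDER>
  <TIER TIER_ID="Gesture">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>iconic</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>deictic</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tier(eaf_xml, tier_id):
    """Return (start_ms, end_ms, label) for each annotation on one tier."""
    root = ET.fromstring(eaf_xml)
    # Map time-slot identifiers to millisecond values. This sketch assumes
    # every slot carries a TIME_VALUE, which real files do not guarantee.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
    }
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            label = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, label))
    return rows


def mean_duration_s(rows):
    """Average annotation duration in seconds."""
    return sum(end - start for start, end, _ in rows) / len(rows) / 1000


gestures = read_tier(EAF_SAMPLE, "Gesture")
print(gestures)
print(mean_duration_s(gestures))
```

Rows extracted this way can then be counted per label or per speaker, which is the kind of overview that the systematic annotation is meant to support.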

Study 1: the multimodality of CF
Repairs proposed by more skilled speakers play a key role in L2 learners' acquisition of a language (Gass, 2003; Long, 2007; Sato & Lyster, 2012). Interactions with expert speakers allow language learners to notice the gap between their L2 production and the target form (Schmidt, 1990; Mackey, 2006), enabling them to adjust mental representations of the target forms, reshape hypotheses about them, and modify their output accordingly.
Study 1 (Debras et al., 2020) pertains to a sample of the corpus data that includes the eight recordings (four in French, four in English) with the most occurrences of CF. In the 58 minutes and 37 seconds of total footage selected, the coding yielded 128 occurrences of CF. Of these, 91 were in French and 37 in English; that is, about 71% of the CF in this sample (91 of 128) was given by French native speakers.
CF is an interactional process organized in different phases: (1) request (by the nonnative speaker), (2) provision (by the native speaker), and (3) uptake (by the nonnative speaker). Phase 2 (CF provision) is always present, but Phases 1 and 3 (request and uptake) are optional.
In Study 1, the 128 occurrences of CF collected in the eight recordings were double-coded in ELAN for the following features:
• CF type: recast, explicit correction, clarification request, suggestion, etc., based on existing categories described in the literature;
• CF request: requested, not requested, request unclear;
• CF uptake: uptake, partial uptake, acknowledgment, no uptake; and
• semiotic resources mobilized for each phase of CF: (1) verbal: discourse; (2) vocal: intonation, hyperarticulation; and (3) visual: manual gestures, gaze, head movements, and facial variations (smile, frowning, squinting eyes…).
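The coding scheme above can be pictured as one record per CF occurrence. The Python sketch below is only an illustration of that idea: the class name, field names, and the three sample records are hypothetical, and the list of CF types is left open because the scheme itself ends in "etc."

```python
from collections import Counter
from dataclasses import dataclass

# Closed value sets copied from the coding scheme described above.
REQUEST_VALUES = {"requested", "not requested", "request unclear"}
UPTAKE_VALUES = {"uptake", "partial uptake", "acknowledgment", "no uptake"}


@dataclass(frozen=True)
class CFOccurrence:
    """One coded occurrence of corrective feedback (hypothetical layout)."""
    cf_type: str              # e.g. "recast"; open-ended in the scheme
    request: str              # one of REQUEST_VALUES
    uptake: str               # one of UPTAKE_VALUES
    visual_in_provision: bool  # was the visual modality used in Phase 2?

    def __post_init__(self):
        # Reject labels that fall outside the closed categories.
        if self.request not in REQUEST_VALUES:
            raise ValueError(f"unknown request value: {self.request!r}")
        if self.uptake not in UPTAKE_VALUES:
            raise ValueError(f"unknown uptake value: {self.uptake!r}")


def tally(occurrences, field):
    """Count how often each value of one coded feature occurs."""
    return Counter(getattr(o, field) for o in occurrences)


# Invented records for demonstration only; they are not corpus data.
sample = [
    CFOccurrence("recast", "not requested", "uptake", True),
    CFOccurrence("explicit correction", "requested", "acknowledgment", True),
    CFOccurrence("recast", "request unclear", "no uptake", False),
]

print(tally(sample, "cf_type"))
print(tally(sample, "uptake"))
```

Validating the closed categories at construction time is one way to catch typing slips early when two coders annotate the same occurrences independently.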
The visual modality is used very frequently in all three sequential phases of CF: 96% of the time during request, 93% during provision, and 87.5% during uptake. During CF provision, the majority of head nods and metadiscursive hand gestures are produced by French native speakers when providing normative CF (i.e. an explicit correction or a recast). Their preferred kinesic forms point to a more professorial positioning on the part of French native speakers: They visually confirm the 'right', target-like form with a nod of the head or embody their expert's stance by literally manipulating the target language through metadiscursive gestures.
English native speakers provide fewer normative forms of CF than French natives and use more lifted eyebrows to do so. This type of facial display can serve "to signal and monitor affective cues between the participants" (Peräkylä & Ruusuvuori, 2006, p. 132) or mark the reception of information as unexpected, with possible nuances such as signifying that something is new, interesting, surprising, worthy of notice. English natives thus attend to affective relations with the interlocutor by marking the reception of the learner's production as worthy of notice or interesting. This gesture could be interpreted as a strategy for toning down the normativity of CF by emphasizing friendliness toward and interest in the interlocutor.
During CF requests, English learners of French account for the most metadiscursive gestures, thereby displaying more metalinguistic awareness of the learning process. They make more frequent use of frowning or squinting eyes, which display uncertainty or distance from the discourse they are producing. These visual cues can be taken up as appeals for the French native to provide a more target-like form. English learners of French not only request more feedback than French learners of English but also mobilize more visual resources to make it visually more obvious and explicit that they are doing so.
Overall, because of their physical behavior during CF requests, English natives can appear more proactive in the role of learner than French natives do.
Most CF uptakes are performed by English learners of French. They use visual resources more often than French learners to do so, mostly in the form of metadiscursive gestures, head nods, and lifted eyebrows. These kinesic forms all participate in expressing the learners' metalinguistic awareness; learners explicitly inform native speakers that they are involved in taking up the CF provided. For instance, lifted eyebrows can be used to receive CF as new information, while head nods can indicate affiliation (Stivers, 2008), signaling that the CF is understood and accepted. Conversely, French learners of English more rarely respond to CF and more rarely accompany their response with gestures than English learners of French do. The fuller kinesic involvement of English learners of French suggests that they position themselves as more eager students, whereas French learners of English appear more passive.
The multimodal analysis of CF sequences showed different strategies on the part of French and English native speakers. French native speakers provided more CF, gave more normative CF, and used more visual cues during CF provision. Conversely, English natives provided less CF, gave less normative CF, made more CF requests and uptakes as learners, and used more visual cues for request and uptake. Based on these observations, French speakers may appear more proactive as experts, whereas English speakers could be perceived as more proactive as learners. These results might be due to a variety of factors, namely different sociocultural orientations to CF with varyingly prescriptive conceptions of what it means to learn and to speak a language as well as the fact that the interactions are taking place in France.

Study 2: gestural alignment in the negotiation of meaning
Study 2 (Debras & Beaupoil-Hourdel, forthcoming) focuses on gestural alignment (Atkinson et al., 2007), which can be used by the L2 learner to bridge lexical gaps, with the native speaker's subsequent aligned gesture enabling the participants to secure the referent. Aligned representational gestures can also be used by the L2 learner to display understanding by securing the referent. Gestures can also scaffold the L2 learner's appropriation of new vocabulary. When gesture is sufficient to ensure mutual understanding, the visual modality can take over from speech, with neither participant ending up naming the referent that has been identified visually, as shown in Table 2.
In this excerpt, the native speaker asks the language learner for further detail about a Christmas tree that she made. The nonnative speaker explains that she made it out of twisted wire, which is a challenge because neither twist nor wire is part of her vocabulary. She hence resorts to the visual modality, gesturally enacting the twisting of a wire and using speech only to specify the material she used ("I made it with iron").
The learner's multimodal utterance integrates gesture into the linearity of speech (Ladewig, 2014). The missing lexeme twist is specified by multimodal clues: the nonnative speaker's gesture fills the gap of a predicate after the generic subject pronoun you, indicating that the missing lexeme is a verb. The gestural enactment (circular gestures, holding a thin object) indicates that the unnamed referent is a durative action, thereby specifying the lexical aspect of the missing dynamic verb (twist). Wire can be retrieved by the metonymic specification of the material used (iron) as well as by the hand shape (folded fingers as if holding a long, thin object). The native speaker immediately takes up the language learner's gesture, showing her understanding of the whole complex predicate twisting the wire: Her mirroring circular gestures indicate the understanding of twist, and the imitated hand shape with folded fingers shows understanding of what is twisted (wire). The two participants utter yeah simultaneously, thereby confirming mutual understanding. This sequence shows that gesture can take over from speech in conveying the main information of an utterance. It also shows how 'gesture-craft' (Streeck, 2009) is a highly efficient modality for representing the activity of hand-crafting an object.
Gestural alignment can even be used in metalinguistic sequences where the vocabulary is fairly transparent. Gestural alignment is transferred from discourse objects to discourse as an object; gesture forms with attitudinal and interpersonal functions are used to secure the participants' mutual engagement in the metalinguistic sequence itself. For instance, in a sequence in English, a native speaker provides feedback on the plural form of the noun goose. To do so, he identifies the target of his feedback in speech (you can say for uh there's more than one goose, there're geese), by lengthening the vowel [i:] of geese, and in gesture, by virtually holding the word between his extended thumb and index finger. The nonnative speaker immediately takes up the target word, mirroring the native speaker's visual and vocal exaggeration. The native speaker then provides further explanations on grammar and irregular spelling (yeah it changes to E E in the middle), combined with a representational gesture that indexes the activity of writing. With his index finger, he traces the letter 'e' twice in the upper center of his gesture space, high enough to meet the nonnative speaker's gaze and catch her attention (Figure 1). Again, the nonnative speaker immediately aligns with the native speaker, mimicking the writing of 'e' twice with her index finger high up at gaze level (Figure 2), while repeating the target word geese with visual and vocal exaggeration to show that she has taken in the target form.
This metalinguistic sequence targets phonology, grammar, and spelling rather than vocabulary; visual alignment is used to secure the participants' shared awareness of being involved in a metalinguistic sequence. Gestural alignment is a crucial locus of interpersonal resonance and the collaboration of speakers in interaction. Tandem participants use it for various interactional goals, from mutual understanding to language learning. The gradual stabilization of referents through speech and gestures shows how meaning is an unfolding process that relies on the accumulation of forms, which, once used, become part of a public substrate (Goodwin, 2013), namely the collection of semiotic forms used by speakers that constitute a common set of reusable, decomposable, and transformable resources for the intersubjective construction of meaning in interaction.

Study 3: the use of gesture in reference tracking
Study 3 is a detailed qualitative analysis of interaction sequences rooted in a formal approach to gesture analysis (Boutet, 2015). It shows how chains of reference (Schnedecker & Landragin, 2014) are constructed sequentially, multimodally, and interactively during tandem conversations. Reference stability and co-referentiality are key issues for mutual comprehension and for the co-construction of meaning in conversation, all the more so during exolingual interactions. Gestures can contribute to the construction of referents by fulfilling anaphoric functions (Navarretta, 2011), deictic ones (Kita, 2003), or representational ones (Müller, 2014).
Exolingual interactions have a direct effect on the formal characteristics of gestures. Native speakers tend to use more gestures, and their gestures are extended in time and space (they last longer and are ampler, and iconic gestures are more frequent; Adams, 1998; Tellier & Stam, 2012; Study 4). Language learners can use representational gesture to fill lexical gaps (Ladewig, 2014) and tend to use co-referentially overexplicit speech (i.e. overuse of full lexical nominal expressions but limited use of pronouns). Visually, the repetition of full noun phrases is synchronized with anaphoric gestures (Gullberg, 2006) that maintain a referential locus in the gesture space (Perniss, 2012).
When gesture forms are repeated by the speaker or taken up by the interlocutor (Bressem, 2014; Study 2), participants never actually reproduce a gesture in its exact same form. Formal variations in the gesture's realization are often meaningful in terms of the referent's informational status (i.e. as new/foregrounded or old/backgrounded information). For that reason, the term 'gesture reiteration' is preferred to 'gesture repetition'. Study 3 shows how these reiterations of the speech of self and others are concatenated and combined with gesture to sequentially co-construct chains of referents that evolve as the conversation unfolds, in a context of referential instability and linguistic insecurity that is typical of tandem interactions.
Study 3 shows that links of one and the same reference chain can be expressed monomodally in speech or gesture only, or multimodally, in a combination of both. Multimodal referential expressions can combine mentions from different reference chains that can be expressed simultaneously with each hand representing a distinct referent, as illustrated in Figure 4: the native speaker of English represents a Christmas tree with one forearm and hand and a Christmas bauble hanging from it with the other hand. Two different gesture forms can be combined with the same lexical form in speech to highlight different characteristics of the referent; in that case, maintaining the same locus in the gesture space (Perniss, 2012) or an object of similar size helps identify two different gestures as being related to one and the same referent. Gesture reiterations involve two major processes, namely the reduction or the expansion of the reiterated form. Reduced forms imply the reduction of one or many formal features: The reiterated gesture can be faster, smaller in amplitude, or less articulated, or it can involve one hand only when both hands have been used previously. A referent that has first been presented in three dimensions (modeling as per Müller, 2014) or two (tracing), can be taken up in a more schematic way that involves fewer dimensions. Formal reduction can involve only one modality at a time. When an already established referent is used as visual background for another, it can be sketchier (e.g. fourth mention of the Christmas tree, Figure 4) than a more detailed, previous mention (third mention of the Christmas tree, Figure 3; see also Holler & Bavelas, 2017). A sketchier gesture form can also be used when it is repeated by the interlocutor to confirm understanding. The development of common ground (Clark, 1996) between the participants as the interaction unfolds is a factor that explains the formal reduction of gesture reiterations. 
The reduction of subsequent visual reiterations can also be analyzed as a process of gesture conventionalization at the scale of an interactional sequence (LeBaron & Streeck, 2000). In all, gesture reiterations display features similar to proforms in speech, reflecting the accessibility of referents (Ariel, 1990): More reduced forms can be used once the referent's status has shifted to known information (Gundel, Hedberg, & Zacharski, 1993).
Gesture reiterations can also involve formal expansion: they can last longer, be ampler, or be more precise. For instance, a representational gesture can go from tracing to modeling, or from two to three dimensions, as exemplified by the first and second representations of a Christmas bauble (respectively in Figure 5 and Figure 6), by the French native, who is speaking in English and filling a lexical gap with gestures to refer to a 'bauble'. A first, sketchier representation in gesture only can anticipate a fuller speech and gesture representation. Articulatory efforts aiming to produce a more developed gestural representation are typically used by native speakers adapting their communication style to facilitate nonnative speakers' understanding (Adams, 1998). More broadly, expanded gesture reiterations show how gestures belong in the 'public substrate' (Goodwin, 2013): the dynamic production of new forms based on shared ones enables structure-preserving transformations that are necessary for future actions to unfold.

Study 4: gestures of foreigner talk
Foreigner talk encompasses all the linguistic and conversational adjustments made by native speakers when speaking to nonnative speakers (Ferguson, 1975). It can involve syntactic changes (e.g. shorter, less complex sentences), semantic ones (e.g. simpler lexicon), and articulatory ones (e.g. slower flow of speech). As shown by Adams (1998), foreigner talk affects gesture as well. Adams's (1998) study compared the use of gestures by native speakers of American English when addressing Korean students versus when addressing other native speakers. When addressing language learners, native speakers used more pantomime, iconic (representational) gestures, and deictic gestures (pointing), although only the higher rate of deictic gestures proved statistically significant. Possibly because the participants had no metalinguistic awareness of their own use of gesture, they did not use fewer metaphorical (more abstract) gestures or emblems (more conventionalized) with language learners (see McNeill, 1992, for detailed definitions of these gesture functions).
Tellier and Stam (2012) studied the use of gesture by students who were training to become teachers of French as a foreign language when they explained action verbs to Erasmus students who were learners of French versus other native speakers of French. As future teachers, they were more sensitized to the needs of language learners; they did not use significantly more gestures, but their gestures were significantly longer and ampler when addressing learners. The rate of iconic gestures was significantly higher when addressing learners, while the rate of metaphorical gestures was significantly higher when addressing native speakers.
Study 4 (conducted by Léa Baldran and myself) focuses on the kinesic behavior of five native speakers of French participating in the storytelling game Liar, Liar with a native speaker of English who was learning French versus a native speaker of French. The ten video recordings amount to a total length of 58 minutes. Systematic annotation was carried out in ELAN (Wittenburg et al., 2006) to quantify various features of the gestures used, including the use of nonmanual gestures (head gestures as per McClave, 2000, and facial gestures of the mouth and eyebrows as per Bavelas & Chovil, 2018). A portion of the annotations was double-coded to ensure their reliability.
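The chapter does not state which agreement measure was applied to the double-coded portion. As an assumption, the Python sketch below uses Cohen's kappa, a common chance-corrected measure for two coders assigning categorical labels; the function name and the example labels are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e the agreement expected by chance from each
    coder's own label frequencies.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # A Counter returns 0 for labels the other coder never used.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one and the same label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented labels for four gestures, for demonstration only.
coder_1 = ["manual", "manual", "nonmanual", "nonmanual"]
coder_2 = ["manual", "nonmanual", "nonmanual", "nonmanual"]
print(cohen_kappa(coder_1, coder_2))
```

In this invented example the coders agree on three of four items (0.75) while chance agreement is 0.5, so kappa comes to 0.5.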
Study 4 yielded the annotation of 1,018 gestures of the hands, head, and face (269 manual gestures and 749 nonmanual gestures). The frequency of gestures produced was overall higher when addressing a nonnative speaker (18.4 gestures/minute on average) than a native speaker (14.9 gestures/minute on average). Participants produced far more nonmanual gestures than manual ones in both conditions, yet the proportion of manual gestures increased when addressing a nonnative speaker. Indeed, out of the 285 gestures addressed to native speakers, 230 were nonmanual (81%) and 55 were manual (19%); out of the 733 gestures addressed to nonnative speakers, 519 (71%) were nonmanual and 214 (29%) were manual (see Figure 7 for the distribution of nonmanual gestures in the data). This suggests that tandem participants who are not training to become foreign language teachers but are sensitive to the language gap when addressing a nonnative speaker spontaneously intensify the use of markers of affect and interpersonal relations (e.g. eyebrow movements, smiling, head nods). In contrast to Tellier and Stam's (2012) findings, there was no drastic change in the duration of manual gestures depending on whether the speaker addressed a native speaker (2.4 seconds on average) or a nonnative speaker (2.3 seconds on average). No striking change was observed either in the gestures' amplitudes, annotated in terms of their realization in the center or periphery of the gesture space, following a simplified, two-fold partition of the gesture space inspired by McNeill's (1992, p. 89) model. This could be due to the fact that participants in Tellier and Stam's (2012) study are future teachers sensitized to the needs of language learners, contrary to the tandem participants in the data. Gesture functions were grouped in terms of cultural functions (including emblems and metaphorical gestures) and referential ones (including deictic and iconic gestures).
Although the participants in the data set did use more referential gestures in the presence of nonnative speakers, they used far more cultural gestures overall in both conditions. Again, this result contrasts with Tellier and Stam's (2012) finding that future language teachers use more iconic gestures when addressing nonnative speakers. Being professionally trained to address nonnative speakers seems to be a crucial differentiating factor in this case as well. And yet, the participants use a more varied repertoire of manual gesture functions when addressing a nonnative speaker than a native speaker; again, this suggests an attempt on their part to adjust their communication to a nonnative interlocutor.
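The manual and nonmanual proportions reported for Study 4 can be recomputed from the raw counts. The following is a minimal sketch for readers who wish to check the arithmetic; it is not the procedure used in the study, and the names `counts`, `shares`, `totals`, and `percentages` are illustrative assumptions.

```python
# Gesture counts reported for Study 4, by addressee type.
# The data structure and names are hypothetical, chosen for illustration only.
counts = {
    "native": {"nonmanual": 230, "manual": 55},
    "nonnative": {"nonmanual": 519, "manual": 214},
}


def shares(condition_counts):
    """Return each gesture type's share of the condition total, as a whole percentage."""
    total = sum(condition_counts.values())
    return {kind: round(100 * n / total) for kind, n in condition_counts.items()}


# Total number of gestures addressed to each type of interlocutor.
totals = {condition: sum(c.values()) for condition, c in counts.items()}

# Rounded percentage of manual and nonmanual gestures in each condition.
percentages = {condition: shares(c) for condition, c in counts.items()}

# Overall number of annotated gestures across both conditions.
grand_total = sum(totals.values())

for condition in counts:
    print(condition, totals[condition], percentages[condition])
print("all gestures:", grand_total)
```

The per-minute frequencies are deliberately left out of this sketch, since the text reports them as averages without giving the recording time per condition.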

Perspectives for higher education language training and professionalization
The research presented above has shown how gesture plays a central communicative role in interactional contexts (Goodwin, 2013; Kendon, 2000), especially in exolingual ones (McCafferty & Stam, 2008). Language tandem participants use gesture for a variety of purposes: notably to collaborate, stabilize referential meaning, ensure discourse cohesion and mutual comprehension, scaffold language learning, provide feedback, adjust to their interlocutor, secure mutual engagement, and develop interpersonal relations. The use of gesture also mirrors how participants spontaneously adapt to the intercultural aspect of the language tandem. Gestures are more frequent (especially nonmanual ones) and their functions are more varied when addressing a nonnative speaker. The use of gesture also reveals differences in sociocultural positionings toward what it means to speak a target language, to learn it, and to help others learn one's mother tongue. Based on this research, I propose recommendations for the professionalization of language learning in higher education. I focus on language learning, teacher training, and the institutional recognition of the relationship between research, pedagogy, and innovation.

Language learning and professionalization
Universities are becoming increasingly international places, with a substantial potential for preparing students for the international workplace. Although students and faculty from all over the world make universities places of cultural diversity, this cultural diversity could often be more recognized or valued. Exchange students often stay for a short period of time, and often end up socializing mostly among themselves. Study abroad offices could collaborate more systematically with faculty so as to facilitate links between exchange students and local ones who volunteer for intercultural meetings, exchanges, and projects, of which the language tandem is just one example among many. While the European Commission (2012) remarks that "not all languages are equally valuable on the labour market" (p. 13), linguistic diversity remains absolutely vital for cultural and personal development.
The research on language tandems presented above shows the considerable potential of intercultural settings during which university students of diverse backgrounds interact as peer partners to achieve a common goal. Student projects with a core intercultural component should be more systematically included in university curricula in order for all students to develop, value, and learn to integrate 'soft' skills which have now become a must, namely intercultural and interactional ones. More specifically, intercultural collaborative learning projects can help students develop the following skills: becoming sensitized to openness and diversity, becoming more open and responsive to new and diverse perspectives, bridging cultural differences, using differing perspectives to increase the quality of work, and using appropriate sociolinguistic skills in order to function in diverse cultural and linguistic contexts (ACTFL, 2011). Intercultural collaborative learning projects can include more than two participants, and they can aim at personalized real-world tasks to ensure motivation and/or a specific domain to develop expertise. Another benefit is that they can be included in lifelong learning programs (EUA, 2008). They can also take the form of online collaborations if opportunities to be in the same room are limited (Guth & Helm, 2012). Whether face-to-face or online, collaborative learning arrangements will give university students opportunities to develop their communicative competence as well as a large array of other skills: responsibility and autonomy, creativity and innovation, problem-solving, and social and cross-cultural skills. By developing intercultural collaborative projects in their curricula, universities can at once professionalize students and promote an inclusive learning society.

Language teaching, teacher training
The research presented above can also inform teacher training in two main directions: developing more learner-centered approaches to language learning, and including more multimodal, interactional, and intercultural aspects of communicative competence in both teacher training and language teaching. Although the language tandem was invented to complement language learning in the classroom, it is today closer to what is considered the essence of language learning: the learner is active in a learner-centered arrangement, developing personal skills in the context of an authentic exchange, while the language teacher remains at the periphery as a facilitator (ACTFL, 2011). Future higher education language teachers could be trained to set up language tandems so as to diversify opportunities for learners to use language beyond the classroom, but they could also be encouraged and trained to develop their students' interactional skills in the classroom directly. As facilitators supervising language tandems, teachers should be trained to develop specific know-how, such as ways to be available to students, means of securing students' motivation, and forms of nonintrusive verification that tandem meetings are taking place. If assessment of students' progress is planned, it will require careful design and the targeted skills (language, communication, and intercultural competence) will need to be made explicit.
Language teacher training should also more systematically include research-based modules on how to develop the multimodal, interactional, and intercultural aspects of learners' communicative competence. As Tellier and Yerian (2018) suggested, the training of future language teachers should cover multimodal communication (e.g. topics such as the role of gestures in the multimodal co-construction of meaning, gesture functions, manual and nonmanual gestures, gestural alignment, and gestures related to foreigner talk), so that language teachers become more aware of the multimodal resources that are (literally!) at hand to enhance their communication and teaching skills. A key tool for future teachers' development of communicational self-awareness is retrospective reflection sessions based on videotaped recordings of their performance, for instance as part of exolingual interactions (Rivière & Guichon, 2014).
This method from applied research on online exolingual interactions can be transposed to in-class interactions by filming future teachers in training; seeing themselves teach allows them to study their own performance from an analytical and more distanced standpoint, as they go through past interactions again (Guth & Helm, 2012).
Teachers whose training has sensitized them to the multimodal dimension of exolingual interaction can, in turn, sensitize their students to the role of gesture. Two basic pedagogical goals come to mind: first, encouraging students to go beyond stereotyped perceptions of gesture, whose functions are usually broader and more complex than they might think, and second, helping them develop self-awareness of their own bodies in communication, and an awareness of how gesture can enhance or hinder communication. From the perspective of intercultural interaction, it will be especially useful for students to learn to distinguish between the idiosyncratic, cultural, and iconic dimensions of gesture but also to become aware that these dimensions are not always easy to tease apart. Interactional exercises (e.g. role-play, theater exercises, and public speaking) followed by reflective discussion can be used to help students pinpoint differences in the ways language users communicate in intralingual and exolingual contexts. Interactional exercises can be used to discuss sociocultural variation in the use of speech and gesture and cover notions like foreigner talk or the functions of other-repetition. During reflective sessions, language learners can for instance become aware that they already use gestural alignment and reiteration when speaking in their native tongue, and they can be encouraged to transpose this strategy to exolingual or L2 interactions so as to secure meaning and interpersonal relationships.

Institutional recognition of the relationship between research, pedagogy, and innovation
As a learner corpus (Gilquin et al., 2007; Granger et al., 2015), the SITAF data shows the need for gesture to be more systematically included in language class curricula, language teachers' training, and language evaluation frameworks. Gestural cues should not be relegated to compensatory strategies used by pupils and/or beginners (although they can serve that purpose as well); gesturing skills are closely intertwined with spoken language skills, which should be recognized in levels B and C of CEFR descriptors (Council of Europe, 2001), when it comes to speaking, understanding, and interactional skills.
Interactional and intercultural competences are transversal to all domains, both at university and throughout professional life. As such, they could become core topics in lifelong learning in higher education. As the European Universities' Charter on Lifelong Learning (EUA, 2008) suggests, successful lifelong learning will rely on a strengthened relationship between research, teaching, and innovation, an idea that this chapter has, I hope, exemplified to some extent. A research perspective on multimodal interaction shows that the contribution of gesture is at once essential, subtle, and complex, at the crossroads of culture, language, and communication. More broadly, lifelong learning opportunities developed by universities can provide uniquely innovative training based on research; in turn, lifelong learning can itself be a great source of new research methodologies and topics. After all, university researchers themselves are a fine example of lifelong learners whose own educational needs are continually evolving (EUA, 2008).

Conclusion
Taking research on the use of gesture in language tandems as a point of entry, this chapter has proposed directions for training future professionals who communicate in exolingual interactions. It draws on research findings to inform pedagogy and innovation in higher education, and advocates for an increased institutional recognition of multimodal, communicative, and intercultural skills. These soft skills are central in today's internationalized professional life, and for that reason they need to become core features of language and communication training for future professionals, including future language teachers. On a final note, preparing university students for the professional world is at once a fundamental mission of higher education and a continuous and exciting challenge: that of constantly adjusting to the transformations of the professional world of today and tomorrow.