A review of data collection practices using electromagnetic articulography

This paper reviews data collection practices in electromagnetic articulography (EMA) studies, with a focus on sensor placement. We first introduce electromagnetic articulography as a method. We then focus on existing data collection practices. Our overview is based on a literature review of 905 publications from a large variety of journals and conferences, identified through a systematic keyword search in Google Scholar. The review shows that experimental designs vary greatly, which in turn may limit researchers’ ability to compare results across studies. Finally, we describe an EMA data collection procedure that includes an articulatory-driven strategy for determining where to position sensors on the tongue without causing discomfort to the participant. We also evaluate three approaches for preparing (NDI Wave) EMA sensors reported in the literature with respect to the duration the sensors remain attached to the tongue: 1) attaching out-of-the-box sensors, 2) attaching sensors coated in latex, and 3) attaching sensors coated in latex with an additional latex flap. Results indicate no clear general effect of sensor preparation type on adhesion duration. A subsequent exploratory analysis reveals that sensors with the additional flap tend to adhere for shorter times than the other two types, but that this pattern is inverted for the most posterior tongue sensor.


Introduction
Electromagnetic articulography (EMA) is a popular technique for the study of speech production that supports the tracking of articulatory kinematics using sensors attached primarily to the tongue, lips, and jaw. This paper provides a comprehensive overview of studies that have used EMA as a method for the investigation of speech-related topics, with the ultimate goal of characterizing various data collection procedures and comparing them to our own practices. In Section 2, we introduce electromagnetic articulography and address some methodological considerations, such as device safety and accuracy, usage, and general sensor placement guidelines. Section 3 continues with a discussion of data collection practices drawn from a systematic literature review of 905 publications from conferences and journals published since 1987. In this contribution, we focus on 412 journal publications. Sections 4 and 5 of this paper are practical, as we describe our own data collection procedure in detail, and we evaluate the adhesion duration of three different types of sensors through a sensor adhesion experiment. We hope this paper will be of help to those starting out with EMA data collection.

Advantages and limitations of EMA
Electromagnetic articulography (EMA) 1 is a point-tracking method in which sensors placed on target articulators (including tongue, lips, and jaw) are used to track movement in real time in 3D. As with any method, there are both advantages and disadvantages to EMA (Kochetov, 2020; Earnest & Max, 2003; Maeda et al., 2006; Mennen, Scobbie, de Leeuw, Schaeffler, & Schaeffler, 2010; Stone, 2010; Whalen et al., 2005). We first discuss some advantages of EMA. The data collected within the oral cavity has high spatial accuracy and temporal resolution (see Section 2.4 below), yielding relatively precise information on articulatory gestures. Unlike some other methods (such as ultrasound tongue imaging), EMA can measure multiple articulators simultaneously and therefore allows the investigation of inter-articulatory interactions. It is one of the few methods that allows researchers to study movements of articulators directly, as opposed to more indirect acoustic methods. EMA is biologically safe (in contrast to some methods used in the past, such as x-ray cineradiography or microbeam) and minimally invasive. Furthermore, the sensors are mostly well-tolerated by adult participants and only moderately interfere with speech production (speakers adapt within 10 minutes; Dromey, Hunter, & Nissen, 2018). Compared to other methods used to track speech articulators, articulographs restrict the participants' movement less, they do not require a line of sight to the sensors (unlike, e.g., VICON or OptoTrak), and they are not restricted to in-plane visualization (unlike, e.g., real-time magnetic resonance imaging or ultrasound tongue imaging).
However, several limitations should be considered when employing EMA for speech-related investigations. For example, the positioning of sensors is limited to the anterior oral tract. It is more problematic to place sensors on the more posterior part of the tongue (e.g., tongue dorsum) than its anterior part, and it is not possible to track velum movements without discomfort to the participants (see exceptions below). Furthermore, depending on the size and location of the articulator of interest, it is not possible to place many sensors on an articulator at the same time due to mutual electrical interference and increased perturbation of articulation. Additionally, sensors still cannot be placed too close to each other without disturbing their measurement accuracy (the Carstens AG500 manual, for example, states that the minimum distance between sensors should be 8 mm), which again limits the number of points that can be tracked on the articulators. Furthermore, because EMA is a fixed point-tracking technique, it does not capture the global movements of articulators, for instance the full midsagittal tongue shape (as obtained using rtMRI).
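To illustrate the spacing constraint, the following minimal Python sketch flags pairs of planned sensor positions that lie closer together than the 8 mm minimum quoted above for the Carstens AG500. The sensor names and coordinates are hypothetical, and the check is only an illustration of the constraint, not a procedure taken from any device manual.

```python
import math
from itertools import combinations

# Minimum sensor spacing (mm) as quoted in the text for the Carstens AG500.
MIN_SPACING_MM = 8.0


def too_close_pairs(sensors, min_spacing_mm=MIN_SPACING_MM):
    """Return (name_a, name_b, distance_mm) for every pair below the minimum spacing.

    `sensors` maps a sensor name to its (x, y, z) position in millimetres.
    """
    flagged = []
    for (name_a, pos_a), (name_b, pos_b) in combinations(sorted(sensors.items()), 2):
        distance = math.dist(pos_a, pos_b)  # Euclidean distance in 3D
        if distance < min_spacing_mm:
            flagged.append((name_a, name_b, distance))
    return flagged


# Hypothetical planned positions (mm); not measurements from any study.
planned = {
    "tongue_tip": (0.0, 0.0, 0.0),
    "tongue_mid": (-6.0, 0.0, 2.0),
    "tongue_back": (-25.0, 0.0, 5.0),
}

for name_a, name_b, distance in too_close_pairs(planned):
    print(f"{name_a} and {name_b} are only {distance:.1f} mm apart")
```

In this made-up layout only the tongue tip and tongue mid sensors would be flagged, which suggests moving one of them before attachment.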
Additionally, the equipment is expensive and requires a relatively high level of technical knowledge, prior training, and practice to use successfully. Finally, as sensors are firmly affixed to orofacial structures, they constitute a form of articulatory perturbation. While articulation does return to nearly normal after a while (see below), the acoustics are changed when sensors are attached (Meenakshi, Yarra, Yamini, & Ghosh, 2014). Nevertheless, some earlier problems (such as restricted head movement, the need for extensive calibration, and data being restricted to the midsagittal plane only) were present for previous articulographs, but have largely been eliminated with the newer devices (see more details below).
1 Electromagnetic Articulography (EMA) used to be known as Electromagnetic Midsagittal Articulography (EMMA). While the 'midsagittal' part is not applicable anymore as the sensors are tracked in 3D, both spellings remain in use in the literature. Other alternative names include '(electromagnetic) articulometry' and 'electromagnetometry.' The device can be called an EMA, an articulograph, an articulometer, or (especially in the early years) a magnetometer.

EMA devices
EMA systems have been used for speech-related research since the 1980s (see Figure 1 for an overview of EMA market releases). In the past, the MIT system articulograph (Perkell et al., 1992), the Movetrack system (Branderud, 1985), and the Aurora system (NDI; Kröger et al., 2000) were used as some of the first available commercial articulographs. 2 For the past two decades and up until recently, there were two main manufacturers with a continuing production of EMA devices, namely Carstens Medizinelektronik (Bovenden, Germany) and Northern Digital Inc. (Waterloo, Canada). Carstens Medizinelektronik has manufactured several articulography devices over time spanning from the late 1980s until now, including models AG100, AG200, AG500, and the most recent AG501. Northern Digital Inc. (NDI) has manufactured the Wave articulograph, which came to the market in 2009 and was discontinued with the arrival of their latest articulograph, the NDI Vox in early 2020. The NDI Vox has since then likewise been discontinued, as NDI decided to reduce their product portfolio (Northern Digital Inc., 2020). Consequently, at present only Carstens offers a commercial articulograph that has not been discontinued.
As articulographs are costly, it is not uncommon for a lab to use an older system despite a new version being available on the market. Regardless, considerable advancements have been made since the first commercial articulograph. Technological advances have made it possible to collect more comprehensive data, going from 2D EMMA (midsagittal) systems to 3D (or rather 5D) systems collecting three Cartesian coordinates and two angular coordinates (Hoole & Zierdt, 2010). Thus, although early articulographs only measured in one plane (i.e., the midsagittal plane), modern devices track data in three isotropic spatial and two angular dimensions, and sensor orientation is tracked in addition to position. Furthermore, early articulographs required extensive calibration before testing and restricted the participants' head movement, while modern systems permit free head movement.

Uses of EMA
Starting in the 1980s, EMA was designed as a way to track points both inside and outside the vocal tract. Early studies evaluated the suitability of [...] (2010) used it to study acquired apraxia of speech, and Yunusova et al. (2017) used it to provide feedback to patients with Parkinson's disease.
Figure 1: Timeline of articulographs. Note that the AG200 is not included as it was a combination of the AG500 with the helmet from the AG100. The Aurora system is not included because it was a point-tracking tool but not one meant exclusively for the study of speech production.

Accuracy and safety of EMA devices
Since the advent of EMA devices on the market, their sampling rate and number of channels have increased, and the accuracy has improved. Regarding the recording capabilities of the most recent articulographs, the NDI Wave and NDI Vox have a maximum sampling rate of 400 samples/s and can track 16 channels simultaneously (i.e., up to 16 sensors can be used). The AG500 can record 200 samples/s in 12 channels, while the AG501 can record 1250 samples/s in up to 24 channels (Sigona et al., 2018; Savariaux, Badin, Samson, & Gerber, 2017). The speed of current devices is more than enough to capture speech movements from the articulators. For example, Tasko and McClean (2004) indicated that the maximum speed of the tongue body during connected speech was 200 mm/s, and controlled (non-ballistic) movements are much slower. A sampling rate of 400 Hz thus has sufficient temporal resolution to track the fastest known articulatory movements. Several studies have investigated the spatial accuracy of articulographs. Berry (2011) reported that the Wave system showed < 0.5 mm errors for 95% of position samples recorded during human jaw movement for nine out of ten participants. A study on the Carstens AG500 has reported a median error of < 0.5 mm across different types of recordings, including manual movements and various speech tasks, with the error magnitude being dependent on calibration and on the location of the sensors in the electromagnetic field as well as on the proximity between the sensors (Yunusova, Green, & Mefferd, 2009). In addition, the AG500 was found to display some numerical instabilities and anomalies (Stella, M., Stella, A., Grimaldi, & Fivela, 2012) which were not predictable (Kroos, 2012).
Finally, a comparison between the Wave and several Carstens systems (namely the AG200, AG500, and AG501) revealed that all four devices showed a local precision of around 1 mm, but a large range of global precision, spanning from 3 mm to 21.8 mm (Savariaux et al., 2017), with the AG501 as the most accurate device with precision of 0.3 mm (RMS; Electromagnetic Articulograph, 2019). Comparisons of the AG500 and AG501 additionally revealed that the AG501 was more accurate, stable, and user friendly than the AG500 (Stella et al., 2013; Sigona et al., 2018). A recent study on the newest NDI articulograph (the NDI Vox, which has recently been discontinued) has shown it to be significantly more accurate than the NDI Wave, with an average sensor pair tracking error of 0.1 mm, although a direct side-by-side device comparison would be necessary to establish how the Vox compares with the AG501 (Rebernik, Jacobi, Tiede, & Wieling, in revision).
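The temporal-resolution argument made earlier in this section can be turned into a short worked calculation. The Python sketch below divides the peak tongue-body speed reported by Tasko and McClean (2004) by each device's maximum sampling rate as listed above, giving the distance a sensor could travel between two consecutive samples. The function and variable names are our own illustrative choices.

```python
def travel_per_sample_mm(speed_mm_per_s, sampling_rate_hz):
    """Distance (mm) covered between two consecutive samples at constant speed."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return speed_mm_per_s / sampling_rate_hz


# Peak tongue-body speed during connected speech, as quoted in the text (mm/s).
PEAK_TONGUE_SPEED_MM_S = 200.0

# Maximum sampling rates as quoted in the text (samples/s).
MAX_SAMPLING_RATE_HZ = {
    "AG500": 200,
    "NDI Wave / NDI Vox": 400,
    "AG501": 1250,
}

travel = {
    device: travel_per_sample_mm(PEAK_TONGUE_SPEED_MM_S, rate)
    for device, rate in MAX_SAMPLING_RATE_HZ.items()
}

for device, mm in travel.items():
    print(f"{device}: {mm:.2f} mm between consecutive samples")
```

At 400 samples/s, a sensor moving at the quoted peak speed travels 0.5 mm between samples. Slower, controlled movements are sampled correspondingly more densely.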
In general, electromagnetic articulographs are safe to use (Hasegawa-Johnson, 1998). The AG500, AG501, NDI Wave, and NDI Vox articulographs fulfil the safety requirements for electrical equipment as set by the International Electrotechnical Commission and the American Federal Communications Commission (Carstens AG500 Manual, 2006; Carstens AG501 Manual, 2014; Wave User Guide, Northern Digital Inc., 2009, rev. 2016; Vox User Guide, Northern Digital Inc., 2019). Note, however, that little research has been targeted specifically at the electromagnetic frequency ranges of EMA systems (Hoole & Nguyen, 1999; Earnest & Max, 2003). Furthermore, due to the moderate strength magnetic field 3 a few exclusion criteria must be considered that impact participant recruitment, predominantly the use of implanted devices that might be prone to electromagnetic interference. These include (as discussed in the Wave User Guide, Northern Digital Inc., 2009, rev. 2016, and Carstens AG500 manual, 2006):
- the use of a pacemaker (the magnetic field of the EMA may interfere with pacemaker operation; see Smith & Assen, 1992, for a description of how electromagnetic fields affect cardiac pacemakers);
- large metal objects in or around the head (such as a hearing aid or cochlear implant; see Crose, Kuk, & Bindeballe, 2011, and Tognola, Parazzini, Sibella, Paglialonga, & Ravazzani, 2007, for electromagnetic interference in hearing aids and cochlear implants, respectively);
- the use of insulin pumps (see Zhang, Jones, & Jetley, 2010, for a hazard analysis of insulin pumps).
Some studies have tested the potential adverse effects of the EMA magnetic fields on metal objects in the field and, vice versa, the effect of metal objects on the integrity of the collected EMA data. Katz et al. (2003) tested compatibility of the Clarion 1.2 S-Series cochlear implant with the Carstens AG100 articulograph in order to determine whether EMA affects the functioning of the implant and the participants' speech perception on the one hand, and whether the implant could potentially affect the accuracy of EMA data on the other hand. They determined that the tested cochlear implant was compatible with the AG100, as no adverse effects could be observed. Joglar, Nguyen, Garst, and Katz (2009) tested potential interference between pacemakers/implantable cardioverter-defibrillators and the Carstens AG100. They determined that devices from Medtronic (type D154VRC), St. Jude (types 5172 and V-193), and Guidant (types 1860, T180, 1852 and 1853) were compatible with the Carstens AG100. Finally, Mücke et al. (2018; see also Hermes, Mücke, Thies, & Barbe, 2019) tested Essential Tremor patients who had undergone thalamic deep brain stimulation (DBS) surgery. Participants were tested using the Carstens AG501 while the implant was active and inactive, with no reported adverse effects. However, as new articulographs and medical devices are introduced, it is necessary to verify their field strength and electromagnetic frequency before doing any testing on participants. Additionally, some researchers advise against including pregnant women in empirical studies using EMA (Hoole & Nguyen, 1999; Stone, 2010) as the effect of the magnetic field is not entirely clear and it is better to err on the side of caution.

Participants
Due to the high time demands of the method (including long participant preparation times as well as data processing and analysis steps), EMA studies frequently limit their number of participants. Our literature review (see description below) showed that around 75% of studies published in journals included ten participants or fewer; around 46% included five participants or fewer. This is also in line with Kochetov (2020), who reported the median number of participants in an EMA study to be five. Early studies (e.g., those published before 2003) often included only one or two participants, and it was not uncommon for one of the authors to be a participant. With EMA's increasing popularity, however, there has also been an increase in the number of studies with more participants, with the largest participant samples including around 50 participants (e.g., Schötz, Frid, & Löfqvist, 2013, N = 50; Cheng, Murdoch, Goozée, & Scott, 2007, N = 48; Wieling et al., 2016, N = 48).
In general, most participants tested with EMA are healthy adults (around 80% of the studies). Nevertheless, several studies have tested children from five years of age onwards (e.g., Katz & Bharadwaj, 2001; Cheng et al., 2007; Schötz et al., 2013), giving important insights into the development of individual articulators during the process of early speech acquisition. Articulographs have also frequently been used to study disordered speech in individuals suffering from various conditions that can impact speech production and/or speech motor control, ranging from speech disorders such as stuttering and cluttering (Didirkova & Hirsch, 2019; McClean, Tasko, & Runyan, 2004; Hartinger & Mooshammer, 2008) or apraxia of speech (e.g., Bartle-Meyer, Goozée, & Murdoch, 2009; Nijland, Maassen, Hulstijn, & Peters, 2004), through hypokinetic dysarthria (e.g., Kearney et al., 2018; Mefferd & Dietrich, 2019) and Amyotrophic Lateral Sclerosis (e.g., Lee & Bell, 2018; Shellikeri et al., 2016), to congenital conditions such as cleft lip (e.g., van Lieshout, Rutjes, & Spauwen, 2002) or congenital blindness (e.g., Trudeau-Fisette, Tiede, & Ménard, 2017). Using EMA to study disordered speech (more studies can be found in the Appendix) is important to provide insight into the underlying issues of speech motor control that cannot be detected through acoustics only. However, as a method, EMA can also be more fatiguing, and researchers should thus distinguish between what they can and should ask of their participants (Gibbon, 2008; van Lieshout, 2007; see below).

Literature review
Section 3 of the paper is intended as a review and discussion of the prevalent trends in EMA data collection of the past three decades. To identify these practices and trends, we performed a systematic literature review. 4 Using Google Scholar, we collected journal publications, conference proceedings papers, and other academic writings published between 1987 and 2019 by employing the search terms 'articulography,' 'articulograph,' 'articulometry,' and 'articulometer.' We excluded publications that were less than four pages long, publications that did not describe participant studies (e.g., because the authors used an existing database, focused on a new analysis procedure or assessed the more technical aspects of the EMA such as device accuracy), and publications that were written in languages other than English. 5 These search criteria led to 905 identified publications, which likely encompass the large majority of published works utilizing articulographs. The collection should thus provide a representative overview of EMA data collection procedures. The present review considers 412 journal publications, 413 conference papers, and 80 other writings (most frequently doctoral dissertations).
During the reviewing process, we identified the following parameters: type of EMA device used, number of participants, population, total number of sensors, number of tongue sensors, sensor placement, sensor preparation, and adhesive used for sensor placement. Not all publications reported all information. For example, while most publications mention the device type (especially after several manufacturers started producing articulographs) and number of sensors, few of them mention the adhesive in use.
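Because not every publication reports every parameter, any tabulation of these parameters has to handle missing values explicitly. The following minimal Python sketch shows one possible way to store a reviewed publication and to compute a share (such as the proportion of studies with ten participants or fewer) over only those records that report the relevant value. The class name, field names, and the example records are hypothetical and are not taken from the Appendix.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StudyRecord:
    """One reviewed publication; None marks a parameter the paper did not report."""
    device: Optional[str] = None
    n_participants: Optional[int] = None
    population: Optional[str] = None
    n_sensors_total: Optional[int] = None
    n_tongue_sensors: Optional[int] = None
    sensor_placement: Optional[str] = None
    sensor_preparation: Optional[str] = None
    adhesive: Optional[str] = None


def share_with_at_most(records: Sequence[StudyRecord],
                       max_participants: int) -> Optional[float]:
    """Share of records with at most `max_participants`, ignoring unreported counts."""
    reported = [r.n_participants for r in records if r.n_participants is not None]
    if not reported:
        return None  # nothing to compute a share from
    return sum(n <= max_participants for n in reported) / len(reported)


# Made-up records for illustration only; they are not entries from the Appendix.
toy_records = [
    StudyRecord(device="AG501", n_participants=3, n_tongue_sensors=3),
    StudyRecord(device="NDI Wave", n_participants=8, adhesive="PeriAcryl 90HV"),
    StudyRecord(device="AG500", n_participants=12),
    StudyRecord(device="AG100"),  # participant count not reported
    StudyRecord(n_participants=50, population="children"),
]

print(share_with_at_most(toy_records, 10))
```

Keeping unreported values as `None`, rather than as zero or an empty string, makes it explicit that each percentage is computed over the publications that actually report the parameter in question.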
In the Appendix, we have provided a table with all identified studies. Please note that for this paper, we have analyzed the trends and practices based on journal publications only (N = 412). This prevents us from counting the same study multiple times, because studies described in journal publications have often already been presented at one or more conferences but are rarely published in more than one journal.
4 Our literature review underwent three separate stages, going from 247 publications (first draft) to 626 publications (second draft) and finally to 905 publications (final publication). For the first draft of this paper, we collected publications from five international peer-reviewed journals (namely the Journal of Laboratory Phonology; The Journal of the Acoustical Society of America; the Journal of Phonetics; the Journal of Speech, Language, and Hearing Research; and Clinical Linguistics and Phonetics) as well as conference abstracts from the International Congress of the Phonetic Sciences, which led us to identify 247 publications. On the basis of reviewers' comments, we decided to perform a more extensive literature review for the second draft of the paper. We used the search terms 'electromagnetic articulography' and 'electromagnetic midsagittal articulography' on Google Scholar, which led us to identify 626 publications. In the second round of revisions, however, a reviewer (justly) pointed out that 'articulometry' is a frequent term that should be included. We therefore finally used the search terms described in Section 3 of this paper (namely, 'articulography,' 'articulograph,' 'articulometry,' and 'articulometer'), excluding the search terms we had looked for previously for the second draft. We did not discard any publications at any stage of the process.
5 As electromagnetic articulography was pioneered in Germany, many early papers are written in German.

Data collection practices
To draw valid conclusions about speech kinematics and speech motor control based on EMA data, it is necessary to ensure between-subjects and between-studies comparability. On the one hand, it is important to correctly place EMA sensors on the speech articulators depending on the specific goals of the study and to optimize sensor adhesion time to ensure cross-trial comparability (after re-attachment, a sensor might not be in the exact same position as before). On the other hand, it is necessary to make the experimental procedure as comfortable as possible for participants while not impeding scientific accuracy.
In the sections below, we lean on our literature review to report some general information on sensor placement, followed by information on certain anatomical considerations that might result in a different sensor attachment strategy, and finally information on the placement and preparation of specific sensor categories (including reference sensors, jaw movement sensors, tongue sensors, and lip sensors).
At this point, we would like to emphasize that most authors follow a certain template when reporting on their EMA study. Such a template is usually of the form:
Articulatory data was collected using [device name, device manufacturer] at a sampling rate of [sampling rate, often 100, 200 or 400 Hz]. Acoustic data was simultaneously collected using [microphone device] at [sampling frequency, often 16 kHz].
[Number] sensors were attached to the tongue, lips and jaw using the non-toxic adhesive [name adhesive]. Specifically, [number] sensors were affixed to the tongue: one on the tongue tip, [location, often "about 1 cm from the anatomical tip"], one on the back of the tongue [location, often "as far back as comfortable"], and one [location, with three sensors often "midway between the tongue tip and tongue back sensor"]. One sensor affixed to [location, often the lower incisor] tracked jaw movements and two sensors were placed on the vermillion border of the upper and lower lips. [Number] reference sensors were additionally placed on [location, often the left and right mastoid, nasion and/or upper incisor] to correct for head movement. A recording of the bite plane was made using [description of the process] and a palate trace was made [description of the process].
In the following sections, we discuss the variables that are indicated in this template in bold. Some of the other parts (such as devices and sampling rates) have already been discussed above. Finally, the following sections do not provide information on the EMA data analysis process: The reader is directed to consult Gafos, Kirov, and Shaw (2010) who provided guidelines for using mview, the frequently-used EMA data analysis programme developed by Mark Tiede at Haskins Laboratories (Tiede, 2005); Hoole (2012) who provides a tutorial on his software for processing AG500/AG501 data; and Kolb (2015) who details some other existing software tools and analysis methods. A tutorial on how to analyze EMA data using non-linear regression techniques is provided by Wieling (2018).

General sensor placement information
Articulographs can be used to study the behaviour of both extraoral (i.e., the lips and the jaw) and intraoral (i.e., the tongue) articulators. The exact choice of sensors depends on several factors, including the studied population (clinical versus healthy, see below; impacts the number of intraoral sensors) and the sounds that are to be investigated (e.g., apical versus lateral; impacts sensor placement). Researcher preference also plays a role: Some prefer to adhere the minimum number of sensors (to decrease the time necessary for participant preparation), while others prefer to adhere more sensors (to collect additional data, using it to answer more research questions). With few exceptions, sensors are placed midsagittally.
The number of intraoral sensors is an important consideration in EMA studies. On the one hand, having more sensors on the tongue allows the tracking of more points and thus yields a better picture of the movement of the tongue. On the other hand, when also including the intraoral jaw movement sensor and reference sensor on the lower and upper incisors, respectively, speakers frequently have five or more wires in their mouth. This may lead to discomfort and affect participants' speech. More tongue sensors are especially problematic where sensitive populations are concerned. These individuals may be more prone to fatigue (e.g., Friedman et al., 2007, on fatigue in PD patients), more likely to drool (Reddihough & Johnson, 1999), and may find it more difficult to stick out their tongue or open their mouth. Furthermore, their speech is more likely to be impeded by a foreign object in their oral cavity. Children have smaller tongues, salivate more, and need more frequent toilet visits, which necessitates shorter experimental procedures, including shorter preparation times. When testing children and patients, researchers therefore often opt for only two tongue sensors (tongue tip and tongue back) in addition to the intraoral jaw movement sensor and the intraoral reference sensor.
While the exact sensor placement depends on the study, there are some typical sensor placements. These are depicted in Figure 2, which shows movement sensors used to track the movement of articulators (red dots; including the lips, jaw, and tongue) and reference sensors, placed on orofacial structures that do not move during speech production (green dots; including both mastoids, the nasion, and upper incisor). More details on individual sensor categories are provided below.
After all sensors have been placed, a biteplate 6 recording can be made with a biteplate object that has several sensors attached to it (see Figure 9 in Section 4.2 for a picture of our lab's biteplate with three sensors). The object is placed between the participant's teeth and a recording is made to obtain the relative orientation of the sensors on the biteplate compared to the reference sensors. This information is then used to rotate the acquired sensor movement data (of the sensors attached to the articulators) to a comparable occlusal plane per participant (Westbury, 1994). Finally, palate trace recordings are made, where a sensor is used to trace the palate across the occlusal plane, providing an estimate of the shape of participants' oral cavity (see Neufeld & van Lieshout, 2014, for a description of how EMA sensors can be used to construct a 3D model of the hard palate).
6 Researchers refer to both 'biteplate' and 'biteplane' recordings.
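The biteplate-based rotation described above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are ours rather than taken from any particular lab's procedure: the biteplate carries one sensor at the front on the midline and two at the back (left and right), all positions have already been corrected for head movement using the reference sensors, and the target axes are x pointing forward, y pointing to the participant's left, and z pointing up. All function names are hypothetical.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("degenerate biteplate geometry")
    return tuple(x / norm for x in v)


def occlusal_frame(front, back_left, back_right):
    """Build an origin and orthonormal axes from three biteplate sensor positions."""
    back_mid = tuple((l + r) / 2 for l, r in zip(back_left, back_right))
    x_axis = unit(sub(front, back_mid))    # back to front
    lateral = sub(back_left, back_right)   # right to left (not yet orthogonal to x)
    z_axis = unit(cross(x_axis, lateral))  # normal to the biteplate plane
    y_axis = cross(z_axis, x_axis)         # completes a right-handed frame
    return front, (x_axis, y_axis, z_axis)


def to_occlusal(point, origin, axes):
    """Express a head-corrected sensor position in the occlusal frame."""
    rel = sub(point, origin)
    return tuple(dot(rel, axis) for axis in axes)


# Hypothetical biteplate sensor positions (mm) in device coordinates.
origin, axes = occlusal_frame(front=(0.0, 10.0, 0.0),
                              back_left=(-5.0, -10.0, 0.0),
                              back_right=(5.0, -10.0, 0.0))

tongue_tip = (3.0, 12.0, 4.0)  # hypothetical articulator sample
print(to_occlusal(tongue_tip, origin, axes))
```

After the transformation, all three biteplate sensors have a z-coordinate of zero, so that z can be read as height relative to the occlusal plane. Applying the same transformation to every sample of every articulator sensor is what makes positions comparable across participants in this sketch.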
The time it takes for all sensors to be placed varies. Earnest and Max (2003), for example, state that it can take anywhere between 30 and 60 minutes. This time can be reduced depending on the device, the number of sensors, and their placement. Before starting the experiment, researchers additionally allow some time for the participants to adjust to the sensors. A study by Dromey et al. (2018), who tested sensor habituation, found that after ten minutes, participants reached a level of habituation to the sensors that did not improve even if the habituation stage lasted longer. In general, if researchers include a sensor habituation stage, it is most often 5-10 minutes of informal conversation (e.g., Katz, Mehta, & Wood, 2018; Goozée et al., 2007).
Several brands of adhesive can be used to adhere the sensors. The Carstens website recommends Epiglu (Meyer Haake GmbH), whereas NDI does not give any adhesive recommendations on their website. Other popular adhesives include PeriAcryl®90HV (Glustitch), Isodent cyanoacrylate adhesive (Ellman International), Cyano Veneer Fast (Scheu Dental Technology), Cyanodent (Ellman International), Histoacryl (B. Braun), and Aron Alpha (Toagosei). Note that the Isodent and Cyanodent adhesives appear to be discontinued 7, and Cyano Veneer Fast has not renewed its medical certification, while the intraoral use of Histoacryl may be problematic due to potential cytotoxic effects (Schneider & Otto, 2012). PeriAcryl®90HV has been used most often in recent years.
What these adhesives (except for Histoacryl; Schneider & Otto, 2012) have in common is that they are intended for oral tissue (e.g., for use in dental or oral surgery), are biologically safe, and relatively viscous. Dental cements, including Ketac™, Durelon, and Fuji, have also been used by several labs to attach tongue sensors (e.g., Mooshammer, Hoole, & Geumann, 2006;Tabain, 2003;Steele & van Lieshout, 2004), but are more invasive, as they involve covering the tongue dorsum with a hard substance. Dental cement also causes faster deterioration of sensors and leads to participant discomfort. However, it does have the benefit of making sensors adhere to the tongue for a longer period of time (e.g., Ball, Gracco, & Stone, 2001, state that the sensors remain firmly attached to the tongue surface for over 90 minutes).

Tongue anatomy
The tongue is a highly mobile and muscled articulator, responsible for speech, mastication, and deglutition. For the purposes of speech production, there are two potential ways of defining parts of the tongue: the anatomical perspective (see note 8 for details) and the functional perspective, which defines the tongue in terms of functions that different parts serve in the process of speech motor control, and is thus directly relevant to EMA data collection. Following Ladefoged and Maddieson (1996, Ch. 2), the tongue consists of the tongue tip (Figure 3-1), tongue blade (just behind the tip), tongue body (Figure 3-2), and tongue root (Figure 3-3). The tip of the tongue starts parallel to the surfaces of the incisors and extends to cover a small area about 2 mm wide on the upper surface of the tongue at rest. The blade of the tongue is the part that starts behind the tongue tip and extends to 2 mm behind the point of the tongue that is located below the center of the alveolar ridge (i.e., the point of the maximum slope). Sounds made with the tongue tip are said to be apical while those made with the tongue blade are said to be laminal. When discussing sensor placement, we refer to the sensor adhered to this most anterior part of the tongue (encompassing both the tip and the blade) as the 'tongue tip' sensor (Figure 3-1).
The tongue body (Figure 3-2) is the mass of tongue behind the blade and can roughly be divided into tongue body front (below the hard palate) and tongue body back (below the velum). Sounds that are produced with this part of the tongue are dorsal. When discussing sensor placement, we refer to sensors placed on the tongue body as either 'tongue mid' or 'tongue back,' depending on how close to the tongue root the sensor is. Unless specified differently, all sensors are placed along the midline of the tongue, i.e., the median sulcus, which divides the tongue into the left and right parts.
Finally, regarding the tongue parts that are not easily accessible for sensor placement and EMA measurements, the tongue root is found behind the tongue body (Figure 3-3), in the oropharynx, together with the epiglottis. Tracking tongue root movements with an EMA sensor is generally not feasible because of the gag reflex.
Depending on the target sounds and/or phenomena being studied, different sensors are used (see Table 1 for some common sounds and corresponding sensors). In all cases, it is presumed that reference sensors (most frequently on the nasion, upper incisor, and both mastoids) are additionally being used. Note that the table only shows a limited subset of sounds that have been studied with EMA. Importantly, Yunusova, Rosenthal, Rudy, Baljko, and Sakalogiannakis (2012) describe which lingual sounds can be distinguished using articulography, and state that consonants cannot be distinguished on the basis of only one characteristic, such as the tongue position measured with a single sensor, as more dimensions are needed (e.g., also lip sensors).
Tongue shapes vary vastly from one individual to the next (King & Parent, 2001; Kullaa-Mikkonen, Mikkonen, & Kotilainen, 1982). For example, some individuals may have a more fissured tongue with more grooving than others, which makes sensor adhesion directly to the median sulcus more difficult. Regarding tongue anatomy, several factors should be considered, including age (namely, adults have a longer tongue than children; Vorperian et al., 2005), body weight (namely, tongue muscle volume positively correlates with body weight; Stone et al., 2018), and gender. The effects of the latter are less clear, as some studies have shown that men have significantly larger tongue breadth and volume (Oliver & Evans, 1986; Mahne et al., 2007), while others failed to find such an effect, even though men do usually have a larger bony structure (Hopkin, 1967). Additionally, tongue rhythm and velocity correlate with age (movements are slower and more irregular in the elderly; Hirai et al., 1989). Finally, different types of tongue movements exist, from hollowing and grooving to pulling back, tipping, heaping, and bunching (Hiiemae & Palmer, 2003), which impacts the production of different sounds.

Hard palate, salivary flow rates, and gingival tissue
Aside from considerations related to the tongue itself, restrictions posed by the rest of the oral cavity have to be taken into account when placing intraoral sensors. Particularly relevant in this regard are the hard palate, gingival tissue, and salivary flow rates. Differences between speakers occur in the height, length, slope, width, and curvature of the hard palate (e.g., Brunner, Fuchs, & Perrier, 2009; Rudy & Yunusova, 2013; Lammert, Proctor, & Narayanan, 2018). These differences in palate shape are also responsible for variability in speech production. When comparing the speech produced by individuals with flat, domed, or regular palates, it has been hypothesized that speakers with flat palates have more precise articulations because that is the only way to maintain acoustic consistency (Bakst & Johnson, 2018; Brunner et al., 2009). Furthermore, palatal morphology can also account for some variability in tongue positioning (Rudy & Yunusova, 2013).
Other anatomical considerations include the production of saliva and gingival tissue. Salivary flow rates (i.e., the quantity of saliva) differ greatly across healthy individuals (Whelton, 2012). This may substantially influence how well intraoral sensors adhere to the tongue and incisors, as the usual cyanoacrylate adhesives (see description of adhesives above) polymerize after coming into contact with saliva. Moreover, the production of saliva is heavily influenced by external factors, such as degree of hydration or circadian rhythm, but also by minor factors including gender, age, and body weight (Whelton, 2012). Specifically, men salivate more than women (Inoue et al., 2006), elderly adults salivate less than middle-aged adults (Navazesh, Mulligan, Kipnis, Denny, P. A., & Denny, P. C., 1992), and individuals with a higher body mass index have a lower salivary flow rate (Flink, Bergdahl, Tegelberg, Rosenblad, & Lagerlöf, 2008).
Finally, especially relevant for the attachment of the intraoral jaw-movement and reference sensors, which are usually positioned on or close to the lower and upper incisors, is the amount of gingival tissue above and below the incisors. These two (lower and upper incisor) sensors can be more easily placed when the speaker has a larger gingival surface above and below the incisors. For speakers with a small gingival surface, or for speakers who have a prominent labial frenulum, an alternative sensor placement plan may be considered (e.g., on the chin, which is non-ideal due to skin movement, or directly on the incisors as opposed to the gingival tissue).

Use and positioning
During the post-processing stage of EMA data, positional data from the reference sensors is used to correct for deviations in head position relative to a consistent reference position, which is usually the occlusal plane. The reference sensors are usually placed as far apart as possible (to minimize the effect of noise on the position estimation of individual sensors) on bony structures with least skin movement, including the nasion (N), mastoid processes (i.e., on the bone behind both ears; ML and MR), and the gingival tissue of upper central or lateral incisors (UI). Our literature review shows that older studies predominantly included two reference sensors placed in the midsagittal plane (i.e., on the nasion and upper incisor), while newer studies often include more.
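As a rough illustration of the head-correction idea, the following Python sketch handles only the simplified two-dimensional case with two midsagittal reference sensors (nasion and upper incisor). It is not the procedure of any particular EMA toolbox: the function name `correct_head_2d`, the example coordinates, and the choice to align each frame to a single reference frame are assumptions made for illustration. With three or more reference sensors, a full three-dimensional rigid-body fit would be used instead.

```python
import math

def correct_head_2d(n, ui, point, n_ref, ui_ref):
    """Map `point` from the current head pose into the reference head pose.

    n, ui         -- (x, y) of the nasion and upper-incisor sensors in this frame
    point         -- (x, y) of a movement sensor in this frame
    n_ref, ui_ref -- (x, y) of the same reference sensors in the reference frame
    All names and the 2D simplification are illustrative assumptions.
    """
    # Head rotation = angle between the current and reference N->UI vectors.
    ang_now = math.atan2(ui[1] - n[1], ui[0] - n[0])
    ang_ref = math.atan2(ui_ref[1] - n_ref[1], ui_ref[0] - n_ref[0])
    theta = ang_ref - ang_now
    c, s = math.cos(theta), math.sin(theta)
    # Express the point relative to the nasion, undo the head rotation,
    # then re-attach it to the reference nasion position.
    dx, dy = point[0] - n[0], point[1] - n[1]
    return (n_ref[0] + c * dx - s * dy, n_ref[1] + s * dx + c * dy)

# Hypothetical midsagittal coordinates in mm; here the head has only
# translated, so the correction simply removes that translation.
corrected = correct_head_2d(n=(2.0, 1.0), ui=(12.0, -59.0),
                            point=(-20.0, -70.0),
                            n_ref=(0.0, 0.0), ui_ref=(10.0, -60.0))
```

Because the rotation is estimated from the N-to-UI vector alone, noise in either reference sensor propagates directly into the estimated angle, which is one reason for placing reference sensors far apart.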
While reference sensors are usually similar in architecture to movement sensors (i.e., capturing five degrees of freedom, hereinafter 5DOF), NDI has additionally developed a (two-channel) 6DOF sensor in which two 5DOF sensors are integrated at a specific distance and relative orientation. If a 6DOF sensor is used, it is usually attached to the forehead, and automatically corrects the data of the other sensors for the head movements (measured via the 6DOF sensor). While it is convenient to use only one reference sensor, the potential for noise (induced by skin movement) is greater in comparison to the more commonly used three-sensor setup as discussed above.

Preparation and adhesion
Reference sensors are prepared differently depending on where they are being placed. Those placed on extraoral structures (i.e., the nasion and mastoid sensors) are generally taped using medical tape. They need to be taped firmly to prevent movement; a small drop of adhesive can additionally be added to achieve this. They can also be coated in latex to make disinfection after the experimental session easier and to prolong sensor longevity. The intraoral reference sensor is usually placed on the gingiva above the upper central or lateral incisors. Section 3.6.2 provides more information on preparing the intraoral incisor reference sensor.
The reference sensors can alternatively be prepared and placed on a pair of goggles, on the frame of a pair of plastic glasses, or on a headband (e.g., Ji, Berry, & Johnson, 2013; Mefferd, 2019; Thompson & Kim, 2019; Kearney et al., 2018). The Appendix shows additional information regarding individual researchers' strategies to place reference sensors.

Use and positioning
Tongue sensors are used to track tongue movements and investigate the production of a wide range of sounds, from alveolar stops (with a tongue tip sensor) to velars (with a tongue back sensor). Sensors are placed midsagittally unless the researcher wishes to specifically study lateral sounds, in which case one or two sensors may be added on the lateral parts of the tongue.
Concerning tongue sensors, 375 journal studies (out of 412 in total) explicitly mention the number and/or positioning of tongue sensors (as opposed to, e.g., only generally mentioning that they used tongue sensors). A total of 41 out of 375 studies (11%) use one tongue sensor, 90 studies use two tongue sensors (24%), 165 studies use three tongue sensors (44%), 70 studies use four tongue sensors (19%), and nine studies use five tongue sensors or more (2%). Either two or three sensors on the tongue are thus the most frequent choice, bringing the total number of intraoral sensors to four or five (including the reference sensor on the upper incisors and a jaw-movement sensor on the lower incisors).
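The proportions reported above can be re-derived from the raw counts. The short Python check below is only a convenience sketch; the helper name `percent_shares` and the category labels are our own, and the counts are the ones given in the text.

```python
def percent_shares(counts):
    """Return the total and the share of each category, rounded to whole percent."""
    total = sum(counts.values())
    return total, {k: round(100 * v / total) for k, v in counts.items()}

# Number of journal studies by number of tongue sensors used (from the text).
tongue_sensor_counts = {"1": 41, "2": 90, "3": 165, "4": 70, "5+": 9}
total, shares = percent_shares(tongue_sensor_counts)
# total is 375; shares are 11, 24, 44, 19, and 2 percent, respectively.
```

The counts add up to the 375 studies that explicitly report the number and/or positioning of tongue sensors, and the rounded shares match the percentages given in the text.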
If three sensors are used, they are usually placed on the tongue tip (TT), tongue middle (TM), and tongue back (TB) along the tongue's median sulcus. When three sensors are used, there are two main approaches to dividing the tongue dorsum: either by placing TT and TB according to a predetermined measurement strategy or by spacing the sensors equidistantly (see below and also Table 2).
In their placement of the TT sensor, most researchers provide a measurement, with 'approximately 1 cm' from anatomical tongue tip as the most popular choice (note that the sensor cannot be placed directly on the tip because it would interfere significantly with speech production and fall off quickly). Keeping in mind the functional perspective on tongue anatomy, this means that the 'tongue tip' sensor is in fact placed on the tongue blade as opposed to the tongue tip. The exact method of measurement (i.e., by ruler, calliper, or simply 'eyeballing') is mostly left unspecified. Furthermore, with a few exceptions, it is not indicated whether the measurements were performed with the tongue comfortably extended, stretched out, or at rest inside the mouth.
Regarding the placement of the TB and TM sensors, strategies vary to a greater extent than the strategies for the TT sensor. Some researchers decide on a specific measurement, e.g., by placing TB and TM sensors with 2 cm of space in between each sensor or by placing the TB sensor 4-5 cm from the TT sensor, with the TM sensor in between the two. Others decide to place the TB sensor 'as far back as possible' and the TM sensor in between. If two TM sensors are used, they are most often defined as being placed equidistantly between the TT and TB sensors.
Few studies use lateral sensors (exceptions include Katz, Mehta, & Wood, 2017; Thibeault, Ménard, Baum, Richard, & McFarland, 2011; see the Appendix for a full list of studies using tongue lateral sensors). If lateral sensors are used, they are most often placed to the side of the TM sensor, about 1 cm from the tongue edge. Table 2 provides an overview of the most common strategies for tongue sensor placement as well as their usage frequency in our literature review. The main strategy for each sensor type is highlighted in bold. In total, 273 out of 375 studies explicitly defined the position of at least one tongue sensor. For more details on which researchers use which strategy, the reader is invited to consult the 'tongue sensors' tab in the Appendix.
While not strictly in the purview of this literature review, we would like to mention two recent publications that proposed more data-driven approaches to sensor placement. First, Patem, Illa, Afshan, and Ghosh (2018) used dynamic programming to determine optimal sensor placement for the sounds of American English based on rtMRI video frames of the vocal tract. Based on data of four participants (two male, two female), they determined that the optimal placement for three tongue sensors is to place the tongue tip sensor at 19.93 ± 11.45 mm from tongue base (see note 9), the tongue middle sensor at 38.2 ± 11.52 mm from the tongue tip sensor, and the tongue back sensor at 80.51 ± 13.51 mm from the tongue tip sensor.
These measurements are informative for the four participants examined; however, in practice it would be difficult to measure a participant's tongue in such detail and difficult to find participants for whom such measurements would be suitable (e.g., placing a tongue back sensor at 8 cm from the tongue tip sensor is often not practically possible due to limited tongue length; Patem and colleagues themselves state that they did not consider the level of discomfort in determining optimal sensor locations). Furthermore, it is not possible to accurately determine the tongue base without access to MRI, and the confidence intervals of the presented optimal placements are rather large.
Second, Wang, Samal, Rong, and Green (2016) used machine learning to determine an optimal set of points needed for classifying speech movements. They determined that for classifying most sounds (including both vowels and consonants), a set of four sensors (tongue tip, tongue back, upper lip, and lower lip) suffices. This is especially informative when studying the speech of clinical populations, since in those circumstances it is often desirable to use the minimal number of sensors to limit the burden on the participants.

Preparation and adhesion
Few studies mention the preparation of tongue sensors prior to placement. However, no conclusions can be drawn from this, as some researchers might simply not mention the specifics of sensor preparation due to manuscript length limitations or a perceived lack of interest from the readers. We could nonetheless identify some tongue sensor preparation options. Note that the tongue itself is also often 'prepared,' as it is dried to improve sensor adhesion (see also Section 4 for our drying procedure). First, some researchers adhere the sensors to the tongue without any preparation (i.e., using bare or out-of-the-box sensors).
Another option is to coat the sensors in latex before adhesion, a frequently used approach (Earnest & Max, 2003). This method is suggested on the website of the Carstens articulograph (Electromagnetic Articulograph, 2019), where it is indicated that Plasty late latex milk (Glorex GmbH) is a suitable product for coating the sensors. The latex coating, they report, keeps the sensors clean and without glue residue. In their Carstens AG500 Manual (2006) they additionally state, under the 'Cleaning and disinfection of sensors' section, that coating the sensors in latex is recommended, as the latex can simply be peeled off after testing. According to Carstens, sensors can (and, if possible, should) also be coated in latex for use on other facial surfaces, not just the tongue, as this increases sterility and sensor longevity. Latex coating should also increase the longevity of (reusable) NDI Vox sensors (NDI, personal communication).
The third approach for preparing tongue sensors consists of increasing the sensor size to increase the adhesion surface and thereby potentially increasing the sensor adhesion duration. This can be done, for example, by placing small pieces of silk between the sensor and lingual surfaces (e.g., Ji et al., 2013; Goozée, Murdoch, Theodoros, & Stokes, 2000; Fuchs, 2005), gluing a small transparent layer of plastic to the bottom of the sensors (e.g., Wieling, Veenstra, Adank, Weber, & Tiede, 2015), or covering the head of the sensors with a small, thin flap of latex (our approach; see Section 4).
We carried out a sensor-adhesion experiment to compare these three approaches for tongue sensor adhesion. This experiment is reported on in Section 5.

Jaw-movement sensors
3.6.1. Use and positioning
Jaw movements can be tracked with either an intraoral sensor adhered to the lower incisors or an extraoral sensor adhered to the chin. The former is preferred, as the position of the chin sensor may also be affected by skin movement during speaking. Of the 286 studies that use a sensor to track jaw movement, 214 (75%) use a sensor on (or near) the lower incisors, compared to 72 (25%) that use a sensor on the chin. However, note that there are also differences in the placement of incisor sensors: while most researchers refer to placement on 'incisors,' only a few place the sensor on the incisors themselves (i.e., on the teeth). Most place the sensor on the gingival tissue below the incisors.
Most studies use only one jaw movement sensor. However, some have also used several (e.g., Wang et al., 2016, who placed three sensors on the jaw; Mooshammer, Tiede, Shattuck-Hufnagel, & Goldstein, 2019, who placed two sensors on the lower gumline, one below the front incisors and one below the left premolar; Mefferd, 2017, who placed three sensors on the lower gumline; or Mooshammer, Hoole, & Geumann, 2007, who placed two sensors on the outer and inner surface of the lower gumline and one sensor on the chin). Note that even with a single sensor, jaw movements can easily be tracked but are often hard to decouple from tongue and lower lip movement (e.g., Henriques & van Lieshout, 2013), as components of jaw movements are also present in tongue and lip movements. Furthermore, as the jaw is a rigid body, at least two 5DOF sensors are necessary to correctly track its orientation relative to the head.

Preparation and adhesion
If the jaw-movement sensor is placed extraorally, most frequently on the chin, no special preparation is mentioned in the reviewed studies (although the sensors can be coated in latex to increase sterility and longevity). In contrast, our literature review revealed several methods of preparing an intraoral jaw sensor (and the intraoral reference sensor). These methods include using the same dental adhesive as on the tongue, creating a custom dental mould of the incisor to which the sensor is adhered (e.g., Steele & van Lieshout, 2004; Steele, van Lieshout, & Pelletier, 2012), or adhering the sensor to a piece of Stomahesive wafer (e.g., Mefferd, 2017; Berry, Kolb, Schroeder, & Johnson, 2017; Dromey et al., 2018). The latter approach, using Stomahesive, increases the surface of the sensor as well as its adhesion to the participant's gingival tissue due to the nature of the material. As this is the method used in our lab, there are further details on the preparation of Stomahesive-covered sensors in Section 4.

Lip sensors
Use and positioning
Lip sensors are generally placed on the vermillion border of the upper and lower lips. Data from these sensor positions allow the estimation of phonetically relevant variations in lip aperture or lip protrusion (e.g., in the production of bilabial stops as compared to fricatives, or of rounded as compared to unrounded vowels). In some cases, such as when a study focuses on lip movements specifically, more lip sensors are attached, namely at the right and/or left lip corners (e.g., Meenakshi & Ghosh, 2018; Rong et al., 2012; Cler, Lee, Mittelman, Stepp, & Bohland, 2017).

Preparation and adhesion
Lip sensors can be bare or coated with latex (to increase hygiene and longevity, as these sensors come in contact with saliva). If more than two lip sensors are used, latex-coated sensors are likely to affect articulation because of their larger size. Most often, lip sensors are adhered with a piece of tape. To increase adhesiveness, a small drop of adhesive can additionally be added, which ensures that the sensors are firmly adhered for the duration of the experiment. This is especially important if the medical tape does not stick adequately (e.g., due to the participant's sweat or repeated large labial movements in stimuli targeting plosives).

EMA data collection in practice: A suggested procedure
In Section 4 of the paper, we provide a practical description of the data collection procedure employed in our lab at the University of Groningen. Our approach is only one of the many possible strategies available to researchers who collect speech production data with EMA, as was also illustrated in the previous part. The description includes all the details that are important but are often omitted from publications.

Preparation of the sensors using latex
In the procedure used in our lab, all sensors are prepared at least half a day before the experiment. In this preparation stage, we distinguish between three types of sensors: (1) the extraoral sensors (identified below as MR, ML, N, UL, and LL) plus the sensors attached to the tongue (TM and TT), except for the most posterior tongue sensor, (2) the most posterior tongue sensor (TB), and (3) the sensors attached close to the incisors on the upper and lower gums (UI and LI). We check the sensors for any visible defects (e.g., broken wire) before using them. The first group of sensors is prepared by dipping each of them in mask-making latex (RD 407 Mask Making Latex, Monster Makers). The TB sensor is prepared similarly but with an additional latex flap (see Section 5.1), which increases the surface of the sensor and may be beneficial for the adhesion duration (see Section 5.5). Finally, the UI and LI sensors are prepared using a Stomahesive wafer (ConvaTec PLC). A small rectangular piece of Stomahesive, measuring about 10 mm × 6 mm, is cut. The sensor is placed on top of this piece and a drop of latex is applied to it in order to make it adhere (Figure 4, left and right). The early preparation phase is necessary, as the latex takes several hours to completely dry. However, the sensors should not be prepared too early (e.g., a week in advance), as the latex becomes less flexible with time and more difficult to remove. In case of re-use, we disinfect sensors first using SPORECLEAR Medical Device Disinfectant (Hu-Friedy Mfg. Co., LLC) and then wipe them with an alcohol wipe before storing them.

Preparation and attachment of reference sensors
After checking that participants are not pregnant, do not have a pacemaker, and do not have a latex allergy, our data collection procedure is as follows. All sensors are screwed into the miniature terminal blocks of the NDI Wave (or, in the case of the NDI Vox, plugged into the sensor harness assembly), wiped with an alcohol wipe, and placed on a sterilized tray a short time before the participant's arrival. We perform a sensor validation check by verifying that each sensor that is screwed in also functions as it should. Once participants arrive, we first ask them to take a disposable toothbrush and scrub their tongue (especially along the midline). They do this in front of a mirror, so that they are aware of how far back they are reaching and do not trigger their gag reflex. By scrubbing their tongue, they remove the coating that covers the tongue (the amount of coating differs per participant; see note 10). We subsequently ask the participant to remove jewellery, glasses, and hearing aids, when applicable, as they make sensor placement more difficult and could potentially interfere with the signal (as the presence of metal inside the magnetic field has a negative effect on the precision of the recovered sensor positions). The glasses and hearing aids are returned to the participant once the sensor placement is complete if their use is necessary for successful participation in the experiment.
We additionally ask participants whether they are wearing dentures, as these may move slightly during speaking, which could result in some wire pull for sensors placed on the gingival tissue. Since dentures cannot be removed without impeding articulation, we note their presence but otherwise do not ask the participant to remove them. Additionally, if possible, participants should shave before the experiment and avoid wearing makeup as this makes sensor placement more difficult.
Subsequently, the participant is asked to sit down next to the EMA field generator (we were using the NDI Wave system, but have very recently moved to using the NDI Vox system). We first place the four prepared reference sensors (ML, MR, N, and UI; see note 11). All sensors (reference and others) are first held in reverse action tweezers (Hobbycraft), as they make the application of sensors to the participant easier. The first three reference sensors are applied after the researcher has sterilized their hands using Sterilium® (Medline). Before placing any intraoral sensors, the researcher puts on (latex) dental gloves and a dental mask (see note 12). The mastoid sensors (ML and MR) are placed behind the participant's ears on the skin covering the mastoid part of the temporal bone, where there is minimal skin movement (Figure 5).
Note 10. Coffee, especially, leaves a brown coating on the tongue, which is not optimal for sensor placement.
Note 11. In principle, three (or even two) reference sensors are enough to correct head movement. However, we (as many other researchers) use one additional sensor as a backup in case one of the reference sensors malfunctions. We do not use the NDI 6DOF sensor (containing two sensors with a specific distance and orientation towards each other), which may be used to automatically correct for head movement, but use separate reference sensors instead, as it is beneficial to maximize the distance between the reference sensors to minimize the influence of noise from the reference sensors on the rotation.
The nasion sensor (N; Figure 6) is placed on the part where there is least skin creasing. If the participant is wearing glasses, the sensor is placed right above or below their glasses, depending on how big the frame is. The first three sensors are secured with a drop of glue. We use PeriAcryl®90 HV adhesive (GluStitch Inc), which is kept in the fridge (at ~2°C) until the participant's arrival. At that moment, two to three drops of adhesive are added to a small plastic mixing well (Maxill Inc.) after which the adhesive is returned to the fridge. A small disposable plastic pipette is used to transfer the adhesive from the mixing well to the sensor. The sensor wires are adhered to the participant using Leukopor or Leukosilk tape (BSN medical GmbH). A piece of tape is additionally placed over the ML and MR sensors to secure them (see tape in Figure 5). We add a piece of tape to the N sensor but place it slightly higher on the forehead (see tape in Figure 6), as it otherwise disturbs the participant's visual field.
Note 12. We use the dental mask for adults but often avoid it for children, as they do not yet have such a strong 'germ reflex' and we noticed it makes them feel uncomfortable.
The final reference sensor (UI), on top of the piece of Stomahesive, is attached to the gingiva above the left upper incisor. No glue is added to the Stomahesive, as it adheres to tissue by itself. We avoid placing any incisor sensors on the midsagittal line, directly above the central incisors, due to the labial frenulum, which connects the upper lip to the gingival tissue and is quite sensitive. The UI sensor placement relative to the labial frenulum can be seen in Figure 7.
After the reference sensors have been placed, the palate trace and biteplate recordings follow. These are crucial (particularly the biteplate recording) to ensure the subsequent quality of the collected data. For the palate trace, we adhere one spare sensor to the end of the participant's dominant thumb using Leukopor tape (so that the sensor wires are leading down the thumb and pointing towards the wrist) and instruct them to trace the thumb from the back of the hard palate to their front teeth. The purpose of this procedure as well as the tracing method are explained by means of a mouth puppet (Super Duper® Publications; Figure 8), which, due to its cartoonish look, is also useful in decreasing participants' potential anxiety. The palate trace is performed twice.
For the biteplate recording, we created a (reusable) fixed triangular protractor with three sensors glued to it (Figures 9 and 10). The same protractor is used for all participants; it is wiped with an alcohol wipe before every use and disinfected with SPORECLEAR Medical Device Disinfectant (Hu-Friedy Mfg. Co., LLC) after every use. The protractor is pushed as far back as comfortable into the corners of the participant's mouth. The participant is then asked to hold the protractor firmly between their teeth and sit still for a few seconds while the biteplate recording is made. The protractor must be in contact with the molars in order to obtain a true occlusal reference. We check the biteplate recording directly by comparing the Euclidean distances between all the reference sensors and the three sensors on the biteplate, using MATLAB (MathWorks Inc.). If these distances remain relatively constant over time, this indicates that the position of the reference sensors and the biteplate sensors is correctly tracked.
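The consistency check described above is reported as being run in MATLAB. The Python sketch below only illustrates the idea and is not that script: the data layout (one dictionary of sensor name to (x, y, z) position per sample), the function names, and the 1 mm tolerance are all assumptions made for illustration.

```python
import math
from itertools import product

def distance_ranges(frames, ref_names, bite_names):
    """For each (reference, biteplate) sensor pair, return max - min of their
    Euclidean distance over all frames of the biteplate recording."""
    ranges = {}
    for r, b in product(ref_names, bite_names):
        d = [math.dist(f[r], f[b]) for f in frames]
        ranges[(r, b)] = max(d) - min(d)
    return ranges

def biteplate_ok(frames, ref_names, bite_names, tol_mm=1.0):
    """True if every pairwise distance stays within tol_mm (assumed threshold)."""
    return all(v <= tol_mm
               for v in distance_ranges(frames, ref_names, bite_names).values())
```

A pair whose distance range clearly exceeds the others points to a sensor that moved or was poorly tracked during the recording, in which case the biteplate recording would be repeated.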

Attachment of movement sensors
After the palate trace and biteplate recordings, we proceed with attaching sensors to the articulators that we wish to capture. Most frequently, these sensors are the following (listed in the order of placement):
- tongue back (TB)
- tongue mid (TM)
- tongue tip (TT)
- lower incisor (LI)
- upper lip (UL)
- lower lip (LL)
To determine where to place the tongue back sensor, we use a colour transfer applicator stick (Dr. Thompson's, GUNZdental). We ask the participant to drag the stick along the midline of their hard palate (as they had done before with the palate trace sensor) and then pronounce the velar /k/, directly followed by sticking out their tongue (see note 13). They are asked not to swallow while their tongue is being marked. The colour from the applicator is transferred from the palate to the part of the tongue where the back-most (velar) sound is made. We use the same stick to draw a coronal line through this spot. Additionally, we use measuring tape to measure 1 cm from the tongue tip (when the tongue is stretched) and draw a coronal line through that point as well.
The coronal line enables us to always re-adhere the sensor to approximately the same position if it starts getting loose, as the point might become smudged through speaking and swallowing, but the line will remain clearly visible. Figure 11 below shows the coronal lines on the tongue left by the colour transfer applicator stick, with the median sulcus still clearly visible. The participant can now swallow as the coronal lines will remain clear, even when they come in contact with saliva. The participants are asked to stick out their tongue as far as comfortable. We place barber tape (Comair GmbH, folded three times to contain at least eight layers) on the back line marking on the participant's tongue, dab the tape on the tongue for about 5-10 seconds, and finally drag the tape across the tongue. This procedure dries the tongue dorsum and is crucial in ensuring that sensors do not fall off easily. We hold each sensor in the tweezers and add a drop of adhesive using a small plastic disposable pipette before placing the sensor on the tongue.
Note 13. This procedure is similar to the procedure used by Brunner, Hoole, and Perrier (2011b). However, we use the colour transfer applicator to mark the spot where the participant produces their /k/. Brunner et al. (2011b) used an oral disinfectant with a strong purple colouring agent and asked the participant to close their mouth and push their tongue (neutral position) against the hard palate. The colour mark was thus transferred to the tongue dorsum.
The TB sensor is placed on the crossing between the marked posterior line and the median sulcus, so that the wire of the sensor is pointing downward and towards the lip corner. A disposable wooden tongue depressor (Tegler) is used to press the sensor to the tongue for 10-20 seconds. The wire is then secured to the cheek using Leukopor tape. It is essential that the wires have enough slack, as large speech gestures may otherwise lead to wire tension, which is uncomfortable for the participant and may cause the sensor to come loose. The process is repeated for the TT sensor, which is placed on the crossing between the marked anterior line and the median sulcus. Note that the TT sensor is positioned in such a way that the wire is pointed towards the side of the tongue, as a wire running over the tongue tip feels uncomfortable for the participant and leads to lisping (Hoole & Nguyen, 1999).
The tongue mid sensor is placed halfway between the marked lines for the TT and TB sensors on the median sulcus by eyeballing. In line with previous methodological considerations (see Section 3.5.1), we generally do not use the TM sensor when testing clinical populations or children. If we are using lateral sensors, we place these to the right and left side of the TM sensor, 0.5-1 cm from the edge of the tongue (depending on how wide or narrow the participant's tongue is). We only place more than three sensors if that is required for the purposes of the study. The final intraoral sensor (LI) tracks the jaw movement. This sensor, prepared with Stomahesive, is attached to the gingiva below the right lower incisor. No additional glue is needed, as Stomahesive adheres to tissue by itself.
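The midline placement described above (TT at about 1 cm from the tongue tip, TB at the marked posterior line, and the mid sensor or sensors spaced evenly in between) can be expressed as a small calculation. This is a hypothetical sketch: in the procedure above the TM position is judged by eye and the TB position comes from the colour mark, so the function `midline_targets` assumes that the distance of the TB mark from the tongue tip has additionally been measured along the median sulcus.

```python
def midline_targets(tb_from_tip_cm, tt_from_tip_cm=1.0, n_mid=1):
    """Return target distances (cm from the tongue tip, along the median
    sulcus) for TT, the evenly spaced mid sensor(s), and TB.

    tb_from_tip_cm -- measured distance of the TB mark (assumed to be available)
    tt_from_tip_cm -- TT distance; 1 cm follows the text
    n_mid          -- number of mid sensors placed between TT and TB
    """
    if tb_from_tip_cm <= tt_from_tip_cm:
        raise ValueError("TB must lie behind TT")
    step = (tb_from_tip_cm - tt_from_tip_cm) / (n_mid + 1)
    mids = [tt_from_tip_cm + step * (i + 1) for i in range(n_mid)]
    return {"TT": tt_from_tip_cm, "TM": mids, "TB": tb_from_tip_cm}

# Example with a hypothetical TB mark 5 cm behind the tongue tip:
targets = midline_targets(5.0)  # TM lands 3 cm from the tip
```

Setting `n_mid=2` gives the equidistant spacing that is sometimes used when two mid sensors are placed between TT and TB.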
Finally, two lip sensors (UL and LL) are attached at the vermillion border of the upper and lower lip using a drop of dental adhesive. Depending on the amount of facial hair surrounding the upper and lower lip, the removal of lip sensors can lead to mild discomfort.

Aim
The present experiment tested how different preparation methods for EMA sensors affect adherence to the tongue. As discussed in Section 3.5.2, several methods for sensor preparation exist. We specifically focus on the tongue sensors, as these are usually the most likely to come off quickly. The aim of this experiment was therefore to determine which type of sensor preparation (see below) is most beneficial for adhesion, and whether this depends on the position of the sensor on the tongue. In addition, we evaluated (qualitatively) whether the participant's tongue anatomy influences adhesiveness.
We tested three types of sensor preparations: out-of-the-box ('bare') sensors, latex-coated sensors, and sensors with a latex flap. Out-of-the-box sensors (Figure 12, left) are the sensors as provided by NDI for the Wave device (approximate surface: 30 mm²), latex-coated sensors (Figure 12, center) are dipped in latex (with only a slightly larger surface than the out-of-the-box sensors, but with rounder edges), and sensors with a latex flap (Figure 12, right) are covered in the same latex, but now a brush is used to apply the latex while the sensor head is lying on a flat surface (approximate surface: 70 mm²).

Participants and experimental procedure
To compare these three types of sensor preparations, we tested 10 female adult participants in three separate sessions. All 10 participants were between 20 and 30 years of age. The study was approved by the Faculty of Arts Research Ethics Review Committee of the University of Groningen (approval number 71276154).
For each of the three sessions, we used one type of sensor and followed the same application procedure for each type (as described in Section 4 above). The sessions took place on three different days, thus avoiding the risk of glue residue and tongue fatigue, both of which would have influenced the resulting adhesion times. During the first session, we adhered out-of-the-box sensors, during the second session the latex-coated sensors, and during the final session the sensors with the latex flap.
During every session, we placed five sensors on the tongue, as this is the maximum number of tongue sensors used by researchers (see Section 3.5.1 and the Appendix). The sensors in question were placed on the tongue tip (TT; 1 cm from the tip), tongue back (TB; place of /k/ constriction), tongue middle (TM; between TT and TB), and tongue lateral right and left (TLR and TLL, placed to the right and left of the TM sensor, respectively). While few studies investigate lateral sounds (see above), we wished to assess whether different types of sensor preparations are also suitable for studying lateral movement of the tongue, as those parts move differently, and the sensors are more prone to interference from the participants' molars. Figure 13 below displays sensor placement examples for latex-coated sensors. The sensor placement process took approximately ten minutes. After we placed the five sensors on the tongue, we started displaying the stimuli to the participants using Microsoft PowerPoint on a computer monitor in front of them. The articulograph was not turned on for this experiment, as we were not collecting kinematic data and merely wished to determine how long it took for each sensor to fall off.

Stimuli
The experimental procedure consisted of the following tasks and stimuli. First, the participants read the short text Please call Stella from the Speech Accent Archive (Weinberger, 2015). This allowed them to get used to speaking with sensors in their mouth (i.e., the sensor habituation stage) and took approximately one minute. We did not include a longer sensor habituation stage as our goal was not to record the participants' natural speech. The text was followed by a wordlist containing 300 words of varying lengths and from various thematic fields (e.g., vegetables, fruit, school, vocations). Each word appeared on the screen for four seconds, during which the participant read it out loud. This procedure lasted for 20 minutes. Finally, at the end of the first wordlist, the participants performed a rapid syllable repetition task, namely the diadochokinesis (DDK) task, at a comfortable but fast speaking pace as defined by the participants themselves. The DDK task involved the repetition of the syllables /pa/, /ta/, /ka/, and /pataka/, and was included because fast repetitive movements may potentially cause the sensors to fall off faster. The experiment was considered complete once all the sensors had been detached or the three tasks were repeated twice, which took about 45 minutes. When a sensor fell off (the participants were instructed to inform us when they felt a sensor come loose), we removed it and noted the time at which it fell off. We did not re-attach any sensors.
The experimental procedure, including participant preparation (five minutes) and sensor placement (ten minutes), took 60 minutes at most. At that point, we stopped the experiment and removed the remaining sensors. The maximum time a sensor was adhered to a person was therefore 45 minutes. The experimental procedure is schematically presented in Figure 14.
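The timing figures reported above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the variable names are illustrative, we assume the three tasks were run twice in total, and the DDK duration, which is not reported, is inferred as the remainder of the approximate 45-minute speaking time rather than taken from the procedure.

```python
# Timing budget for one session, using the durations reported in the text.
# Variable names are illustrative. The DDK duration is NOT reported; it is
# inferred here as the remainder of the speaking time (an assumption).

WORDS_IN_LIST = 300
SECONDS_PER_WORD = 4
TEXT_MIN = 1.0          # habituation reading of the short text
PREPARATION_MIN = 5.0   # participant preparation
PLACEMENT_MIN = 10.0    # sensor placement
SPEAKING_MIN = 45.0     # approximate duration of all task cycles together
CYCLES = 2              # assumed: the three tasks were run twice in total

wordlist_min = WORDS_IN_LIST * SECONDS_PER_WORD / 60
cycle_min = SPEAKING_MIN / CYCLES
ddk_min = cycle_min - TEXT_MIN - wordlist_min  # inferred, not reported
session_min = PREPARATION_MIN + PLACEMENT_MIN + SPEAKING_MIN

print(f"wordlist per cycle: {wordlist_min:.1f} min")
print(f"inferred DDK per cycle: {ddk_min:.1f} min")
print(f"maximum session length: {session_min:.1f} min")
```

Under these assumptions, the wordlist accounts for nearly all of each cycle, which is why the 45-minute ceiling on adhesion time follows directly from the design.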

Anatomical measurements of the tongue
For all participants, we measured the relative tongue length, tongue width, and maximal mouth opening. All three measurements were taken with the participant's tongue comfortably extended using a ruler. First, we measured the relative tongue length, defining it as the distance between the anatomical tongue tip and the place we had marked as the place of /k/ constriction. Second, we measured the tongue width, defined as the widest part of the tongue, parallel to the molars. Finally, we asked the participants to open their mouth as wide as they comfortably could and measured the vertical distance between the surface of the tongue and the edge of their upper central incisors. We defined this as 'mouth opening,' which in effect represents the maximum intraoral space that the researcher can work with during the sensor placement procedure. Due to the lack of suitable equipment, we were not able to measure the participants' salivary flow rate or take any other anatomical measurements.

Statistical analysis and results
To assess the potential effect of sensor preparation method and sensor position on sensor adhesiveness, we used linear mixed effects regression modelling with participant as a random-effect factor and the optimal random-effects structure (i.e., assessing the inclusion of random intercepts and slopes) determined via model comparison. Specifically, we evaluated whether sensor preparation type (out-of-the-box, latex-coated, flap) and sensor position (TT, TM, TB, TLL, TLR) affected sensor adhesiveness. As our initial analysis appeared to show a clear distinction between the TB sensor and the other sensors (with the TB sensor adhering for a much shorter duration than the other sensors, whose adhesion times did not differ significantly from each other; see Figure 15), we created a new fixed-effect predictor distinguishing the TB sensor from the other sensors.
The best model for our data, determined via model comparison, only warranted the inclusion of the distinction between the TB sensor and the other sensors, in addition to the by-subject random intercept and a by-subject random slope for the contrast between the TB and the other sensors. Specifically, this model showed that the TB sensor adhered approximately 14 minutes less than the other sensors (β = -14.0, t = -5.0, p < 0.001). Sensor preparation type (see Figure 16) did not reach significance in the best model, nor did any of the anatomical predictors. Of course, this may be partly due to our limited sample size (N = 10).
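The recoding of the five-level position factor into a TB versus non-TB contrast could be sketched as follows. This is an illustration only: the observations are invented, the field names are hypothetical, and recording sensors that were still attached at the end of the session as 45 minutes is an assumption based on the stated maximum adhesion time, not a documented analysis step. The sketch does not fit the mixed-effects model itself.

```python
from statistics import mean

MAX_ADHESION_MIN = 45.0  # sessions ended after 45 minutes of speaking

# Invented observations: (participant, preparation, position, minutes).
# None means the sensor was still attached when the session ended.
observations = [
    ("p01", "bare", "TT", 40.0),
    ("p01", "bare", "TM", None),
    ("p01", "bare", "TB", 12.5),
    ("p01", "bare", "TLL", 38.0),
    ("p01", "bare", "TLR", None),
    ("p02", "flap", "TB", 30.0),
    ("p02", "flap", "TT", 35.0),
]


def recode(observation):
    """Add the binary TB contrast and cap still-attached sensors (assumed)."""
    participant, preparation, position, minutes = observation
    return {
        "participant": participant,
        "preparation": preparation,
        "is_tb": position == "TB",          # new two-level predictor
        "censored": minutes is None,        # still attached at the end
        "minutes": MAX_ADHESION_MIN if minutes is None else minutes,
    }


rows = [recode(o) for o in observations]
tb_mean = mean(r["minutes"] for r in rows if r["is_tb"])
other_mean = mean(r["minutes"] for r in rows if not r["is_tb"])

print(f"TB mean: {tb_mean:.2f} min, non-TB mean: {other_mean:.2f} min")
```

Keeping an explicit `censored` flag alongside the capped value makes it possible to revisit the capping decision later, for instance with a survival-type analysis.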
When explicitly focusing on the interaction between sensor preparation type and the sensor (TB versus the other sensors), the flap sensor appeared to be detrimental for the adhesion time for non-TB sensors, reducing the estimated adhesion time by about five minutes compared to the bare sensor and by about three minutes compared to the sensor coated in latex. However, for the TB sensors the opposite pattern was found. The adhesion time of the sensor with the flap was estimated to be about five minutes higher compared to the sensor coated in latex and more than nine minutes higher compared to the bare sensor. Figure 17 shows this interaction; Table 3 shows speaker-specific differences in sensor adhesion times.
Figure 17: Visualization of the interaction between sensor position (TB is tongue back and Non-TB are all other sensors) and sensor preparation type (out-of-the-box, latex, and flap).

Discussion of experimental investigation
In general, our sensor adhesion experiment demonstrated no clear general advantage of any particular sensor preparation type. With five sensors on the participant's tongue, it was difficult to make all of them adhere for the full 45 minutes (this succeeded for only two participants). The adhesiveness of the TB sensor, which was significantly lower than that of other sensors, did improve when the sensor was prepared with a latex flap. When attaching intraoral sensors, it is crucial to preserve a sterile environment. As sensors coated in latex (both with and without a latex flap) are more hygienic, easier to clean, and likely deteriorate more slowly, we recommend coating the sensor in latex when possible. Based on our results we further recommend adding a latex flap for the tongue back sensor.
Additionally, we would like to mention some qualitative observations. Placing a total of five sensors on the tongue is not ideal, particularly when keeping in mind that in a regular experiment, two additional intraoral sensors would need to be included as well. This difficulty was especially pronounced with latex flap sensors, as the tongue surface required to attach these sensors was largest. When using sensors with a latex flap, participants appeared to take longer to become habituated. However, their articulation seemed to return to normal within the first ten minutes of the experiment (although note that we did not quantify this), so this should not be problematic for a regular experimental setup. A practical advantage of the sensors with a latex flap was that once part of the flap detaches from the tongue, this is quickly noticed by the participant and can be easily resolved by adding some glue underneath the flap.
There were several limitations to this study. First, we adhered five sensors to the tongue, which is a larger number than usual. While this was done on purpose, as we wished to assess not only adhesion of sensors placed midsagittally but also those placed laterally, it might not adequately reflect how long the sensors would adhere in a normal experimental scenario with only two or three tongue sensors. Second, we did not re-adhere the sensors once they fell off. In a real experimental scenario, one would reglue the sensor to the same position where it fell off. In our experience, it is easiest to reglue a sensor with a flap as the adhesive surface is largest. Another experiment would need to be conducted to assess how the different sensor types compare when focusing on ease of reattachment. Finally, sensor placement and its effectiveness are strongly impacted by individual factors. While we included certain tongue anatomical measures (none of which turned out to be significant predictors in our best model), others that were not measured and that differ between participants, such as salivary flow rate (Whelton, 2012) and tongue surface (Kullaa-Mikkonen et al., 1982), likely play an important role as well.

Conclusion
The present paper provided an introduction to electromagnetic articulography and an overview of data collection procedures on the basis of reviewing 905 publications employing electromagnetic (midsagittal) articulography since 1987. In addition, we provided a detailed description of the procedure used in our own lab.
EMA data collection and analysis are time-consuming and technically demanding. Consequently, it is difficult to include a large number of participants. Compare, for example, the five participants that seem to be the norm in EMA research (see Section 2.5) with the 50 participants that would be needed for a study with 80% power and aimed at identifying effect sizes as low as Cohen's d = 0.4 (Brysbaert, 2019). If testing 50 or more participants is not feasible, then individuals who participate in EMA studies should be carefully selected and the testing procedure should facilitate between- and within-speaker comparability. Reliable, accurate, and replicable sensor placement should therefore be ensured.
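To make the sample-size figure concrete, the order of magnitude can be reproduced with a standard approximation. The sketch below is not taken from Brysbaert (2019): it assumes a within-participant (paired or one-sample) comparison, a two-sided α of 0.05, and the normal approximation to the t-test, and the function name `required_n` is purely illustrative. An exact t-based calculation yields a slightly larger number.

```python
from math import ceil
from statistics import NormalDist


def required_n(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided paired/one-sample test.

    Uses the normal approximation n = ((z_alpha + z_power) / d) ** 2,
    which is an assumption of this sketch, not the cited procedure.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


n_small_effect = required_n(0.4)
print(n_small_effect)  # 50 under the stated assumptions
```

Because the required number of participants scales with the inverse square of the effect size, halving the detectable effect roughly quadruples the sample, which illustrates why typical EMA sample sizes can only detect very large effects.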
As we demonstrated in our review, however, there is currently still a great variety of approaches used for EMA sensor preparation and placement. For example, while nearly all studies use a tongue tip sensor, frequently placing it '1 cm' behind the anatomical tongue tip, researchers often do not specify how this distance from the tongue tip was measured (e.g., using a ruler as opposed to eyeballing) nor the position that the tongue was in (e.g., at rest inside the mouth, comfortably protruded, completely stretched). This can make a substantial difference, however. Based on our experience, a point that is 1 cm from the tip with the tongue at rest can be nearly 1.5 cm from the tip when the tongue is protruded. Another example of varying sensor placement strategies pertains to the 'tongue back' sensor, which is often placed an arbitrary number of centimetres from the tongue tip or as far back as comfortable and/or possible. Participants, however, are not comparable, as tongue sizes, oral cavities, and comfort levels can differ greatly. One strategy for solving this (also used within our lab) is to place the tongue back sensor where the /k/ (or another sound involving a posterior constriction) is made. In this way, the placement of the sensor makes sense from an articulatory perspective, which is missing from the other (more arbitrary) approaches. Other conundrums with intraoral sensor placements, unfortunately, are not as easily solvable. These, for example, include situations in which a speaker does not have enough gingival tissue for the placement of an incisor sensor, when the tongue of the speaker is too small to place the desired number of sensors, or when a speaker produces too much saliva, causing the sensor to fall off repeatedly.
As point tracking technology continues to improve, it is necessary to strive for better and more consistent methods of sensor adhesion, preparation, and placement. The goal is not to limit the creativity of researchers, but rather to ensure more comparable results in a field that usually relies on small sample sizes. It is our hope that this paper may serve as a starting point for further debate on the topic.

Additional File
The following additional file for this article can be found as follows.
• Appendix: An .xlsx file, which includes all EMA studies that were collected as part of our literature review. The appendix contains information on the topic, studied population, and sensors in use. It also includes specific information on sensor placement strategies for tongue sensors. DOI: https://doi.org/10.5334/labphon.237.s1