King’s Research Portal

The effects of gender in human communication and human-computer interaction are well-known, yet little is understood about how it influences performance in the complex, collaborative tasks in computer-mediated settings – referred to as Computer-Supported Collaborative Work (CSCW) – that are increasingly fundamental to the way in which people work. In such tasks, visual feedback about objects and events is particularly valuable because it facilitates joint reference and attention, and enables the monitoring of people's actions and task progress. As such, software to support CSCW frequently provides a shared visual workspace. While numerous studies describe and explain the impact of visual feedback in CSCW, research has not considered whether there are differences in how females and males use it, are aided by it, or are affected by its absence. To address these knowledge gaps, this study explores the effect of gender – and its interactions within pairs – in CSCW, with and without visual feedback. An experimental study is reported in which mixed-gender and same-gender pairs communicate to complete a collaborative navigation task, with one of the participants being under the impression that s/he is interacting with a robot (to avoid gender-related social preconceptions). The study analyses performance, perceptions and communication strategies. As predicted, there was a significant benefit associated with visual feedback in terms of language economy and efficiency. However, it was also found that visual feedback may be disruptive to task performance, because it relaxes the users' precision criteria and inflates their assumptions of shared perspective. While no actual performance difference was found between males and females in the navigation task, females rated their own performance less positively than did males.
In terms of communication strategies, males had a strong tendency to introduce novel vocabulary when communication problems occurred, while females exhibited more conservative behaviour. When visual feedback was removed, females adapted their strategies drastically and effectively, increasing the quality and specificity of the verbal interaction, repeating and re-using vocabulary, while the behaviour of males remained consistent. These results are used to produce design recommendations for CSCW systems that will suit users of both genders and enable effective collaboration. © 2016 Published by Elsevier Ltd.


Introduction
While Computer-Supported Cooperative Work (CSCW) is steadily becoming the norm for many collaborative activities, there is limited understanding about the factors that influence its success. Research has suggested the influence of numerous factors pertaining to, inter alia, the individual, the group, the situation, the task and the features/affordances of the mediating technology (Patel et al., 2012). At the same time, the interplay between factors is likely to generate interaction effects while varying the individual effects. The study reported in this paper focuses on two factors and their interactions: the use of visual feedback as a resource for CSCW; and the gender of the individuals involved.
Theoretical accounts and empirical studies have established that the availability of visual feedback is a critical factor for the success of collaborative work. Visual feedback helps collaborators to exchange and ground information more efficiently, and to maintain awareness of each other's actions and task status (Gergle et al., 2004, 2013; Kraut et al., 2003, 2002a; Clark and Krych, 2004; Brennan, 2005). As such, CSCW applications often integrate video or support the sharing of a visual workspace. In Convertino et al. (2009, 2011), CSCW prototypes were developed that integrated a shared workspace along with other features to support grounding and process awareness. It was observed that distributed teams that used these systems performed better than collocated teams. This finding shows that well-designed collaborative tools not only close the gap between CSCW and face-to-face collaboration, but also lead to better outcomes.
However, the utility of visual feedback in CSCW may vary in significance. First, it depends on the nature of the task. Visual feedback appears to be valuable for spatial and editing tasks, but not for brainstorming tasks (Whittaker et al., 1993), and is more useful for linguistically complex tasks than for simple ones (Ranjan et al., 2007; Gergle et al., 2004). Second, it depends on the features of the technology; that is, the kind and amount of visual feedback provided. For example, sharing workspace (being able to view physical actions and movements and relevant shared objects in the environment) is more important than being able to observe each other's faces or bodies (Whittaker, 2003a; Fussell et al., 2003a, 2003b; Kraut et al., 2003; Anderson et al., 2000). Moreover, Sellen (1995) found that visual feedback improves satisfaction, but not performance. In fact, visual feedback can also act disruptively as, for example, in cases of even slight delays in, or lack of, speech/visual feedback synchronicity (Gergle et al., 2006, 2004; O'Malley et al., 1996). A shared workspace has been argued to create inflated assumptions of common perspective, ultimately leading to coordination problems (Whittaker, 2003a; Schober, 1993).
More recently, research has also focused on how group and individual user characteristics moderate the benefit of visual feedback (see Fussell and Setlock (2014) for a review); for example, studies have explored the effects of the expertise and cultural background of users. In particular, visual aids to communicate information may be more important in expert-novice interactions than in interactions between users with the same level of expertise, given that novices often lack the ability to use and understand domain-related terminology (Bromme et al., 2005). More specifically, the availability of gaze and gesture cues has been found to be more valuable to novices than to experts (Anderson et al., 2007; Stein and Brennan, 2004). Moreover, visual feedback has been found to provide no benefit to users of non-Western cultures (Setlock et al., 2004, 2007). As such, these findings motivate further research regarding the mechanisms, conditions and factors (relating to the technology as well as the user) that make visual feedback a valuable resource for CSCW.
The second factor considered in the study is gender. Studies in diverse fields, ranging from psychology and linguistics to business and computing, have suggested that males and females communicate and process information differently (for instance, see Halpern (2000), Beckwith and Burnett (2004)). In particular, gender has been found to be a major factor underlying cognitive abilities, such that males outperform females in the majority of visuo-spatial tasks, while females perform better in most verbal ability tests (Halpern et al., 2007; Ullman et al., 2008). In addition to differences in performance, qualitative differences are observed in terms of communication style, use of linguistic elements and level of participation in both settings of face-to-face and computer-mediated communication (see, for example, Fischer (2011), Herring (2000), Herring and Stoerger (2014)). However, a review of meta-analyses of gender literature revealed that contextual factors influence the magnitude of gender differences, presenting the theoretical argument that gender differences may be moderated, exacerbated or even reversed owing to dyadic interactions between participants (Hyde, 2005).
The notion that interactive communication itself shapes performance and language use – either amplifying or offsetting gender differences – lies at the heart of the study reported in this paper. It originates from empirical research on dialogue (work within the Interactive Alignment Model (Pickering and Garrod, 2004) and the Collaborative Model (Clark, 1996)), which postulates that the way in which people produce and understand language, and coordinate in task-oriented dialogue, largely depends on interindividual processes and the context of use. In particular, a well-known phenomenon is linguistic alignment – the tendency of speakers to adapt to each other's pronunciation, word and syntax choices – a mechanism/strategy which is argued to underlie communication success. If, indeed, communication success is linked to linguistic alignment, it is important to discover whether alignment is mediated by gender (i.e., whether female or male speakers have stronger tendencies to align to their partners).
Finally, literature in the field of computing has recognised that there are notable differences in the ways in which females and males interact through and with technology. These differences pertain to skills, performance outcomes, perceptions and attitudes across numerous domains of Human-Computer Interaction (HCI) (Chen and Macredie, 2010). Specifically in the area of CSCW, initial evidence suggests that gender composition of groups and pairs of adults (Richert et al., 2011; Sun, 2008) and children (Underwood et al., 2000) affects computer-based collaboration in terms of performance and communication strategies. Despite these findings, our understanding of the ways in which gender interacts with the characteristics of the technology and mediates its effectiveness and acceptance remains incomplete (Burnett et al., 2011). System designs that exclude consideration of gender and other human factors ultimately lead to systems that marginalise the needs and preferences of large user groups (Dillon and Watson, 1996). As such, focused research is needed to guide system designs that can support users of both genders in computer-mediated collaboration. Moreover, in light of the evidence that visual feedback benefits collaboration and communication, the question that naturally follows is whether its utility, too, is mediated by gender.
This study builds on insights, and addresses questions, that have emerged from previous work by the authors. Koulouri et al. (2012) found that performance in navigation tasks does not depend on the individual's gender, but on the interaction of genders (that is, an individual's ability to provide/follow route instructions is also contingent on the gender of their partners). In addition, the study provided initial support to the notion that, while people may have their gender-preferential style of giving route instructions, they are willing to adapt it to suit their addressee's needs. These findings emphasised the importance of interaction and adaptation processes in collaborative tasks. Drawing on this earlier research, the study reported in this paper looks at actual as well as perceived performance in the navigation task, and undertakes fine-grained linguistic analysis to characterise and 'quantify' communication strategies and linguistic alignment as a medium for communication success. In addition, it extends the experiment to include two different interaction conditions in order to describe and explain how female and male partners benefit from visual feedback, or are affected by its absence, in CSCW.
The remainder of the paper is organised as follows. First, the area in which the research is situated is defined: Section 2 builds on past research and frames 10 research hypotheses targeting the effects of gender, visual feedback, and their interaction effects on computer-mediated collaboration. Section 3 presents the experimental methodology developed to address the research hypotheses, which involved pairs of participants collaborating in a simulated robot navigation task, and details the metrics used to capture performance and perceptions, as well as to describe user communication strategies. Section 4 reports the results of the statistical analysis on the metrics, which revealed simple and interaction effects. Section 5 discusses the findings of the study in light of existing literature and ends with recommendations for CSCW design. Section 6 reviews the limitations of the research and outlines additional areas that merit further exploration. Section 7 provides brief concluding remarks. The terms visual information, shared workspace and visual feedback will be used interchangeably in the rest of the paper to refer to shared visual workspace.

Background and research hypotheses
The previous section introduced the strands of research from which the two main theses of the study are derived: (i) that there are differences in how females and males use and experience technology, which also emerge in CSCW; (ii) that visual feedback in CSCW may lead to more successful collaborative work. Drawing on existing literature and its findings, this section generates 10 research hypotheses, which are grouped into three subsections corresponding to the gender factor, the visual feedback factor, and their interaction effects. Given the study's HCI scope, gender and visual feedback effects are evaluated using the concepts of effectiveness (defined in terms of error rates), efficiency (that is, resources expended, such as time and word count), and user perceptions of the interaction. For brevity, the term performance is used to mean both efficiency and effectiveness. Finally, the study assesses communication behaviour and strategies through discourse analysis.
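The two performance concepts defined above can be made concrete with a minimal Python sketch. It is illustrative only: the record fields, the per-instruction error rate and the example values are assumptions introduced here, since this section does not specify how errors are counted or normalised in the study.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """One completed task for a pair. All field names are hypothetical."""
    errors: int        # instructions that were executed incorrectly
    instructions: int  # instructions issued by the instructor
    seconds: float     # time taken to complete the task
    words: int         # total words exchanged by the pair


def effectiveness(rec: TaskRecord) -> float:
    """Error rate (errors per instruction); lower means more effective."""
    if rec.instructions == 0:
        return 0.0
    return rec.errors / rec.instructions


def efficiency(rec: TaskRecord) -> dict:
    """Resources expended: completion time and word count."""
    return {"seconds": rec.seconds, "words": rec.words}


# Placeholder values for illustration, not data from the study.
example = TaskRecord(errors=2, instructions=20, seconds=300.0, words=150)
print(effectiveness(example), efficiency(example))
```

Keeping effectiveness and efficiency as separate functions mirrors the distinction drawn in the text, where "performance" is shorthand for both.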

The effect of gender
Gender, and its effect in relation to cognitive abilities and communication styles, has a broad impact on computer skills and technology requirements. In particular, research has revealed gender differences in usage, preferences and perceptions in various application areas of HCI, including online shopping and web applications (Bae and Lee, 2011; Bimber, 2000), computer games (Cassell, 1998; Hartmann and Klimmt, 2006), virtual environments (Cutmore et al., 2000; Yoon et al., 2015), car navigation systems (Lin and Chien, 2010), office software suites (Burnett et al., 2011), domestic appliances (Blackwell et al., 2009) and decision support systems (Djamasbi and Loiacono, 2008). Existing literature has also discussed the role of gender in the areas of social computer-mediated communication (CMC) (on a larger scale, see, for example, Fischer (2011), Herring (2000), Herring and Stoerger (2014)) and CSCW (presented in this section), providing some evidence that gender pair/group composition impacts performance outcomes and experience. Given the growing popularity of CMC and CSCW systems, further systematic work is needed to clarify how gender mediates the use of such systems and influences group dynamics and collaborative work. This section discusses previous research, and the ensuing hypotheses concern the effect of gender (and gender composition) on performance, user perceptions and communication strategies in CSCW. Owing to the lack of conclusive evidence, non-directional hypotheses are constructed.
Studies have addressed whether gender and pair/group composition influence performance. Community co-membership (that is, belonging to the same group in terms of gender, age, culture, geographical region, etc.) reinforces the common ground (Clark and Marshall, 1981; Setlock et al., 2004), and could lead to a performance advantage for same-gender pairs/teams. A study in which physics students used a computer-supported collaborative learning environment found that females in all-female pairs outperformed females in mixed pairs, while males did equally well with female or male partners (Ding et al., 2011). In a day-trader collaborative game, all-female pairs were faster than all-male pairs (Sun, 2008), but the study did not include mixed-gender pairs. Prinsen et al. (2009) argued that females performed better in a collaborative learning task, because they could rely more on verbal skills and asked more questions. The authors concluded that computer-supported collaborative learning environments may have a greater utility for females because they are able to show their potential and use their linguistic skills more easily than in face-to-face interactions, in which females may face anxiety and the stereotype threat (Shih et al., 2002; Schmader et al., 2008). Stereotype threat is the popular belief about the competencies/deficiencies of a group (ethnic, racial, gender). It leads to suboptimal performance among the individuals in the group, caused by excessive cognitive and memory demands, anxiety and fear (Schmader et al., 2008). Motivated by these findings, the first research hypothesis is formulated as follows:
H_G1: Gender has an effect on performance.
Gender has been found to influence user perceptions, satisfaction and self-efficacy (Busch, 1995; Bao et al., 2013; Durndell et al., 2000; Ong and Lai, 2006; Kling et al., 1999). In the domain of CSCW, Bernard et al. (2000) showed that females reported lower levels of self-efficacy and satisfaction and higher anxiety in mixed-gender groups. Savicki et al. (1996) found that all-female groups expressed greater satisfaction and confidence with group processes and communication than mixed and all-male groups. Therefore, the second hypothesis investigates whether there are perception differences between females and males in terms of their performance when collaborating to complete a task:
H_G2: Gender has an effect on user perceptions of performance.
Gender differences in terms of communication styles have been found in socially-oriented CMC (Savicki and Kelley, 2000; Choi et al., 2009), as well as in CSCW and computer-supported collaborative learning (CSCL) environments (Prinsen et al., 2007; Ding et al., 2011). Studies have suggested that males dominate CMC discussions, sending more messages (Carr et al., 2004), and females appear to adopt a more cooperative and agreeable conversational style (Sun, 2008). Females also ask more questions than males in CSCL situations (Ding et al., 2011; Prinsen et al., 2009). In a collaborative architectural task using a table-top interface, females tended to explicitly question and state requirements, resources and plans for action, while males tended to execute more and negotiate the plan less (Richert et al., 2011). As such, the following two hypotheses focus on the gender effect on communication behaviour, making use of measures of the frequency of queries, acknowledgements, and clarifications, and the level of specificity of the utterances, provided by both partners:
H_G3: Gender has an effect on communication structure, in terms of frequency of queries, acknowledgements, and clarifications.
H_G4: Gender has an effect on communication content, in terms of language specificity.
Linguistic alignment occurs in dialogue at all levels – phonetic, phonologic, lexical, syntactic, semantic and pragmatic – and it is argued to be the mechanism that makes communication 'easy', efficient and effective (Garrod and Pickering, 2004; Pickering and Garrod, 2004). In respect of lexical alignment, dialogue is full of repetition of the same words (Brennan and Clark, 1996; Tannen, 1989); interlocutors align in terms of vocabulary in the sense that they use the same referring expressions (Garrod and Anderson, 1987). The argument that alignment underlies communication success has been explained in either social or utilitarian terms; that is, people adapt to each other linguistically in order to be positively perceived (Giles et al., 1991), or because it increases the probability of being understood (Brennan and Clark, 1996; Pickering and Garrod, 2004). The latter interpretation was constructed through experiments with collaborative dialogues, which corresponds to the domain of this study. Females appear to have stronger tendencies to adapt their language in terms of phonetics and stylistic elements in social conversations (Fitzpatrick et al., 1995; Stupka, 2011; Namy et al., 2002; Coates, 1986; Bilous and Krauss, 1988; Mulac et al., 1988; Holmes, 1990). However, there is no empirical data to suggest that females align more than males (or vice versa) in collaborative interactions or at other linguistic levels. The following hypothesis aims to address this knowledge gap and focuses on alignment in terms of vocabulary:
H_G5: Gender has an effect on communication behaviour, in terms of linguistic adaptation (lexical alignment).
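One simple way to picture lexical alignment as described above is to count how much of a speaker's vocabulary was already used by their partner earlier in the dialogue. The Python sketch below is an assumption made for illustration; the tokeniser, the function names and the reuse ratio are hypothetical and are not the alignment metric used in the study, which is not defined in this section.

```python
import re
from typing import List, Tuple


def tokens(utterance: str) -> List[str]:
    """Lower-case word tokens; a deliberately naive tokeniser."""
    return re.findall(r"[a-z']+", utterance.lower())


def lexical_alignment(dialogue: List[Tuple[str, str]], speaker: str) -> float:
    """Share of `speaker`'s word tokens that the partner had already used."""
    partner_vocab = set()
    reused = 0
    total = 0
    for who, utterance in dialogue:
        words = tokens(utterance)
        if who == speaker:
            total += len(words)
            reused += sum(1 for w in words if w in partner_vocab)
        else:
            partner_vocab.update(words)
    return reused / total if total else 0.0


# Invented example turns, not taken from the study's corpus.
dialogue = [
    ("instructor", "go to the bridge"),
    ("follower", "I am at the bridge"),
    ("instructor", "turn left at the bridge"),
]
print(lexical_alignment(dialogue, "follower"))
```

Function words such as "the" inflate this ratio, so a more careful measure would likely restrict the count to content words or referring expressions.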

The effect of visual feedback
The second set of research hypotheses targets the simple effects of the absence/presence of visual feedback on performance and communication strategies in CSCW. Previous studies (Gergle et al., 2004, 2013; Kraut et al., 2002a, 2003; Clark and Krych, 2004; Brennan, 2005) have provided a body of robust evidence about the effect of visual feedback, facilitating the formulation of specific directional hypotheses. The studies explain these effects in terms of situation awareness and grounding, largely relying on concepts from the Collaborative Model developed by Clark and his colleagues (Clark, 1996).
Research in CSCW has emphasised the role of situation awareness for the success of computer-mediated collaborations (Carroll et al., 2003; Belkadi et al., 2013). Visual feedback is fundamental in supporting situation awareness (Dourish and Bellotti, 1992; Carroll et al., 2006; Daly-Jones et al., 1998; Kraut et al., 2003; Wu et al., 2013) by enabling collaborators to monitor task status and their partner's activities; speakers can assess the progress of the task, and the information necessary towards its accomplishment. Moreover, monitoring actions and their completion means that the next instruction will be provided precisely at the required moment. Similarly, an incorrect execution is readily recognised by the partner and he/she can take immediate action towards repairing it. Drawing on these findings, the following hypothesis is formulated:
When interlocutors introduce and accept information, they perform a coordination process known as grounding (Clark and Marshall, 1981), in which they mutually establish that what has been said has also been understood. Visual feedback of understanding is faster and more secure than spoken claims of understanding (Clark and Marshall, 1981), and, as a result, grounding is not performed with discrete utterances, but with visible physical actions. Simply put, if someone following instructions is aware that their partner can see what they are doing, their action serves to demonstrate understanding and substitutes verbal turns. In this case, the speaker assumes responsibility to assess the perceptual evidence provided by the addressee. Generally, the responsibility falls to whoever is judged to have the strongest evidence, so that collective effort is minimised (Clark and Wilkes-Gibbs, 1986). However, when visual feedback is unavailable, both speakers are tasked to verbally assert that something was understood or executed, leading to different turn-taking patterns. Moreover, when sharing visual information and viewing the same objects, those objects are part of the common ground, and joint attention and reference can be easily established. As a result, the act of referring to elements in the environment is less effortful, leading to highly elliptical and deictic expressions, such as 'turn here' or 'put that there'.
To summarise this research, visible physical actions provide immediate and robust evidence of execution and understanding, which renders verbal turns redundant. Moreover, the availability of visual feedback facilitates joint reference, which enables the use of shorter referring expressions. These phenomena lead to the prediction that, when visual information is shared, communication will be more economical compared to a verbal-only condition. This is investigated through the following two specific hypotheses:
H_V2: Visual feedback enables pairs to complete the task using lower frequency of queries, acknowledgements, and clarifications.
H_V3: Visual feedback enables pairs to complete the task with lower language specificity.

The interaction effects of gender and visual feedback
The third set of research hypotheses focuses on the interaction effects of visual feedback and gender on performance and communication strategies – whether visual feedback moderates (changes the strength or the direction of) the effect of gender on performance and communication. Specifically, the first research hypothesis addresses whether the performance of females or males is more adversely affected in response to a visually-impoverished interaction condition:
H_GV1: Visual feedback moderates the performance of males and females.
As discussed above, in addition to an impact on performance, the presence/absence of visual feedback is expected to influence the content and structure of communication. However, there is no empirical data with regard to whether females or males adapt their communication strategies more drastically in response to a less optimal condition, without visual cues. As such, the following non-directional hypothesis is formulated:
H_GV2: Visual feedback moderates the communication strategies of males and females.
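The idea of moderation in the two hypotheses above can be illustrated descriptively as a difference of differences across the four cells of a gender by visual-feedback design. The Python sketch below is a hypothetical illustration: the cell labels and the values are placeholders invented here, and it is not a significance test and not the statistical analysis used in the study.

```python
from typing import Dict, Tuple


def interaction_contrast(cell_means: Dict[Tuple[str, str], float]) -> float:
    """Difference of differences for a 2x2 design.

    Computes (female change when visual feedback is removed)
    minus (male change when visual feedback is removed).
    A value of zero means both genders change by the same amount.
    """
    female_change = (cell_means[("female", "no_visual")]
                     - cell_means[("female", "visual")])
    male_change = (cell_means[("male", "no_visual")]
                   - cell_means[("male", "visual")])
    return female_change - male_change


# Placeholder cell means (e.g. words per instruction); NOT study results.
toy_means = {
    ("female", "visual"): 10.0,
    ("female", "no_visual"): 16.0,
    ("male", "visual"): 10.0,
    ("male", "no_visual"): 11.0,
}
print(interaction_contrast(toy_means))
```

A non-zero contrast only describes the pattern in the cell means; whether it reflects a reliable interaction effect would have to be established with an appropriate inferential test.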

Methodology
The experimental study involved same-gender and mixed-gender pairs collaborating to complete a navigation task. Each pair consisted of an instructor and a follower, with the former being under the impression that he/she was instructing and interacting with a robot. The robot simulation aimed to increase the barriers to express social and personal identities, and, as such, it enabled the isolation of gender effects that arise naturally and not as a result of social bias. The literature and assumptions that motivated this setup are presented below. The associated limitations and trade-offs are discussed in Section 3.1.4.
Individuals use 'shortcut heuristics' – salient social categories, like race or gender – to form impressions about their interlocutors, especially when the situation precludes other cues and background knowledge. Such judgments about the speaker are argued to be more influential in communication than the actual information content of the message (Koh and Sundar, 2010; Zimbardo and Leippe, 1991; Norman, 1976). Moreover, gender roles and prescriptions have been found to exert a strong influence in how people behave and interact (Mazei et al., 2015; Gladstone and O'Connor, 2014). A study by Matheson (1991) investigated the extent of social perceptions relating to gender in CMC. In this study, males and females participated in a negotiation task, with the gender of their interlocutor either being known or unknown to them. In reality, however, the interlocutor was a software programme that provided standardised responses. When participants believed that their interlocutor had a gender, the 'female' interlocutor was perceived to be more cooperative. Matheson (1991) concludes that CMC interactions are largely mediated by social cues, and are influenced by stereotypes and a priori expectations. In addition to peer stereotyping, Herring (2000) suggested that males and females self-stereotype, and adopt the communication behaviour expected of their gender in CMC, even when gender is concealed. Finally, the performance of females has been found to deteriorate when males are present (Picucci et al., 2011; Inzlicht and Ben-Zeev, 2000). As discussed in Section 2.1, the effects of 'stereotype threat' and anxiety in performance are exacerbated when individuals carry out computing or spatial tasks (Shih et al., 2002; Schmader et al., 2008; Lawton, 1996, 1994; Beckwith, 2007).
These findings indicate that social biases associated with gender have a strong and confounding effect on language and performance. As such, by masking the gender through a robot simulation, it was possible to observe naturally-occurring cognitive processes and behaviours with less interference from social aspects of gender.

Task
The domain used in the experiment was pedestrian navigation in a simulated town, with the instructor having to guide a 'robot' (hereafter referred to as follower) to six designated locations. The environment consisted of highly salient landmarks such as buildings, and landmarks of lower salience such as pathways, which aimed to approximate a realistic urban environment. Environments containing a fair number of landmarks have been shown to be appropriate for users of both genders (Sandstrom et al., 1998; Saucier et al., 2002; Cutmore et al., 2000). Moreover, the environment was two-dimensional, given that navigation in 3D virtual environments has been found to be cognitively demanding for most people (Cockburn and McKenzie, 2002). Navigation tasks, including 'maze game' (see, for example, Garrod and Anderson (1987) and Mills (2007)) and 'map task' scenarios (for example, Sanford et al., 2003; O'Malley et al., 1996; Anderson et al., 1991), are part of well-known experimental paradigms in (computer-mediated) human communication research as they enable observation of natural cooperative behaviour, while the information available to participants remains finite and controlled at any point in the dialogue. The cooperative nature of the task lay in two aspects. First, in each pairing only the instructor knew the destinations and had a global view of the environment, so the follower had to rely on the instructor's route descriptions. Second, the instructor also needed the follower's descriptions to determine the robot's exact position and perspective. Data were captured on each participant's actions and utterances, to support analysis and understanding of how the participants approached the task and any problems that arose. The tasks in this study did not require learning the route through navigation or recalling the map or instructions from memory, which would give rise to different cognitive demands and errors.

The system
The experiment relied on a custom-built system which supported the interactive simulation and enabled real-time direct text communication between the instructor and follower. The system kept a log of the dialogues and also recorded the coordinates of the current position of the robot at the moment messages were transmitted. Thus, it was possible to analyse the descriptions against a matching record of the robot's position and reproduce its path with temporal and spatial accuracy. The interfaces used by the participants consisted of a graphical display and an instant messaging facility (the dialogue box). The dialogue box displayed each participant's messages (in green) in the upper part of the dialogue box; the messages sent by the other participant in the pair were displayed (in magenta) in the lower part of the dialogue box. The desktop PCs used by the participants were equipped with 17-in. LCD monitors with 1024 × 768 pixel resolution. The underlying design principle of the system was that the interfaces remained basic and feature-light, and were operated through simple controls, so that any performance and communication differences would not be caused by the properties of the interfaces. The interface seen by the instructor displayed the full map of the simulated town. In order to explore the effect of the provision of visual feedback, there were two variants of the instructor's screen. The design and rationale of the visual feedback functionality, and the follower's interface, are presented below.
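The logging behaviour described above, in which each message is stored together with the robot's position at the time of sending, could be represented along the following lines. This Python sketch is an assumption for illustration only: the record layout, the field names and the timestamp unit are hypothetical, as the section does not describe the actual log format of the custom-built system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LogEntry:
    """One logged message. All field names are hypothetical."""
    timestamp: float  # seconds since session start (assumed unit)
    sender: str       # "instructor" or "follower"
    message: str      # text typed into the dialogue box
    x: float          # robot x-coordinate when the message was sent
    y: float          # robot y-coordinate when the message was sent


def robot_path(log: List[LogEntry]) -> List[Tuple[float, float]]:
    """Reproduce the robot's path in time order, skipping repeated positions."""
    path: List[Tuple[float, float]] = []
    for entry in sorted(log, key=lambda e: e.timestamp):
        position = (entry.x, entry.y)
        if not path or path[-1] != position:
            path.append(position)
    return path


# Invented entries for illustration, not taken from the study's logs.
sample_log = [
    LogEntry(12.0, "follower", "I am at the bridge", 3.0, 4.0),
    LogEntry(5.0, "instructor", "go to the bridge", 0.0, 0.0),
    LogEntry(15.0, "instructor", "turn left", 3.0, 4.0),
]
print(robot_path(sample_log))
```

Sorting by timestamp before extracting positions keeps the reconstruction independent of the order in which entries were written to the log.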
In the 'Visual Feedback condition', a small 'display' was available in the upper right corner of the screen showing the robot's immediate locality, but not the robot itself. This meant that the instructor shared the same visual space as the follower. This experimental decision follows the relevant literature investigating the effects of visual information (for example, Whittaker, 2003b; Kraut et al., 2003), in which the instructor can see what the follower is seeing and doing, but not the follower himself/herself. This is traced back to the 'What You See Is What I See' paradigm in the design of groupware and CSCW systems, developed by Xerox PARC (Stefik et al., 1987). The size of the 'display window' on the instructor's computer screen was approximately 7.2 × 7.2 cm; this was considered appropriate given that it displayed a scaled-down, high-fidelity image of a relatively uncluttered environment, which was also part of the instructor's own map. No delays were noted in the display of the messages or the visual feedback. Displaying the map and the follower's visual space on one screen was considered more usable and less distracting for instructors than requiring them to view two different media (for instance, paper and computer monitor or two separate monitors). This, however, resulted in a compromise in the size of the display window showing the follower's visual space. Similar interfaces have been used in the related studies by Kraut, Gergle and Fussell, discussed in Section 2.2. For example, in Kraut et al. (2003), the instructor's display consisted of the repair manual and schematics of the bicycle, and a small rectangular window in the right bottom corner showing the view from the head-mounted camera of the follower.
Clark and Brennan argue that the medium shapes the structure and content of communication by imposing different costs on the grounding process (Clark and Brennan, 1991). The medium can offer (or constrain) the following communication resources: physical co-presence, visibility, co-temporality, audibility, simultaneity, sequentiality, reviewability, revisability, mobility and tangibility. CMC presents potential barriers to communication because it variably restricts these affordances. Adapting the framework to CMC, Brennan and Lockridge (2006) removed the mobility and tangibility affordances. Similarly, Kraut et al. (2002b) have proposed a refined version of the framework, which 'decomposes' the dimensions of physical co-presence. The framework that conceptualises the key affordances of a communication medium (and their definitions) is shown in Table 1. The characteristics of the interface used in this study are explained against this framework and are also summarised in Table 1.

Table 1 A characterisation of the interface used in the study against a framework of affordances of communication media (adapted from Clark and Brennan (1991) and Kraut et al. (2002b), as presented in Fussell and Setlock (2014)).

Affordances of Media | Interface
(1) Physical co-presence dimensions:
- Field of view: Participants can see what entities each person is oriented towards.
- Spatial perspective: Participants can see their partner's perspectives toward the environment.
- Display symmetry: Both parties can see equivalent aspects of one another's environments.
- Dimensionality: Participants have a 3D view of the environment.
- Spatial resolution: Participants have high-quality views of the environment.
- Temporal delay: Images as received relative when they were sent. | No
(4) Co-temporality: Messages are received without delay (close to the time that they are produced and directed at addressees), permitting fine-grained interactivity. | Yes
(5) Simultaneity: Participants can send and receive messages at the same time, allowing communication in parallel. | Yes
(6) Sequentiality: Participants take turns in an orderly fashion in a single conversation at a time; one turn's relevance to another is signalled by adjacency. | No
(7) Reviewability: Messages do not fade over time. | Yes
(8) Revisability: Messages can be revised before being sent. | Yes
In the 'No Visual Feedback condition', the 'display window' feature was disabled so that the instructor had no direct visual information relating to the follower's position and actions in the environment (see Fig. 1).
The follower's interface displayed a fraction of the overall environment map, showing only the surroundings of the robot's current position (see Fig. 2). The robot (signified by a red circle with a yellow 'face') was operated by the follower using the arrow keys on the keyboard. The dialogue box also displayed a history of the instructor's previous messages to the follower.

Participants and procedure
A total of 64 participants (32 males and 32 females) were recruited from undergraduate and postgraduate students of various departments at a UK university. The participants were randomly allocated to the two roles (instructor or follower) and to each of the Visual Feedback conditions. Previous experience in using computers was necessary, as was familiarity with instant messaging applications. No other specific computer expertise or other skill was required in order to take part in the experiment. As shown in Table 2, pairs were formed with all possible combinations of roles and gender.
Instructors and followers were seated in separate rooms equipped with desktop PCs, on which the respective interfaces were displayed. The participants that were assigned to be followers were fully informed about the experimental setup. The instructors were told that they would interact directly with a robot, which for practical reasons was a computer-based, simulated version of the actual robot. The instructors were given minimal information about the robot. They were informed that the robot had advanced capacity to understand and produce spatial language and learn previous routes. This aimed to reduce the likelihood of instructors inferring during the interaction that the robot was actually a person. The pairs were given no examples of, or instructions about, how to interact with each other. The pairs attempted six tasks presented in the same order; the instructor navigated the follower from the starting point (bottom right of the map) to six designated locations (pub, lab, factory, tube, Tesco, shop). The instructors were free to plan and modify the route as they wished. The destinations were selected to require either incrementally more instructions or the use of previously described routes.

Limitations of the methodology
The methodology of this study involved two people collaborating in a task, one of whom believed that they were interacting with a simulated robot. This section discusses the limitations that arise from this experimental approach.
As explained in Section 3, this setup allowed for the observation of inter-gender interactions while inhibiting social elements that arise in human-human communication and have an adverse, or confounding, effect on task performance and communication. However, in realistic CSCW settings the gender of participants is typically known. As such, masking the gender introduces a degree of artificiality in the study and constitutes a limitation which needs to be addressed in future work. The second limitation of the approach lies in the fact that followers were aware that their partners were people, resulting in an asymmetry of knowledge within the dyad. This is a methodological compromise as it was assumed that if the followers had not been informed of the setup, they would have given away enough cues for the instructors to suspect that they were interacting with another person. In effect, the followers acted as confederates in the study; while a confederate in similar studies is a single, trained individual, followers were also naïve participants, and they were given no guidance on what to do or say in order to ensure that the interaction was natural and spontaneous. A related problem has to do with the 'audience design' phenomenon: people produce language based on a priori assumptions about what their addressees might know (Isaacs and Clark, 1987; Sacks et al., 1974; Schober, 1993, 2009), and these assumptions are often influenced by 'community co-membership' (Clark and Marshall, 1981). Therefore, the question that naturally follows is whether coordination and communication were disrupted by the inability of instructors to form such assumptions. However, even if the model (set of assumptions) regarding the addressee is initially incomplete, the model is constructed dynamically, being updated on a turn-by-turn basis throughout the interaction (Clark, 1996). As such, it is argued that the simulated robot setup may not have had a significant or lasting effect on coordination and grounding processes. For example, research by Levin and colleagues (Levin et al., 2008, 2013) indicated that people are willing to attribute human-like cognitive characteristics to robots, when users are given sufficient time to observe intentional behaviour by the robot.

Data analysis approach
The two strands of data analysis, addressing performance and communication, and their relation to specific hypotheses are explained in this section.

Performance analysis
The analysis of performance targets research hypotheses H G1, H V1, and H GV1. Following HCI practice, it uses measures of effectiveness and efficiency, as also adopted in related research in human communication (e.g., Clark and Krych, 2004) and CSCW (e.g., Gergle et al., 2004; Kraut et al., 2003). The efficiency indicators used were time, and number of instructor/follower turns and words. Effectiveness was assessed through miscommunication rates. The study adapted McRoy's HCI-oriented framework and classified miscommunication into three types: execution errors; non-understandings; and incorrect instructions (McRoy, 1998). Execution errors (instances where the follower deviated from the described route) were recorded by comparing the route executed by the follower (as determined by system logs) with the route described by the instructor. Non-understandings were responses to an instruction that requested clarification or expressed an inability to interpret the instruction. Incorrect instructions were errors by the instructor, for example confusing 'left' for 'right' when providing a route instruction.
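As an illustration of the turn-based and word-based efficiency indicators, the following minimal sketch counts turns and words per speaker for one task and derives the instructor's share of turns. The log format (a list of speaker/utterance pairs), the example dialogue and the function name are assumptions made for this sketch, not the format of the study's system logs.

```python
from collections import Counter

# Hypothetical dialogue log for one task: (speaker, utterance) pairs,
# where "I" is the instructor and "F" is the follower.
dialogue = [
    ("I", "go straight ahead"),
    ("F", "ok"),
    ("I", "turn left at the junction"),
    ("F", "which junction?"),
    ("I", "the one next to the pub"),
]

def efficiency_indicators(turns):
    """Count turns and whitespace-separated words per speaker for one task."""
    turn_counts = Counter()
    word_counts = Counter()
    for speaker, text in turns:
        turn_counts[speaker] += 1
        word_counts[speaker] += len(text.split())
    total_turns = sum(turn_counts.values())
    share = turn_counts["I"] / total_turns if total_turns else 0.0
    return {
        "turns": dict(turn_counts),
        "words": dict(word_counts),
        "instructor_turn_share": share,
    }

result = efficiency_indicators(dialogue)
```

Counting words by splitting on whitespace is a simplification; a different tokenisation would change the word totals but not the turn counts.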
To examine research hypothesis H G2, user perceptions were collected using a short questionnaire. After the completion of each of the six tasks, the instructors were asked to complete the questionnaire, rating their agreement with five declarative statements of opinion. The questionnaire used a Likert scale with seven levels of agreement: strongly disagree; disagree; slightly disagree; neutral; slightly agree; agree; and strongly agree. The items probed five different dimensions of the instructor's experience of their interaction with the follower: perceived task completion (item 1); execution accuracy (item 2); ease of use (item 3); helpfulness (item 4); and overall satisfaction (item 5). The responses were mapped to integer values between one and seven (with seven representing the highest level of agreement). The scores associated with each statement were summed for all six tasks, which resulted in a cumulative score for each statement ranging from 6 to 42. Followers were not asked to complete the questionnaire to ensure that they would be ready to respond as soon as the instructors initiated the next task.
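The scoring procedure can be sketched as follows. The mapping from verbal labels to the integers 1 to 7 and the summation over six tasks follow the description above; the function name, the data layout and the example responses are hypothetical.

```python
# Seven-point agreement scale mapped to integers 1..7, as described above.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "slightly disagree": 3,
    "neutral": 4,
    "slightly agree": 5,
    "agree": 6,
    "strongly agree": 7,
}

def summed_scores(responses_per_task):
    """Sum each statement's score over all tasks.

    responses_per_task: one list per task, each holding the verbal
    responses to the statements in statement order.
    """
    n_items = len(responses_per_task[0])
    totals = [0] * n_items
    for task in responses_per_task:
        for i, answer in enumerate(task):
            totals[i] += LIKERT[answer]
    return totals

# Hypothetical instructor who gives the same five answers after each of six tasks.
one_task = ["agree", "slightly agree", "neutral", "strongly agree", "strongly disagree"]
totals = summed_scores([one_task] * 6)
```

With six tasks, the lowest possible total for a statement is 6 (all 'strongly disagree') and the highest is 42 (all 'strongly agree'), matching the range stated above.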

Communication analysis
Evaluating and quantifying how people use language to collaborate is a non-trivial task. The approach followed existing annotation and analysis schemes to categorise the utterances of the participants and the components of these utterances, identify the degrees of alignment, and measure their associated frequencies to be statistically tested. In particular, in order to determine communication strategies and collaborative behaviour and address hypotheses H G3, H G4, H G5, H V2, H V3, and H GV2, two types of fine-grained analysis were performed: dialogue act analysis (following the HCRC Dialogue Structure Coding Manual by Carletta et al. (1996)); and (two forms of) component analysis of the utterances (following the definitions and frameworks developed by Denis (1997), Tenbrink and Hui (2007) and Vanetti and Allen (1988)). These analysis schemes are outlined in the following subsections.
3.2.2.1. Dialogue acts analysis: queries, acknowledgements and clarifications. The utterances were annotated using a simplified version of the HCRC Map Task move coding scheme (HCRC Dialogue Structure Coding Manual by Carletta et al. (1996)). The scheme divides the participants' dialogue moves into initiations (expecting a response) or responses that are then classified as Instruct, Query, Clarify, Acknowledge, Reply, Explain, Check, etc. Following the scheme's definition, the Instruct dialogue act refers to a command/instruction issued to the follower. The Query, Clarify and Acknowledge dialogue acts are directly related to hypotheses H G3 and H V2, and are explained below through examples from the corpus.
The Query dialogue act covers all questions addressed to the partner which are not clarification requests, as in the instructor's turn in example 1, below. 'I' and 'F' stand for instructor and follower, respectively.
Example 1 [NMC3_T56-57]
I: Where are you?
F: I am standing facing the Post Office, with the car park on my left.
The Acknowledge dialogue act is a minimal sign of positive feedback. It demonstrates that a previous utterance or action was received, understood or accepted. Acknowledgements can be formulated simply, as 'OK', 'Yes' or as the follower's response in example 2, below.
Example 2 [NMC7_T35-36]
I: go to lab and walk ahead, when you see two roads take left and then keep walking for a while and take second left
F: second left taken.
The Clarify dialogue act is a reply to a question that contributes information over and above what was strictly asked. The difference between 'Reply' and 'Clarify' is that 'Reply' contains only the information requested. This difference is illustrated in example 3, below; the first response by the follower is classified as a 'Reply' and the second is a 'Clarify' dialogue act. The follower's response in example 1 above was also a 'Clarify'.
Example 3 [NMF2_F62-64]
I: Is the gym on your left or right?
F: The gym is on the right.
F: Brunel is on the left.
As can be seen from the definitions and examples, the dialogue acts Acknowledge and Clarify have high intrinsic value for task-oriented interactions, by grounding or supplying extra information, and were used in the analysis.
3.2.2.2. Component analysis: language specificity. The component analysis relied on established frameworks within the domain of spatial language that define the best practices for producing route descriptions and classify 'good' versus 'poor' descriptions (Vanetti and Allen, 1988; Allen, 2000; Denis, 1997; Lovelace et al., 1999; Tenbrink and Hui, 2007). An important concept within these schemes is specificity (also termed granularity), which refers to the level of specification used by a person to describe a particular situation, event or object (Tenbrink et al., 2010; Klippel et al., 2009). For example, descriptions with low specificity are simple in form, giving turn-by-turn directions to the destination, and only contain spatial directions. Instructions of high specificity were defined as including landmark references, which could also be anchored spatially. In particular, the schemes agree that 'good' descriptions contain: (i) references to 3D landmarks (locations like buildings or bridges) that provide cues for (re-)orientation; and (ii) delimiters that provide distinguishing information about actions and environmental features. Delimiters can be distance designations that specify the boundary of action ('move forward until you see a car park', 'from the bridge continue straight to the university', etc.) and relational terms that specify relationships between environmental features or frame of reference ('the lab will be on your right', 'go to shop next to the café', etc.). In brief, the analysis considers the relative frequencies of these two types of component that are known to contribute to the information value of utterances.
3.2.2.3. Component analysis: linguistic alignment. The study concentrated on the phenomenon of lexical alignment, which essentially manifests as the re-use of vocabulary between partners. First, alignment was measured 'locally'; that is, by looking at the adjacency pairs in the dialogue and comparing the two utterances (termed 'input/output matching' in the Interactive Alignment Model by Pickering and Garrod (2004)). An adjacency pair is a sequence of two related utterances by two different speakers, such that the second utterance is a response to the first (Levinson, 1983, p. 303). An utterance was a 'match' (and given a score of 1 for each repeated content element), if it contained the same content word as the utterance to which it was a response. If no content word matched, a score of 0 was given and the response was noted as a 'mismatch'. The sum of matching content words was calculated. The annotation of alignment at the adjacency pair level is exemplified through a dialogue excerpt, shown in Table 3. In this example, the instructor first produced an instruction which does not match the previous utterance by his/her partner (a mismatch and so a 0 score was given). This was immediately reformulated to repeat the exact expression used by the follower, 'at y-shaped junction', which is marked as containing '2' matches.
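The adjacency-pair scoring can be approximated with a short sketch. It assumes that a 'content word' is any token not in a small function-word list and that each distinct shared content word scores 1; both are assumptions of this sketch, since the annotation in the study was carried out manually. Apart from the phrase 'at y-shaped junction', the utterances below are invented for illustration.

```python
# Small illustrative function-word list; the study's criteria for what
# counts as a content word are not given here, so this set is an assumption.
FUNCTION_WORDS = {"a", "an", "the", "at", "to", "of", "on", "in", "and",
                  "is", "are", "am", "i", "you", "your", "my", "it"}

def content_words(utterance):
    """Lower-case the tokens, strip simple punctuation, drop function words."""
    tokens = [t.strip(".,?!'\"").lower() for t in utterance.split()]
    return [t for t in tokens if t and t not in FUNCTION_WORDS]

def match_score(previous, response):
    """Number of distinct content words in `response` that also occur in `previous`."""
    shared = set(content_words(response)) & set(content_words(previous))
    return len(shared)

# Hypothetical excerpt: a follower turn and two instructor responses to it.
follower = "I am at y-shaped junction"
first_try = "turn left where the road splits"
reformulated = "turn left at y-shaped junction"

scores = [match_score(follower, first_try), match_score(follower, reformulated)]
```

Under these assumptions the first instruction is a mismatch (score 0) and the reformulation scores 2, because 'y-shaped' and 'junction' are repeated while 'at' is treated as a function word.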
Alignment is not only a local input/output matching mechanism; it also develops over the course of a dialogue, such that interlocutors rely on a working vocabulary for a dialogue's duration. This has been described in terms of dialogue 'routines' or 'pacts' by the Interactive Alignment Model (Pickering and Garrod, 2004) and the Collaborative Model (Clark, 1996). Therefore, in addition to capturing alignment as matches at the adjacency pair level, it was necessary to measure alignment as lexical innovation. Following the approach introduced by Mills (2007), lexical innovation was determined by comparing every constituent word in an utterance to all previous words used in the dialogue. For example, the utterance 'go into the bendy road' leads to a backwards search in the dialogue for any previous occurrence of 'go', adding '1' to the innovation score if not found, and '0' if found, before moving on to the next word.
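The backwards search can be sketched as a running set of words seen so far. The sketch below is an illustration of the idea rather than the procedure of Mills (2007): it assumes whitespace tokenisation, case-insensitive comparison, and that a word repeated within the same utterance counts as new only once. The three-utterance dialogue is hypothetical, apart from the example utterance quoted above.

```python
def innovation_scores(dialogue):
    """Per-utterance count of words not used earlier in the dialogue.

    dialogue: list of utterance strings in chronological order.
    """
    seen = set()
    scores = []
    for utterance in dialogue:
        new_words = 0
        for word in utterance.lower().split():
            if word not in seen:
                new_words += 1   # first occurrence in the dialogue
                seen.add(word)
        scores.append(new_words)
    return scores

dialogue = ["go into the road", "which road", "go into the bendy road"]
scores = innovation_scores(dialogue)
```

In this toy dialogue the last utterance introduces only one new word ('bendy'), because 'go', 'into', 'the' and 'road' have all occurred before.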

Experimental design and statistical analysis
A between-subjects factorial design was used that investigated the simple and interaction effects of Instructor Gender, Follower Gender, and Visual Feedback on the performance-related and communication-related dependent variables. It should be noted that the 'task' served as the basis of measurement (each pair completed six tasks). Table 4 summarises the dependent variables, the factors and the research hypotheses targeted. For categorical data, relationships were investigated through chi-square tests of independence. When appropriate, the chi-square analysis followed a 'top-down' approach in order to identify the locus of a significant association, and separate chi-square tests were performed on the Visual Feedback and No Visual Feedback data.

Effects of gender and visual feedback on performance
To test hypotheses H G1, H V1, and H GV1, which targeted performance, the analysis considered the frequency of words and turns required to complete a task, and the frequency of miscommunication.
The analysis revealed a main effect of Visual Feedback on number of words per task (F(1,24) = 6.904, p = 0.
The analysis of the proportion of turns by the instructor contributed interesting results. A significant main effect of Visual Feedback initially indicated that instructors in the Visual Feedback condition produced 57% of all turns, which dropped by 6% in the No Visual Feedback condition (F(1,23) = 5.5, p = 0.028, η² = 0.131, d = 0.84). This result was refined, as a significant interaction effect of Visual Feedback by Instructor Gender was found (F(1,23) = 5.548, p = 0.027, η² = 0.137). As illustrated in Fig. 3, female instructors dominated the conversational floor in the Visual Feedback condition, having produced over 61.9% of turns. However, when visual feedback was disabled, female instructors' turn-possession was balanced, dropping to 50.5%. Comparisons between the groups verified the difference between female instructors in the Visual Feedback condition and No Visual Feedback condition (t(14) = 3.211, p = 0.006, d = 1.6). In contrast, the turn ratio of male instructors remained consistent across conditions.
As detailed in Section 3.2.1, miscommunication caused by the instructor was estimated by the number of incorrect instructions. Follower-attributed miscommunication encompasses two measures: number of (i) execution errors; and (ii) follower turns that were tagged as expressing non-understanding.
The three-way ANOVA revealed a strong significant main effect of Visual Feedback on the number of incorrect instructions per task. Surprisingly, the number of incorrect instructions per task was close to zero in the No Visual Feedback condition and high in the condition in which the instructor could confirm at all times the actions and understanding of the follower (F(1,23) = 13.784, p = 0.001, η² = 0.304, d = 1.35). The Instructor Gender × Follower Gender interaction was found to be significant (F(1,23) = 4.797, p = 0.039, η² = 0.106), indicating that instructors in mixed-gender pairs (FM and MF) tended to be less accurate compared to instructors speaking to followers of the same gender (FF and MM). The contrast between same-gender and mixed-gender pairs also confirmed the finding (t(29) = −2.251, p = 0.032, d = 0.81).
Similarly, the ANOVA conducted on number of non-understandings yielded a significant main effect of Visual Feedback. Interestingly, when participants shared visual information, followers produced a greater number of non-understandings (F(1,24) = 4.324, p = 0.048, η² = 0.134, d = 0.76). Finally, for execution errors as the dependent variable, no differences were found among the groups. The results are summarised in Fig. 4, which shows the distributions of incorrect instructions, non-understandings and execution errors across the two conditions.
These results address the hypotheses that relate to the effects of gender and visual feedback on performance. Fewer words were needed to complete tasks with visual feedback. However, there appears to be a trade-off with accuracy, such that miscommunication was higher when visual feedback was available. Average completion times were similar across conditions, possibly because Visual Feedback pairs expended any time advantage to resolve miscommunication. As such, the results of this analysis do not fully support hypothesis H V1.
No interaction effects were observed, so hypothesis H GV1 cannot be confirmed. However, the difference in turn ratios suggests that females are more sensitive to changes in interaction condition, providing initial support to hypothesis H GV2. Same-gender pairs also appear to be more accurate in terms of instructions, which alone may not suffice to validate hypothesis H G1.

User perceptions
A mixed ANOVA design was employed to explore the effect of gender on user perceptions (hypothesis H G2). The within-subjects factor was Statement, with five levels corresponding to the statements in the questionnaire. The between-subjects factors were Visual Feedback, Instructor Gender and Follower Gender. The analysis found a significant interaction effect of Instructor Gender and Statement (F(3.626, 79.768) = 2.750, p = 0.038, η² = 0.084). In particular, the results indicated that male instructors perceived higher task success than females (item 1). System accuracy was rated more favourably by female instructors (item 3). User satisfaction (item 5) was similar for both genders. System ease of use (item 2) and helpfulness (item 4) were also not significantly different. The mean summed scores for each question are shown in Fig. 5. This result appears to confirm hypothesis H G2, which predicted that gender mediates user perceptions of task success.

Effects of gender and visual feedback on communication
In addition to their effects on task performance and perceptions, gender and visual feedback were expected to influence how instructors and followers communicated about the task. To determine communication strategies, statistical analysis was performed on the frequencies of dialogue acts and utterance components associated with higher information value and specificity, and on the rate of vocabulary repetition and re-use.

Fig. 5. Mean summed scores for each statement for female and male users. The statements were the following: 1: I did well in completing the task; 2: The system was easy to use; 3: The system was accurate; 4: The system was helpful; 5: I am generally satisfied with this interaction. Gender differences were confirmed for items 1 and 3.

Please cite this article as: Koulouri, T., et al., The influence of visual feedback and gender dynamics on performance, perception and communication strategies in CSCW. International Journal of Human-Computer Studies (2016), http://dx.doi.org/10.1016/j.ijhcs.2016.09.003

Queries, acknowledgements, and clarifications
The analysis presented in this section considers the frequencies of certain dialogue acts: the number of queries (questions), acknowledgements (positive feedback to show that the utterance to which it responds has been understood and accepted) and clarifications (responses to questions that give information over and beyond what was asked) issued by the instructor and follower. The frequencies of these dialogue acts were used to address hypotheses H G3, H V2, and H GV2.
The three-way ANOVA performed on the number of instructor queries yielded a significant main effect of the Visual Feedback factor (F(1,22) = 14.710, p = 0.001, η² = 0.251, d = 1.2). In particular, the instructor queries showed a dramatic increase in the No Visual Feedback condition. The analysis also revealed an interaction effect between Visual Feedback and Instructor Gender (F(1,22) = 7.247, p = 0.013, η² = 0.124). T-tests and inspection of the error bar charts confirmed that the greatest number of queries was given by female instructors in the No Visual Feedback condition.
Finally, a significant three-way interaction of Visual Feedback by Instructor Gender by Follower Gender was detected (F(1,22) = 4.203, p = 0.05, η² = 0.072). The presence of the three-way interaction refined the result and indicated that, although in the Visual Feedback condition female instructors paired with female followers rarely asked questions (M = 0.25, SD = 0.29), when visual information was not shared, the number of their queries 'exploded', increasing by 2.36 standard deviations (M = 3.45, SD = 1.89). The effect is illustrated in Fig. 6.
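A difference expressed in standard deviations can be obtained as a standardised mean difference. The sketch below uses the means and SDs reported for FF pairs and assumes a pooled standard deviation for two groups of equal size; the text does not state which formula was used, so this is one plausible reading rather than the authors' confirmed calculation.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardised mean difference using a pooled SD for equal-sized groups."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd

# Instructor queries per task for FF pairs:
# Visual Feedback (M = 0.25, SD = 0.29) vs No Visual Feedback (M = 3.45, SD = 1.89).
d = cohens_d(mean_a=0.25, sd_a=0.29, mean_b=3.45, sd_b=1.89)
print(round(d, 2))
```

Under this assumption the result is roughly 2.37, close to the 2.36 standard deviations reported; the small gap may reflect rounding of the published means and SDs.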
The analysis looked at the other side of the communication, the number of follower queries per task, and also revealed a main effect of Visual Feedback (F(1,23) = 11.014, p = 0.003, η² = 0.274, d = 1.17), but inversely: the followers issued a larger number of queries when their partners were able to monitor their actions.
The analysis of acknowledgements per task revealed an analogous pattern of significant effects: in the absence of shared workspace, participants produced a larger number of acknowledgements to signal understanding and acceptance of previous statements (F(1,22) = 4.459, p = 0.046, η² = 0.102, d = 0.74). This effect was overshadowed by a significant interaction effect of Visual Feedback by Instructor Gender (F(1,22) = 6.786, p = 0.016, η² = 0.155). Inspection of the error bar charts and t-tests showed that pairs with female instructors in the No Visual Feedback condition provided a significantly higher number of acknowledgements compared to the other groups.
A conclusive result was reached through the second-order interaction effect of Visual Feedback by Instructor Gender by Follower Gender (F(1,22) = 4.195, p = 0.05, η² = 0.096, d = 2.23). In the Visual Feedback condition, FF pairs exchanged very few acknowledgements (M = 1.625, SD = 1.5). In contrast, in the No Visual Feedback condition, the number of acknowledgements for FF pairs quadrupled (M = 6.725, SD = 2.86), which translates to a difference of 2.23 standard deviations. Fig. 6 illustrates the result by showing the interaction of Instructor Gender by Follower Gender for each level of Visual Feedback.
The analysis of the number of acknowledgements by the follower also showed that when visual information is not shared, followers more frequently provide evidence of positive understanding (F(1,23) = 9.629, p = 0.005, η² = 0.22, d = 1.04).
The analysis of dialogue acts investigated the number of clarifications per task provided by the pairs. Inspection of the dialogue data showed that clarifications were provided exclusively by followers. There was a significant effect of Visual Feedback (F(1,24) = 6.405, p = 0.018, η² = 0.173, d = 0.89). In particular, followers gave a higher number of replies that were richer in information, in the absence of shared visual space.
Taken together, the results show that gender and visual feedback change the communication strategies used by participants to complete the task (supporting hypotheses H G3 and H V2, respectively). Most importantly, it was found that all-female pairs adapt drastically when the CSCW medium excludes visual feedback, providing initial support to hypothesis H GV2.

Language specificity
The level of specificity was determined by the presence and frequency of landmarks and delimiters. This analysis served to test hypotheses H G4, H V3, and H GV2.
Mirroring their partners' behaviour, followers in the No Visual Feedback condition were also found to use more landmark references (contained in 33% of follower utterances in the Visual Feedback condition compared to 53% of utterances in the No Visual Feedback condition) (F(1,24) = 8.818, p = 0.007, η² = 0.247, d = 0.97).
The Instructor and Follower Gender factors were not significant.
In relation to the frequency of delimiters in instructor utterances, the ANOVA revealed a significant main effect of Visual Feedback for boundary/distance designations. These delimiters, which specify the boundary of the route, were scarcely used in the Visual Feedback condition (F(1,23) = 4.539, p = 0.044, η² = 0.136, d = 0.77).
Relational terms specify the relation between speaker and an environmental feature ('on your left') or between different environmental features. Followers were found to incorporate a larger number of these terms in their utterances in the No Visual Feedback condition (F(1,23) = 5.332, p = 0.03, η² = 0.182, d = 0.90); that is, when not being monitored, followers tended to be explicit about the frame of reference. There was no significant effect of Instructor Gender on the frequency of relational terms.
In addition, the analysis concentrated on the frequency of simple or compound instructions. When instructions consisted of one or two components (a verb, or a verb and a direction), they were considered simple (for instance, 'walk straight ahead', 'move forward', 'turn right'). Instructions with more than two components were compound (for instance, 'walk straight ahead until you reach the road junction', which has four components). As expected, the frequency of simple instructions that contained only the verb and the direction of movement was lower in the No Visual Feedback condition (F(1,24) = 4.769, p = 0.039, η² = 0.144, d = 0.77).
A three-way interaction effect of Visual Feedback by Instructor Gender by Follower Gender was also detected (F(1,24) = 4.381, p = 0.047, η² = 0.126). The difference was statistically significant only for FF pairs between conditions, showing that simple instructions by female instructors paired with females dramatically decreased in the No Visual Feedback condition. The interaction is plotted for each level of Visual Feedback in Fig. 8.
To sum up, these results indicate that when visual feedback is available, the pairs are able to complete the task with lower specificity, which confirms hypothesis H V3. A complex gender composition effect was revealed that indicates female instructors, and all-female pairs, adapted their behaviour in order to ensure communication success in less optimal conditions, supporting hypotheses H G4 and H GV2.

Linguistic alignment
To address hypothesis H G5 , the analysis investigated whether alignment is mediated by gender; that is, whether Instructor Gender and Follower Gender have an effect on alignment as: (i) input/output matching and (ii) lexical innovation.
First, the ANOVA failed to produce significant effects of gender on number of 'matches'.While it is out of the scope of this analysis, it is relevant to note that the number of matches was significantly higher in the No Visual Feedback condition (M ¼4.333, SD¼ 1.784) compared to the Visual Feedback condition (M ¼2.14, SD¼ 1.953) (F (1,22) ¼9.354, p ¼0.006, η 2 ¼ 0.263, d ¼1.17).
Second, a chi-square analysis was performed to clarify the link between gender and lexical innovation.It was also necessary to address whether strategies and behaviours change when communication problems, like errors and non-understandings, occur.As such, the variables tested were the gender of the instructor, occurrence of miscommunication, and the number of new words present in the next turn.In particular, this analysis considered the number of novel words contained in an utterance immediately after a (i) non-problematic and (ii) problematic utterance (that is, a dialogue utterance marked as a non-understanding, an incorrect instruction or in which an execution error occurred; a combined measure was used since the nature and cause of miscommunication was not the focus of this analysis).All utterances were grouped based on whether or not they contained novel words, and whether or not they followed a problematic utterance.The relationship between the variables was only found to be significant in the No Visual Feedback condition.Thus, the No Visual Feedback data were used in this analysis to explore the gender factor.
A significant relationship was found between gender and the number of new words category after problematic and non-problematic utterances. After problem-free communication, both female and male instructors tended to re-use old vocabulary: 70% of utterances (for female instructors) and 64% (for males) comprised exclusively previously-used vocabulary. However, there was a difference between male and female instructors. Pearson's chi-square on the relationship yielded χ² = 8.035, confirmed by the linear relationship of 8.031 (df = 1, p = 0.005, φ = 0.063). The odds of adhering to the old vocabulary were 1.3 times higher for female instructors than for male instructors after a non-problematic utterance.
However, after miscommunication the analysis revealed that female and male instructors employed contrasting approaches. Male instructors responded by introducing new words (66% of the utterances). In contrast, female instructors appeared to continue adhering to the old vocabulary. The linear and Pearson's chi-square of 4.779 and 4.723, respectively, supported the existence of the relationship, statistically significant at p = 0.029 (the phi coefficient was 0.233, explaining 5.42% of the variance). The odds ratio was 2.6, indicating that the odds of female instructors trying new words after miscommunication were 2.6 times lower than those of males. In brief, the analysis suggests that females preferred to re-use vocabulary, even after miscommunication, whereas males were more inclined to introduce new words. The results are summarised in the graph in Fig. 9.
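The association measures used in this analysis (odds ratio, Pearson's chi-square and the phi coefficient) can be sketched for a 2x2 table as follows. The first call uses the shares of old-vocabulary utterances reported above (70% and 64%). The cell counts in the second call are hypothetical, since the underlying frequencies are not given in the text, and the function names are illustrative only.

```python
import math

def odds_ratio_from_proportions(p_a, p_b):
    """Odds of an outcome in group A divided by the odds in group B."""
    return (p_a / (1 - p_a)) / (p_b / (1 - p_b))

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and phi for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)  # phi is the square root of chi-square over N
    return chi2, phi

# Reported shares of utterances re-using old vocabulary after problem-free
# turns: 70% (female instructors) and 64% (male instructors).
print(round(odds_ratio_from_proportions(0.70, 0.64), 1))  # 1.3

# Hypothetical counts: rows = female/male instructor, columns = old/new words.
chi2, phi = chi_square_2x2(30, 14, 15, 28)
print(round(chi2, 3), round(phi, 3))
```

Squaring phi gives the proportion of variance accounted for, which is how a phi coefficient can be expressed as a percentage of explained variance.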
Pearson's chi-square analysis on similar experimental dialogue data appears to be common practice in related literature. However, strictly speaking, the use of Pearson's chi-square is incorrect, since the utterances were not independent from each other, being produced by the same 32 pairs. As such, this analysis also included the Cochran-Mantel-Haenszel (CMH) test, which has been proposed as an alternative method that can strengthen the reliability of the chi-square (Cochran, 1954). This test allows control for one variable (in this case, pair), while comparing the levels of the other two variables (in this case (i) miscommunication/no miscommunication and (ii) 1 or more new words/no new words). The results of the CMH test confirmed the chi-square interpretation. It was verified that male instructors tend to produce considerably more utterances containing new words after miscommunication (χ² = 15.203, df = 1, p = 0.001). The CMH test did not find a significant association between miscommunication and new words for pairs with female instructors, indicating that females tend to use previous vocabulary in both situations of problematic and problem-free communication.
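To make the stratified logic of the CMH test concrete, the following Python sketch computes the statistic over a set of 2x2 tables, one per pair. Each table is assumed to cross-tabulate miscommunication (rows) against the presence of new words (columns). The per-pair counts are hypothetical, the continuity correction is optional because the text does not state whether one was applied, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math

def cmh_test(strata, correction=False):
    """Cochran-Mantel-Haenszel test for a list of 2x2 tables [[a, b], [c, d]].

    Returns the test statistic and its p-value (chi-square, df = 1).
    """
    sum_diff = 0.0  # sum over strata of observed minus expected count in cell a
    sum_var = 0.0   # sum over strata of the hypergeometric variance of cell a
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than two observations adds nothing
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        sum_diff += a - row1 * col1 / n
        sum_var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if sum_var == 0:
        raise ValueError("no stratum contributes any variance")
    delta = abs(sum_diff)
    if correction:
        delta = max(0.0, delta - 0.5)
    stat = delta ** 2 / sum_var
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square survival function, df = 1
    return stat, p_value

# Hypothetical counts for three pairs.
# Rows: after miscommunication / after problem-free turn.
# Columns: one or more new words / no new words.
pairs = [
    [[6, 2], [10, 20]],
    [[4, 1], [8, 15]],
    [[5, 3], [9, 18]],
]
stat, p_value = cmh_test(pairs)
print(round(stat, 3), round(p_value, 4))
```

Because each pair contributes its own expected count and variance, differences between pairs in overall verbosity or error rate do not inflate the association, which is the motivation for the test given above.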
Based on this analysis, there is insufficient evidence to support the suggestion that the tendency to converge linguistically in task-oriented dialogues depends on one's gender; as such, hypothesis H G5 is not validated. However, when communication problems occurred, gender differences were observed, with females tending to resort to previously used vocabulary and males introducing different words.

Summary of findings
In terms of performance, the gender factor hypothesis was not supported. Consistent with previous research, a broad benefit of visual feedback in terms of language economy and efficiency was found. However, there was an accuracy trade-off, as indicated by the higher frequency of incorrect instructions and non-understandings. While there was no difference in actual performance, females reported lower perceived task success than males. In terms of communication strategies, males were more likely to linguistically diverge and introduce novel words, especially in the event of communication problems. In the absence of visual feedback, females adapted their communication strategies significantly and effectively, increasing the quality and specificity of their verbal contributions, requesting and negotiating information, while the behaviour of males did not vary across conditions. Table 5 lists the 10 research hypotheses tested in the study and summarises the respective outcomes.

Discussion
This section discusses the practical implications of the findings and uses them to frame specific design recommendations for CSCW systems.

The role of visual feedback in CSCW
Awareness of how visual feedback affects collaboration and communication patterns can enable interface designers to take full advantage of its benefits and avoid related pitfalls. The analysis confirmed that visual feedback has a significant effect on CSCW, and revealed positive as well as negative dimensions of the effect, which will be discussed in this section.

Visual feedback leads to more efficient and economical CSCW interactions
The study confirmed the benefits of visually-supported collaborations by showing that sharing the workspace facilitates situation awareness and grounding. Visual feedback enables users of CSCW systems to complete their tasks more efficiently, with shorter interactions and simpler language. This phenomenon is illustrated by juxtaposing two dialogue excerpts from the Visual Feedback and No Visual Feedback conditions (shown in Table 6). It was also corroborated that when visual information is shared, much of the communication is carried out through physical actions, which replace verbal turns. As the example from the Visual Feedback condition in Table 6 shows, verbal communication by the follower appears redundant and was ignored by the instructor.
The results of this study confirmed that, without visual feedback, instructor queries increased, and followers needed to provide elaborate responses and descriptions, such that the responsibility for task and understanding maintenance was equally distributed. CSCW frequently involves remote training or help-giving dialogues between novices and experts (for example, Fussell et al., 2000; Karsenty, 1999; Dix, 1994; Crabtree et al., 2006; Twidale and Ruhleder, 2004) and, by definition, novices are unable to provide equal or precise contributions. For such applications, in which there is an expected prior asymmetry in the knowledge and proficiency states of the collaborators, the ability to ground information should be reinforced. Developers should consider implementing visual functionality to support grounding that does not rely on language, such as remote pointers, highlighting tools, and other methods that ensure joint attention to objects (Gergle, 2006; Fussell et al., 2003a, 2003b). Through such mechanisms, the expert can easily refer, and draw attention, to salient elements and details of the context when working with the novice who may be unable to comprehend or utilise domain-specific language. The results of this study also showed that when instructors can view the partner's workspace, the partner's actions function as verbal statements. As such, the novice may be able to use tools such as pointer trajectories (Gutwin and Penner, 2002; Fraser et al., 2007), which show the movement of the cursor, as a way to provide nonverbal feedback. Along the same lines, visually enriched interfaces can support non-native speakers of English, for whom linguistic aids may present an additional hurdle (as exemplified by a navigation study by Veinott et al. (1999)).
Clark and Brennan's (1991) framework supports the view that different mediums impose different costs on how people ground information. Speech is ephemeral, so people engage in a frequent grounding process of small chunks of language. In contrast, typed communication involves higher production costs, so interlocutors ground less frequently and through longer utterances. The results of this study also suggested that visual feedback improves communication efficiency. Since CSCW often relies on synchronous text-based communication (instant messaging), visual information is argued to be a primary requirement for such systems because it can alleviate some of the higher costs of grounding.

Table 5. List of research hypotheses and respective outcomes.
Hypotheses | Results and commentary
H G1: Gender has an effect on performance. | Not supported; while same-gender pairs were more accurate when giving instructions, no differences were found for execution errors and non-understandings.
H G2: Gender has an effect on user perceptions. | Confirmed; females reported lower perceived task success than males.
H G3: Gender has an effect on communication structure, in terms of frequency of queries, acknowledgements, and clarifications. | Confirmed; female instructors in all-female pairs in the No Visual Feedback condition provided the largest number of queries and acknowledgements.
H G4: Gender has an effect on communication content, in terms of language specificity. | Confirmed; female instructors in all-female pairs provided utterances with higher specificity (higher frequency of landmark references and compound instructions), but only when visual feedback was withheld.
H G5: Gender has an effect on communication behaviour, in terms of linguistic adaptation (lexical alignment). | Not supported; no differences in 'matches'. However, especially after communication problems, lexical innovation was higher for males.
H V1: Visual feedback benefits performance. | Not supported; while fewer words were required, miscommunication increased when visual feedback was available.
H V2: Visual feedback enables pairs to complete the task using lower frequency of queries, acknowledgements, and clarifications. | Confirmed; instructors issued fewer queries, and followers provided fewer acknowledgements and clarifications, when visual feedback was available.
H V3: Visual feedback will enable pairs to complete the task with lower language specificity. | Confirmed; instructors omitted landmark references and action boundary information and provided simple instructions; followers omitted landmark references and did not state frame of reference in the Visual Feedback condition.
H GV1: Visual feedback moderates the performance of males and females. | Not supported; no interaction effect was found for the performance variables.
H GV2: Visual feedback moderates the communication strategies of males and females. | Confirmed; the interaction effects found for most communication variables suggest that all-female pairs fully adapted their strategies in the absence of visual feedback.

Visual feedback has a negative effect on accuracy
The results of this study also demonstrated the potential pitfalls by revealing a rise in miscommunication (non-understandings and incorrect executions) when partners shared visual information. This empirically supports the argument that visually-enhanced interactions may present a close, but misleading approximation to face-to-face communication, giving rise to misplaced assumptions of continuous joint perspective and common ground, and leading interlocutors to relax their grounding criteria. Therefore, although counter-intuitive, it is argued that when the available evidence of understanding is less solid and reliable (that is, only language, no visual feedback), the criteria to ensure that understanding is being achieved become stricter, forcing interlocutors to be more accurate, persistent and detailed (and, consequently, less efficient in terms of word and turn usage, as also observed). In contrast, visual feedback relaxes the criteria and causes interlocutors to be less precise which, in turn, results in higher miscommunication. This phenomenon is further explained below in light of related literature and examples from the corpus.
The Collaborative Model postulates that interlocutors do not seek perfect and complete mutual understanding, but rather to sufficiently understand each other for current interaction purposes, meaning that grounding criteria are only as precise as they need to be (Brennan, 2005). In Brennan's study, followers collaborating in a spatial task came closer to the target when instructors did not receive visual feedback. Two complete dialogues between pairs from the Visual Feedback and No Visual Feedback conditions from this study, provided in Table 7, illustrate this tendency. The destination in this case was the Lab. The instructor in the No Visual Feedback condition required that the follower not only reached but also went inside the location before asserting that the task was accomplished, whereas the instructor in the Visual Feedback condition provided directions that led the follower about 100 pixels off the target and ended the task (see image in Fig. 10). It is also interesting to note that instructors in the Visual Feedback condition did not usually state that this building was the destination, as in the dialogue example in Table 7. This confirms that visual feedback leads to inflated assumptions of what is mutually known or perceived. Finally, similar to the findings presented in this study, Brennan (2005) also observed that execution error rates were no higher without visual feedback than with it.
People try to complete a task by expending the least effort possible to achieve a satisfactory result; as Carletta and Mellish (1996, p. 71) maintain, "in task-oriented dialogue, this produces a tension between conveying information carefully to the partner and leaving it to be inferred, risking a misunderstanding and the need for recovery." Thus, when the interaction conditions are deemed favourable (as in the case of sharing visual feedback), speakers typically prefer to use expressions that may be more economical, but also increase ambiguity and the risk of incorrect interpretation. The conclusion that can be drawn is that more or stronger evidence does not necessarily lead to more successful interactions, but successful interactions depend on how well people are able to tune their grounding criteria.
This study showed a broad communication efficiency benefit, associated with an accuracy trade-off, and it has discussed practical implications for novice-expert and text-based dialogues. While it may be argued that these observations are linked to the properties of the visual feedback provided or the nature of the

The role of gender in CSCW
The analysis confirmed that the gender of the collaborators has a significant effect on CSCW. While the actual performance of males and females was comparable, there were differences in perceived performance. Differences were also observed in communication strategies, particularly in the absence of visual feedback, and when miscommunication occurred.

No actual performance difference, but lower perceived performance among females
This study supports the theoretical argument presented in Hyde (2005) that the impact of gender can be mitigated in interactive settings. Related empirical evidence can be found in other domains: studies in computer science education have reported that pair programming reduces the gender performance gap between male and female programmers, and failure rates for students of both genders (Berenson et al., 2004; McDowell et al., 2003). The results of the study reported in this paper should be viewed in light of its task domain. Males perform consistently better in navigation tasks, particularly in virtual environments (VEs) (see Coluccia and Louse (2004), and Martens and Antonenko (2012)). In the navigation task employed in this study, the performance of females was comparable to the performance of males. As such, it may be deduced that the interactive CSCW setting enables female users to tackle difficulties associated with the nature of the task. This argument may be relevant for the design of VEs, given that they are widely used as training tools in several professional fields.
While task performance was comparable, a difference in perceptions was observed. It is a recurring research finding that females perceive their ability or performance in computer-based tasks to be lower than it actually is (Busch, 1995, 1996; Hargittai and Shafer, 2006). Female users also tend to attribute problems to their own lack of skill and are less likely to 'blame' the computer system (Beckwith and Burnett, 2004; Boiano et al., 2006), which may also explain the finding of this study that females gave higher ratings to the system than did males. The observation that actual performance may remain unaffected by these negative emotional states does not reduce the urgency of the problem. Poor self-perception leads to disengagement with computer-based activities and unwillingness to adopt technology or more advanced applications (Hartzel, 2003; Beckwith et al., 2006b; Bao et al., 2013; Venkatesh and Morris, 2000), and has also been argued to contribute to the underrepresentation of females in STEM fields (Chipman, 2005; Kinsey et al., 2008; Sáinz and López-Sáez, 2010).
The argument that CSCW provides a general performance advantage through interaction has implications for the adoption of related technologies in education and the workplace. This benefit may also relate to the idea that 'social genders' in CSCW become less relevant. In fact, a preliminary study with university students has suggested that CSCW settings enable more effective, balanced and less 'stereotyped' interactions between females and males than typical face-to-face classroom settings (Tomai et al., 2014). Yet, even if the effect of social biases associated with gender is alleviated, low self-efficacy among females also emerges in the domain of CSCW, which could compromise the acceptance of this technology.

All-female pairs compensate for the lack of visual cues through rich verbal means
As discussed in the previous subsection, the argument that interactive situations moderate gender differences in performance is not new. However, there has been no focused attempt to pinpoint which elements of the communication underlie this effect. The results of this study help to outline these elements and validate the hypothesis by Savicki et al. (2006) and Prinsen et al. (2009) that females will overcome the paucity of CMC settings by deploying linguistic tools. In the present study, all-female pairs managed to compensate for the 'cueless' interaction condition by increasing the specificity of their contributions, and performed the task through elaborate, detailed verbal contributions and grounding information. Female partners working in pairs exhibited strong collaborative and adaptive behaviour, by putting in more communicative effort, successfully reducing uncertainty and attending to their partner, when the interaction conditions were poorer.
A study by Devlin and Bernstein (1995) serves to exemplify how this insight can motivate a design decision; in their study, females performed better, and as well as males, when given the opportunity to complement the visual and map aid configuration with verbal instructions. Similarly, as Hubona and Shirah (2004) maintain, interfaces should be rich in static and dynamic visual cues, but it should be possible to replace or complement some visually-presented information with verbal/textual content. Through such gender-neutral interfaces that offer the possibility of customisable settings (in this case, the provision of verbal aids), the needs and preferences of both genders can be met.
In seeking to understand the notable finding in relation to the performance of all-female pairs, it may be that the nature of the convergence in relation to the dialogue is important. Empirical models of human communication equate successful communication with convergence in the situation models of interlocutors: convergence is progressively reached, as the dialogue unfolds, through alignment (or adaptation) across all linguistic levels (Pickering and Garrod, 2004). If this proposition is considered in conjunction with the finding that males and females have gender-specific discourse styles in CMC (Herring and Stoerger, 2014), it may be argued that female-female pairs, having the 'same' linguistic models to begin with, were able to achieve convergence more quickly and more efficiently than mixed-gender pairs, who may have started with a weaker common ground, within the relatively short time that they had to interact with each other. However, the present study failed to produce consistent evidence in support of this interpretation, and, as such, it remains a conjecture to be targeted and further explored in future research.

Females use conservative strategies, while males engage in explorative behaviour
The study yielded unique findings in relation to what females and males do when faced with communication breakdowns. The analysis on lexical innovation demonstrated that females draw on previously used vocabulary, while males introduce new terms. Even in smooth communication, females are less willing to 'experiment' with novel expressions compared to male users. This may suggest that females are more conservative and males more explorative when handling communication breakdowns. Among many possible explanations, this tendency could relate to gender differences in risk and cost perceptions. In the HCI domain, risk/cost perception is associated with users being less willing to try a useful but unfamiliar feature. Previous research argues that females perceive higher risks when they are involved in decisions or situations (for example, Finucane et al., 2000; Blais and Weber, 2001). As such, it is argued that females will be less likely to explore and experiment with unfamiliar features compared to males. Studies in various application domains, from programming IDEs (Beckwith et al., 2006a; Burnett et al., 2010; Cao et al., 2010) and spreadsheet software tools (Burnett et al., 2011) to web-based databases (Rosson et al., 2008), have confirmed that females are less confident in using novel software features while males typically engage in exploratory behaviour. It is argued that females' tendency to reuse vocabulary and not attempt a new strategy, even when these messages ostensibly failed, forms part of their general fear of 'tinkering' (the fear of trying new features), which may be traced back to females' low confidence and self-efficacy (Beckwith et al., 2006b) (as also discussed in the previous subsection).
Such findings should be considered by interface developers when unfamiliar or new features and strategies have to be adopted in the interaction with a system. In such situations, making use of techniques such as tutorial snippets, examples of what to say/do and short strategy explanations may help some users to feel more comfortable. In a gender-neutral interface, such features should be customisable in order to avoid compromising the experience of a gender group. Certainly, inclusive design does not mean that the experience of male users should be impaired; functionalities should be made optional. Since gender is a stable user profile characteristic, such options can be easily implemented in an adaptive system.
However, it should be emphasised that neither females nor males are homogeneous groups of users exhibiting all the characteristics and preferences that are statistically associated with their gender. It is highly likely that many males are affected by the same interface complexities as females, and many females may enjoy the same software features as males (Beckwith et al., 2006a). This underscores the importance of gender-neutral software that supports all users. An interesting idea, proposed by Ljungblad and Holmquist (2007), is that designs that are informed by the needs and activities of a specific user population may also benefit the wider user population. For instance, verbal aids can provide support to users with a field-dependent cognitive style (Magoulas et al., 2004), as well as prove essential to users with visual impairments. Similarly, as argued in the previous subsection, verbal aids can catalyse the use and adoption of new software features by female users.
Finally, for many application domains, explorative and innovative user behaviour is not desirable. A good example, and one related to this study, is the domain of natural language interfaces, for which innovative and unpredictable user input is the main source of system failures.

Limitations and future work
Reflection on the methodology and results of this research has led to the identification of a number of limitations, which shape directions for future experimental investigations, particularly into the effect of gender dynamics in CSCW.
Section 3 discussed the effect of social aspects of gender, which motivated the decision to mask the gender of participants through a simulated human-robot interaction setup. As mentioned in Section 3.1.4, this experimental manipulation may limit the generalisability of the results because it does not directly map to a realistic CSCW setting, where information about the gender of the collaborators is available. In addition, the experimental manipulation created a knowledge asymmetry between participants, because only followers were aware that they were interacting with a person. In order to address these issues, the study should be replicated in two variations: a condition in which both interlocutors know they are collaborating with a person; and a condition in which interlocutors are told the gender of their partner. The comparative analysis of the results of the present study and the two variations could produce a comprehensive measure of the extent of the social effect of gender.
Second, the questionnaire used in this study was designed to be relatively short and simple in order for it to be completed after each task, and only the instructors' perceptions were captured. Therefore, the resulting observations are incomplete. A more sophisticated questionnaire tool tailored to the CSCW domain should be employed in a continuation study; for example, Convertino et al. (2007) have developed and validated a post-task questionnaire which specifically measures aspects of common ground, awareness, performance, interaction quality and satisfaction. The questionnaire should also target usability and affective factors, such as ease of learning, likeability, cognitive demands and annoyance, in order to provide insight into the role of gender (and gender interactions) in all dimensions of user experience.
The study focused on CSCW between pairs. A continuation of this work would investigate the effects of gender composition in group interactions. Indeed, initial empirical evidence suggests that mixed-gender groups report higher levels of satisfaction, social presence and performance (Wong et al., 2004; Houldsworth and Mathews, 2000; Hamlyn-Harris et al., 2006). Better understanding of the role of gender in group dynamics has important implications for the success of teamwork in organisational settings (Molyneaux et al., 2008).
Several techniques were applied in order to reduce variance and increase internal validity, because of the relatively small sample size of this study (Sauro and Lewis, 2012, p. 121). However, a homogeneous sample consisting of university students may affect the generalisability of the results. As such, a study with users with different demographic profiles needs to be undertaken. The characteristics that are likely to co-vary with gender, and are relevant to CSCW, include culture, age, education, task-related experience and computer expertise.
Please cite this article as: Koulouri, T., et al., The influence of visual feedback and gender dynamics on performance, perception and communication strategies in CSCW. International Journal of Human-Computer Studies (2016), http://dx.doi.org/10.1016/j.ijhcs.2016.09.003
Another limitation of the study relates to the use of typed communication. The use of text-based dialogue enabled the experimental manipulation of masking the gender of participants. While the modality (speech or text) may not affect how spatial language is processed and formed (Tversky and Lee, 1999; Tversky et al., 2009), there are known differences between typed and spoken communication. In particular, typed communication is 'quasi-synchronous' (Garcia and Baker Jacobs, 1999); that is, the recipient sees the message in its entirety the moment his/her partner presses 'enter', whereas in spoken dialogue, interlocutors start formulating their response whilst listening to their partner's utterance. This 'quasi-synchrony' may have also disrupted the sequential cohesion of dialogue, such that the second of two successive turns may not actually be the response to the first one (Herring, 1999). Moreover, as previously discussed, spoken dialogue involves more frequent grounding of shorter utterances. Finally, grounding is performed via auditory and gestural cues, while in text-based communication, mutual understanding is established through more explicit means. As such, further experimentation using speech or multi-modal interaction (for example, combining language-based communication with pointing or drawing) is needed to confirm, and complement, the findings of this study.
The methodology of this study, and the interpretation of the findings with respect to visual feedback, are grounded in previous empirical research. These studies have used a rich variety of collaborative tasks, for example, puzzle solving (Gergle et al., 2004, 2013), repairing (Kraut et al., 2003; Fussell et al., 2000), construction (Fussell et al., 2003a, 2003b; Clark and Krych, 2004) and navigation tasks (Anderson et al., 1991), while also varying the level of task complexity (Ou et al., 2005). The diversification of experimental tasks is necessary, given that different tasks draw on different cognitive abilities and involve different collaboration activities and strategies. Therefore, in order to ensure that the gender effects revealed in this study were not merely a product of the properties and nature of the task, the investigation should be replicated using a variety of tasks drawn from these existing studies.

Conclusion
The study reported in this paper examined the effect of gender on CSCW, and explored whether visual feedback (often present in CSCW) moderates this effect. Two high-level conclusions can be drawn from this study: first, visual feedback results in more efficient but less accurate CSCW interactions; and, second, gender has a clear effect in terms of communication strategies and perceptions, but not in performance outcomes. It is hoped that the empirical contributions of this study will serve to stimulate further research in gender and human factors in CSCW. Unless design decisions are driven by research, systems are destined to include features that may be superfluous or even obstructive to particular groups of users. However, it is crucial that the objective of research in the user gender factor is to inform 'gender-neutral' systems; that is, the focus of research should shift from describing the differences towards describing interface features that are suitable for both genders.

Fig. 1 .
Fig. 1.The interface operated by the instructor in the Visual Feedback condition, with the small window in the upper right corner displaying the robot's current location.This window was absent in the No Visual Feedback condition.

Fig. 4 .
Fig. 4. Distribution of miscommunication in the Visual Feedback and No Visual Feedback conditions.The frequency of non-understandings and incorrect instructions was significantly higher when visual information was shared.

Fig. 3 .
Fig. 3.The turn ratios for female and male instructors ('F' and 'M', respectively) in the Visual Feedback and No Visual Feedback conditions.The first and third error bars do not overlap, indicating significant differences between female instructors in the two conditions.

Fig. 6 .
Fig. 6.Top Graphs: interaction of Instructor Gender and Follower Gender for each level of Visual Feedback.The Y axis represents the means of queries by Female or Male instructors.Bottom Graphs: interaction of Instructor Gender and Follower Gender for each level of Visual Feedback.The Y axis represents the means of acknowledgements given by Female and Male instructors.Pronounced differences are found between FF pairs in the two conditions for instructor queries and acknowledgements.

Fig. 7. References to landmarks in turns by female and male instructors in the Visual Feedback and No Visual Feedback conditions. Pronounced differences were found between female instructors in the two conditions (illustrated by the first and third error bars).

Fig. 8. The proportion of simple and compound instructions by instructors in all pair compositions in the Visual Feedback and No Visual Feedback conditions. Pronounced differences are found between FF pairs in the two conditions.

Fig. 9. Probability of occurrence of new words after problematic and non-problematic utterances for pairs with female and male instructors in the No Visual Feedback condition. Males try new words when miscommunication is detected, while females are more likely to re-use vocabulary.

F: I am in front of the car park
I: take the road on the right
[movement]
I: turn right and walk till the end, along the road you will see a gym on your right
I: stop
F: Yes, gym to my right side
I: move forwards a little bit
[movement]
I: good, keep going straight and you will see a factory on your left
F: am I here yet?
F: Yes, factory to my left side
I: move forwards
[movement]
I: well done, goodbye
I: stop
I: you're at your destination, goodbye

Table 7
Dialogue examples from the Visual Feedback (left-hand side) and No Visual Feedback (right-hand side) conditions.

I: go straight ahead
I: walk straight then turn right
I: turn right
F: now where do I go?
I: now, turn left then right
I: where are you now?
F: I have reached the junction
F: the pub is on my right
F: ok
I: walk straight past the pub and stop at the lab
F: straight ahead or turn left?
F: I am at the lab now
I: keep going straight
I: go into the lab
I: goodbye
F: I am inside the lab now
F: I have reached the junction by the bridge
I: goodbye
F: goodbye
F: goodbye

task, they highlight the necessity to gain better awareness of how visual elements modify behaviours, interact with verbal communication, and integrate with it. For example, it remains unknown whether visual feedback impairs long-term task performance, especially in expert-novice collaborations, as suggested in Yuviler-Gavish et al. (2011), which showed that monitoring inhibits deep learning and exploration. Therefore, further systematic research should focus on identifying the benefits and pitfalls of visual feedback in CSCW.

Fig. 10. The execution of the instructions provided in the dialogues in the table above (Table 7). The thick yellow line represents the path taken by both followers. The red dashed line and blue solid line show the finishing execution of the followers in the Visual Feedback and No Visual Feedback conditions, respectively.

Table 2
Pair compositions and their abbreviations, used in the remainder of the paper.
015, η² = 0.191, d = 0.94). Pairs in the No Visual Feedback condition (Mean = 99.5, SD = 34.57) required a larger number of words to complete each task than the Visual Feedback pairs (Mean = 72.46, SD = 21.27). Both instructors and followers, individually, used a larger number of words under the No Visual Feedback condition. No significant difference in completion time was found.
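
As an illustration of how the reported effect size relates to the group means and standard deviations above, the following minimal sketch computes Cohen's d from those four values. The paper does not state which formula was used; the function name `cohens_d` and the pooled standard deviation with equal group sizes are assumptions made here for illustration only.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using a pooled SD that ASSUMES equal group sizes.

    This is an illustrative assumption; the paper does not report
    which variant of the effect size formula was applied.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Words per task: No Visual Feedback vs Visual Feedback (values from the text).
d = cohens_d(99.5, 34.57, 72.46, 21.27)
print(round(d, 2))
```

Under the equal-group-size assumption, the result rounds to the d value reported in the text, which is in the range conventionally described as a large effect.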

Table 3
Example of lexical alignment annotation at the adjacency pair level.The first instruction is a mismatch, whereas the second instruction is a match and repeats two content words.

Table 4
List of measures, hypotheses and factors within the study.

Table 6
Dialogue examples from the Visual Feedback (left-hand side) and No Visual Feedback (right-hand side) conditions.