Anatomizing Students’ Task Engagement in Pair Work in the Language Classroom

Student engagement in the second language classroom has been the focus of numerous researchers and teachers. Previous studies have shown that there are several dimensions of student engagement, but it is still unclear how they change (or not) over time and consequently how they affect actual task performance. This study investigated the task engagement of language learners engaged in collaborative writing in pairs. Specifically, it focused on the combination of behavioral, cognitive, emotional, and social dimensions of task engagement, and examined which combinations resulted in better task performance. Participants were 60 Japanese university students who worked in pairs on a picture description task. Multiple data sources, such as the number of words/turns/language-related episodes, patterns of dyadic interaction, and self-reported questionnaire results, were utilized to investigate the process of students’ task engagement. The results showed: that the 30 participating pairs fell into three groups showing similar combinations of dimensions; that there was a significant difference in actual engagement between the groups and across time; and that such differences had a significant impact on task performance. Based on the results, pedagogical implications for teachers are discussed concerning the use of pair work in the language classroom.


INTRODUCTION
Students' active participation and involvement in school-related activities and academic tasks have been defined as student engagement (Mercer & Dörnyei, 2020), and student engagement has received considerable attention in second language acquisition (SLA) research (Hiver et al., 2021a; Oga-Baldwin, 2019). Studies demonstrate that student engagement has been linked to active participation, task completion, time on task, persistence, enthusiasm, satisfaction, deeper learning, and improved academic achievement (Christenson et al., 2012; Hiver et al., 2021b).
There are at least four widely accepted dimensions of student engagement: behavioral, cognitive, emotional, and social. Studies have investigated how these dimensions relate to one another (e.g., Baralt et al., 2016; Nakamura et al., 2020; Phung, 2017), but it is still unclear how these dimensions or the relationships between them affect actual task performance. The current study examines the task engagement of second language (L2) learners who engage in collaborative writing in pairs. It focuses on a combination of the behavioral, cognitive, emotional, and social dimensions, and examines which particular combination results in better task performance.
In addition, this study incorporates a temporal aspect into its research design to determine how students' task engagement changes (or not) over time. The few studies that have addressed related issues (e.g., Aubrey et al., 2020; Chen & Yu, 2019; Yashima et al., 2016) have examined peer interaction within different tasks and different lessons. However, little research has been carried out on how students interact with their partners within the same task and how they change (or do not change) their task engagement within a single lesson. By understanding the dynamic, situated, and temporally mediated nature of students' task engagement, teachers will be better equipped to identify strategies to engage all learners and thus to incorporate effective pair work into the L2 classroom.

LITERATURE REVIEW

Student Engagement in the L2 Classroom
There are many definitions of student engagement, but the core meaning of the concept is action. For example, Skinner et al. (2009, p. 225) describe engagement as "energized, directed, and sustained actions," whereas Reeve (2012) defines it as "the extent of a student's active involvement in a learning activity" (p. 150). Student engagement is at times used synonymously with other terms, notably motivation. However, the prevalent understanding across the literature is that the two are different: motivation is an antecedent or precursor of engagement. As Mercer and Dörnyei (2020) explain, "motivation is undoubtedly necessary for 'preparing the deal,' but engagement is indispensable for sealing the deal" (p. 6; italics original).
In addition to the notion of action, Hiver et al. (2021b) identified three other characteristics of engagement: it is highly dependent on context, it has an object, and it is dynamic and malleable. Previous research has been conducted at various levels, such as schools, classrooms, and specific tasks (Fredricks et al., 2004), and it has demonstrated that aspects of engagement vary depending on the context, such as students' sense of belonging in the classroom and their active participation within a specific task. In the view of Complex Dynamic Systems Theory (CDST; Hiver & Al-Hoorie, 2020; Larsen-Freeman & Cameron, 2008), each of these contexts constitutes its own system, while interacting with each other in various ways. Dörnyei (2014) states that if the system under analysis has "(a) two or more elements that are (b) interlinked with each other, and which (c) also change in time" (p. 81), the system will have dynamic characteristics. Therefore, the way in which the entire system changes over time is also an important target of analysis in the current study.

Dimensions of Student Engagement
Although student engagement is a complex and multifaceted construct, earlier studies on task engagement focused on quantitative dimensions (i.e., behavioral engagement). This has been typically operationalized by time spent on the task (Gettinger & Walter, 2012) and words produced or turns exchanged (Bygate & Samuda, 2009; Dörnyei & Kormos, 2000). Studies on cognitive engagement have focused on interactions between learners (i.e., language-related episodes [LREs]), and have demonstrated that learners can use each other's linguistic knowledge to solve language problems and co-construct new knowledge (Dobao, 2012; Kim, 2008). As for emotional engagement, Dörnyei (2002) found that learners with a more positive attitude toward a task displayed more proactive engagement than those with more negative attitudes. This study also demonstrated that correlations between motivation and task engagement were much higher at the dyad level than at the individual level, concluding that students' task engagement was co-constructed with their partners. Phung et al. (2021) investigated the effect of the presence or absence of choice on students' emotional engagement. Using a decision-making task, they compared students who were given a set of prepared choices with those who came up with their own choices, and found that the latter group enjoyed the task more, concentrated more on the task, and had a stronger perception of freedom of expression. Based on the findings, the authors argued that "high engagement in a task has to be characterized by learners' positive affective response or emotional investment as well" (Phung et al., 2021, p. 175).
According to Storch's (2002) study, which identified four patterns of dyadic interaction, the first two patterns (i.e., Collaborative and Expert/Novice) result in more opportunities for transfer, the occurrence of LREs, and the co-construction of knowledge, whereas the other two are less favorable for collaborative activity. Research based on her model (Chen, 2018; Wigglesworth & Storch, 2009) has shown that learners who adopt a collaborative attitude and are willing to co-construct ideas and knowledge tend to achieve better task performance.
Thus far, research has tended to analyze students' task engagement based on any one of the above dimensions, but as Philp and Duchesne (2016) point out, these dimensions overlap, interact, and manifest differently in different contexts. Therefore, in order to get a fuller picture of task engagement, it is necessary to conduct research that takes all dimensions into account and investigates their interdependencies.
A small number of related studies have recently been reported. Baralt et al. (2016) examined the cognitive, affective, and social dimensions of students' engagement during task-based peer interaction. They found that tasks with a higher degree of complexity promoted greater task engagement; also, the degree of engagement was mediated by the mode of interaction, such as face-to-face versus online interaction. Lambert et al. (2017) revealed that tasks based on learner-generated content resulted in greater engagement than those with teacher-generated content, as measured by the amount of task content contributed (cognitive engagement), the amount of time invested (behavioral engagement), the extent to which content was elaborated and negotiated (social engagement), and students' positive responses to tasks (emotional engagement). Phung (2017) investigated factors that contributed to learners' preferences for tasks and whether these preferences had any impact on their engagement. Although tasks with genuine, familiar, and personally relevant topics facilitated greater behavioral, cognitive, and emotional engagement, simple repetition of the task decreased the level of student engagement. Regarding task repetition, Qiu and Lo (2017) reported a similar finding: simply repeating the tasks negatively influenced behavioral and cognitive engagement, although the participants felt more relaxed and confident.
These studies were innovative in their attempt to reveal the full impact of task engagement on students' task performance. However, extant studies have examined the individual effects of each dimension of task engagement; how the mutual interdependencies (or combinations) of dimensions affect actual task performance has not yet been fully investigated. By focusing on this point, this study aims to broaden the scope of research on student engagement.

Temporal Aspect of Student Engagement
Student engagement is also recognized as dynamic, situated, and characterized by temporal and contextual variation. Previous studies that have examined changes in engagement have yielded some evidence about this issue. For example, Chen and Yu (2019) used a case study approach and examined whether students' attitudes, participation, and learning in collaborative writing change throughout repeated engagement with a task. Analyses of multiple data sources (e.g., pair talk, surveys, reflective journals) from two university students across three tasks revealed that students' attitudes did change as they experienced tasks with their partners. The authors pointed out that such changes can be influenced by many factors, including the perceived value of peer assistance and students' beliefs about collaborative tasks and experiences. Aubrey et al. (2020) explored the factors contributing to students' engagement and disengagement during task performance in a language classroom. The results of 10 different speaking tasks over a 10-week period revealed that various factors, at learner-level (e.g., perceptions about language skills), lesson-level (e.g., preparation for the lesson), task-level (e.g., task design), and post-task-level (e.g., evaluation of performance), influenced the students' (dis-)engagement in the task. Yashima et al. (2016) discussed changes in student engagement across time while investigating the factors that determine the level of students' participation in group discussions. Participants were 21 university students, and the analysis was based on the number of turns, talk time, and silent time during the discussion over 15 weeks. Results showed that student engagement in each discussion differed greatly depending on the discussion topic, presence or absence of leadership, and students' motivation. Partly motivated by Yashima et al.'s study, Hiromori et al. (2021) also investigated how students' task engagement and group work dynamics differed between an experimental group that included a leader-role student and a control group that did not. Ninety students participated in the study, and they worked in groups of three on a collaborative writing task. Results revealed that both groups were proactive in the task, with and without a leader. However, there were qualitative differences in the process: groups that included a leader-role student tended to engage in activities relatively smoothly from the start, whereas groups without such a student could take time to come together and collaborate.
Although student engagement usually changes over time, there are cases where the change is observable and situations where it is not. CDST explains this difference with the concept of attractor states, where the system is drawn to the attractor and is in a temporary stable state so that no qualitative changes are visible (Larsen-Freeman & Cameron, 2008). For example, Storch (2002) examined the nature of the interaction between 10 pairs of L2 students across a range of language tasks over a semester. Her results revealed that not only were there four distinct patterns of dyadic interaction (mentioned above) but that these patterns, once established, seemed to be relatively stable across tasks and over time. The process by which the attractor becomes stable with a particular pattern has been conceptualized as self-organization. Through this process of dynamic change, a new pattern is created in the system (Hiver, 2015).
While it is clear from previous studies that student engagement changes depending on the situation, it is less clear which dimensions of student engagement (i.e., behavioral, cognitive, emotional, social) do change. Therefore, the purpose of this study is to clarify how each dimension of learners' task engagement is interrelated, how it changes over time, and how the relationship (i.e., combination) of such dimensions of task engagement affects learners' task performance. The research questions of this study are as follows:
RQ1. How does each dimension of task engagement change over time?
RQ2. How do the dimensions of task engagement relate to each other?
RQ3. How does learners' task performance differ depending on the combination of dimensions of task engagement?

Participants
Participants were 60 university students (34 female and 26 male) learning English as a foreign language (EFL). All students were Japanese and aged 18-20, and they were enrolled in a mandatory low-intermediate language course. They had studied English as a compulsory subject for at least six years in school before entering university. Among them, four students had studied abroad for more than half a year. Their overall level of English proficiency ranged from approximately CEFR A2 to B1 based on the results of the placement test and the teacher's observations.

Task
Students were given a set of four pictures (adopted from Heaton, 1975, p. 30; see Appendix A) and asked to interpret the story depicted in the pictures and describe it in written English. After receiving brief instructions (e.g., not to use a dictionary or any other reference materials), students had 20 minutes to complete the task in class as part of regular course work. Throughout the process, they had to work in pairs and produce one jointly written text. The teacher walked around the classroom to ensure students' fidelity to the task instructions, but did not provide any linguistic help.

Data Collection
Task activities were audio-recorded by the students using their own smartphones. The voice files (i.e., audio recordings) were collected after all pairs finished the task and then transcribed. The resulting transcripts formed an oral interaction corpus of 76,946 words. For the behavioral dimension of task engagement, the number of words produced and turns exchanged were counted. Although the actual activity time was set at 20 minutes, some pairs finished the task earlier, while other pairs did not complete it within the time provided. The average duration of on-task engagement for all 30 pairs was 17.13 minutes. Since all pairs engaged for at least 15 minutes, the data obtained from the first 15 minutes of engagement were analyzed to account for time differences between pairs. LREs were used to measure the cognitive dimension, since they represent students' cognitive engagement with the task (Oga-Baldwin, 2019). Following previous research, four types of LREs were identified: grammar (e.g., verb tense choice, article choice), lexis (e.g., word choice, word definition), mechanics (e.g., spelling, punctuation), and content (e.g., adding or suggesting sentences, asking for ideas or opinions). Three pairs were first randomly selected from among the participants, and two researchers discussed how to categorize and count the LREs. After gaining a common understanding, the researchers coded and categorized LREs from all of the recordings independently. The inter-coder reliability (= [Total number of LREs - Number of disagreements] / Total number of LREs) was initially 79.3%; all disagreements were resolved by discussion.
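The inter-coder reliability formula given above can be sketched in a few lines of Python. This is only an illustration of the stated ratio: the function name and the counts in the usage line are hypothetical, since the study reports the resulting percentage but not the underlying totals.

```python
def percent_agreement(total_lres: int, disagreements: int) -> float:
    """Agreement rate = (total LREs - disagreements) / total LREs."""
    if total_lres <= 0:
        raise ValueError("total_lres must be positive")
    if not 0 <= disagreements <= total_lres:
        raise ValueError("disagreements must lie between 0 and total_lres")
    return (total_lres - disagreements) / total_lres


# Hypothetical counts, chosen only for illustration (not taken from the study).
rate = percent_agreement(total_lres=200, disagreements=40)
print(f"Inter-coder agreement: {rate:.1%}")
```

Simple percent agreement of this kind does not correct for chance agreement, which is one reason the remaining disagreements were resolved through discussion.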
For the emotional dimension, a five-item, multiple-choice questionnaire was developed for this study based on prior research (Dörnyei [2002] and Oga-Baldwin [2019]; see Appendix B). Items assessed participants' attitudes toward the task (e.g., "I found the task interesting." and "I was able to work on the task enthusiastically."). These items were rated using a five-point Likert scale ranging from 1 (Not at all) to 5 (Very well). While ideally the questionnaire could have been administered several times during the task (e.g., at 5, 10, and 15 minutes), considering the burden on the participants, it was administered only once, immediately after the task completion. The Cronbach's coefficient alpha was .87, confirming internal consistency.
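For readers unfamiliar with the reliability coefficient reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores). The function name and the response matrix are hypothetical illustrations; the study's raw questionnaire data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both items and totals.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("each respondent needs the same k >= 2 items")
    item_var_sum = sum(variance(col) for col in zip(*responses))
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical five-item Likert responses (1-5) from four respondents.
example = [
    [4, 5, 4, 4, 5],
    [2, 3, 2, 3, 2],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 2, 3],
]
print(round(cronbach_alpha(example), 2))
```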
For the social dimension, the patterns of dyadic interaction and their salient features were examined. Based on Storch's (2002) model of patterns of interaction, equality (i.e., equal contribution to the task) and mutuality (i.e., reciprocity of turn-taking) were used as references to select representative pairs. Specifically, the smaller the difference in the number of words uttered by the paired learners (equality) and the higher the total number of turns by the two learners (mutuality), the more collaborative the pair was considered to be. By extracting pairs based on the above criteria and examining their transcripts and recordings closely, instances that best represented high and low social engagement were explored.
Finally, students' task performance was assessed by quantitative and qualitative evaluation of their writing products. For the former, the total number of English words written by each pair in the collaborative writing task was counted. For the latter, the writings were scored on a 10-point scale based on four perspectives (content, organization, vocabulary, and grammar). This rubric was developed following previous studies (e.g., Shehadeh, 2011). Each writing product was scored by two researchers independently, and the average score was used as the English writing score for each pair.
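The two reference measures described above (difference in words uttered for equality, total turns for mutuality) could be derived from a transcript roughly as follows. This is a sketch under stated assumptions: the transcript is represented as a hypothetical list of (speaker, utterance) tuples, and words are counted by whitespace splitting, which may differ from the counting procedure actually used in the study.

```python
def pair_indices(turns):
    """Return the word-count difference (equality) and total turns (mutuality).

    turns: ordered list of (speaker, utterance) tuples for one pair.
    Word counting by whitespace is an assumption made for illustration.
    """
    words = {}
    for speaker, utterance in turns:
        words[speaker] = words.get(speaker, 0) + len(utterance.split())
    if len(words) != 2:
        raise ValueError("expected exactly two speakers in a pair")
    first, second = words.values()
    return {"word_difference": abs(first - second), "total_turns": len(turns)}


# Hypothetical three-turn excerpt (speaker labels are placeholders).
excerpt = [
    ("A", "let us start"),
    ("B", "okay"),
    ("A", "there is a family"),
]
print(pair_indices(excerpt))
```

A smaller word difference combined with a larger turn total would point toward the more collaborative pairs under the criteria described above.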

Data Analysis
For data analysis, first, the five-minute values of each dimension of task engagement were calculated (descriptive statistics), and the relationship between the dimensions at each time point was examined (correlations). Next, cluster analysis (Ward's method with squared Euclidean distance technique) was employed to profile participants that exhibit the characteristics of similar combinations based on the score of task engagement. Each pair's scores for behavioral (number of words, number of turns), cognitive (number of LREs), and emotional engagement (questionnaire results) were used as clustering measures. ANOVAs were then conducted to confirm the validity of the grouping. Finally, the profiling of task engagement in each cluster and its relation to task performance were examined. ANOVAs with Bonferroni adjustment were performed, with each indicator as the within-subject factor and with group (i.e., cluster) as the between-subject factor. Any significant differences were subjected to a post-hoc analysis. In calculating effect sizes (eta-squared [η²]), the classification proposed by Cohen (1988) was used, with η² = .01 representing a small effect, η² = .06 a medium effect, and η² = .14 a large effect.
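To make the effect-size benchmarks above concrete, the sketch below computes eta-squared as the ratio of between-group to total sum of squares and labels it with Cohen's (1988) thresholds. It assumes a simple one-way between-groups layout, which is a simplification of the mixed design used in the study; the function names, the "negligible" label for values below .01, and the sample values are all hypothetical.

```python
def eta_squared(groups):
    """Eta-squared = SS_between / SS_total for a one-way layout."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    if ss_total == 0:
        raise ValueError("all observations are identical")
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


def cohen_label(eta2):
    """Map eta-squared onto Cohen's (1988) benchmarks (.01, .06, .14)."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"  # label below .01 is an assumption, not from Cohen


# Hypothetical scores for three clusters, for illustration only.
clusters = [[59, 61, 57], [87, 90, 84], [57, 55, 60]]
e2 = eta_squared(clusters)
print(round(e2, 3), cohen_label(e2))
```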

Research Question 1
The first research question was: How does each dimension of task engagement change over time? To answer this question, the five-minute values of each dimension of task engagement were calculated. As presented in Table 1, both indicators of behavioral engagement seemed to show a decrease over time (also see Figure 1). Specifically, the number of words in the first five minutes (0-5 min) was 930.97 words, compared to 866.83 words in the second five minutes (5-10 min), and 830.53 words in the last five minutes (10-15 min). Similarly, the average number of turns was reduced from 88.73 to 78.97 and then to 69.27. For cognitive engagement, the number of LREs also decreased over time. Figure 2 summarizes the details of the breakdown of LREs (also see Table 1). The total number of LREs from the start of the activity to 5 min (7.67) decreased during 5-10 min (7.00) and 10-15 min (6.27). This result was similar to the decrease in the number of words and turns in Figure 1. Overall, a large proportion of LREs at all time points were related to lexis (28.67%-41.00%) and content (28.61%-39.83%).
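The five-minute values referred to above can be obtained by binning each turn into a time window according to when it starts. The following is a minimal sketch, assuming a hypothetical transcript format of (start time in seconds, utterance) and whitespace-based word counting; neither detail is specified in the study.

```python
def window_counts(turns, window_sec=300, total_sec=900):
    """Count turns and words per time window over the first total_sec seconds.

    turns: list of (start_time_in_seconds, utterance) tuples.
    Turns starting at or after total_sec are ignored, mirroring the
    restriction of the analysis to the first 15 minutes.
    """
    n_windows = total_sec // window_sec
    counts = [{"turns": 0, "words": 0} for _ in range(n_windows)]
    for start, utterance in turns:
        if 0 <= start < total_sec:
            index = int(start // window_sec)
            counts[index]["turns"] += 1
            counts[index]["words"] += len(utterance.split())
    return counts


# Hypothetical timestamps and utterances, for illustration only.
sample = [
    (10, "there is a family"),
    (310, "okay"),
    (899, "a mosquito"),
    (950, "this turn falls outside the window"),
]
for label, row in zip(["0-5 min", "5-10 min", "10-15 min"], window_counts(sample)):
    print(label, row)
```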

Table 1. Descriptive Statistics for Each Dimension of Engagement Over Time
Indicator    0:00-5:00    5:00-10:00    10:00-15:00
Note. N = 60 (30 pairs).

Research Question 2
The second research question was: How do the dimensions of task engagement relate to each other? To answer this question, the relationship between the dimensions at each time point was examined. Table 2 shows the Pearson correlation coefficients between the number of words/turns/LREs and task attitudes on each time axis (0-5 min, 5-10 min, and 10-15 min). For task attitudes, the data at one point in time (i.e., measured at the end of the task) is used. A close relationship was observed between each dimension of engagement. First, not surprisingly, there was a strong correlation between the number of words and turns (r = .71-.82). The results indicated that the more smoothly the pair engaged in turn-taking (i.e., two people speaking alternately), the greater the amount of speech within the pair. Individuals who actively interacted with their partner more (i.e., a higher number of words/turns) had a greater number of LREs (r = .32-.75).
Overall, the correlations between number of words/turns and that of LREs appeared to be strengthened as time progressed (r = .38, .49, .75 and r = .32, .47, .62, respectively). As mentioned earlier, the number of words/turns/LREs all decreased over time (Figures 1 and 2). The stronger correlations under these circumstances indicated that each pair could pinpoint more LRE-related interactions as the task progressed. At the start of the task, students had to agree with each other on aspects of the content, like how to develop the story, but as they gradually came to a consensus, they were able to focus on the writing product itself. In other words, the issues and concerns became clearer, and thus, the interactions were considered to be more focused.
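The Pearson coefficients discussed above follow the usual product-moment definition, which can be sketched as below. The function name and the per-pair values are hypothetical; they are not the study's data and are included only to show how one coefficient per time window would be obtained.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-pair counts within one five-minute window.
turns_per_pair = [92, 111, 57, 80, 70]
lres_per_pair = [8, 9, 4, 7, 5]
print(round(pearson_r(turns_per_pair, lres_per_pair), 2))
```

Repeating the calculation for each window (0-5, 5-10, and 10-15 min) would yield the kind of time-indexed coefficients reported in Table 2.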

Research Question 3
The third research question was: How does learners' task performance differ depending on the combination of dimensions of task engagement? To answer this question, first, a cluster analysis using each of four indicators as clustering measures was performed. These clustering variables were: number of words, number of turns, number of LREs, and task attitudes. Since each indicator had different units, standardized values were used. With the aid of the dendrogram obtained from the analysis (see Appendix C), 30 pairs were categorized into three groups (see Figure 3). Cluster 1 (n = 13) tended to have a higher number of LREs than the others, while Cluster 2 (n = 9) had higher than average values for all indicators. Cluster 3 (n = 8) had scores in all of the indicators below average. To confirm the validity of the grouping solution, ANOVAs were conducted. Results showed that significant overall differences among the clusters were confirmed for all four indicators (p < .01).
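Because the four clustering variables are measured in different units, standardization puts them on a common scale before distances are computed. The sketch below shows z-score standardization and the squared Euclidean distance between two pair profiles. It assumes z-scores based on the sample standard deviation, which the study does not specify, and it deliberately stops short of the Ward linkage step itself; all names and values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw scores to z-scores (sample SD; an assumption for this sketch)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]


def squared_euclidean(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    return sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical raw scores for four pairs on the four indicators.
raw = {
    "words": [2600, 3100, 1500, 2400],
    "turns": [245, 306, 146, 230],
    "lres": [24, 22, 14, 20],
    "attitude": [4.2, 4.4, 3.8, 4.0],
}
z = {name: standardize(scores) for name, scores in raw.items()}
# One standardized profile (words, turns, LREs, attitude) per pair.
profiles = list(zip(z["words"], z["turns"], z["lres"], z["attitude"]))
print(round(squared_euclidean(profiles[0], profiles[1]), 2))
```

Ward's method would then repeatedly merge the two clusters whose union gives the smallest increase in within-cluster variance, producing the dendrogram from which a three-cluster solution can be read.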
Tables 3 and 4 show the descriptive statistics for each cluster of task engagement and their task performance, with the results of ANOVAs. The results showed that the changes in mean scores over time for the number of words/turns/LREs were not equivalent between the three groups (i.e., clusters). Thus, changes in mean scores for each group were further explored. Tukey's post-hoc tests using Bonferroni adjustment revealed significant differences between groups in the number of words/turns/LREs (see Table 3). As is clear from the table, the number of words and turns was consistently high for Cluster 2, followed by Cluster 1, and the lowest for Cluster 3. Concerning the number of LREs, Cluster 3 also had the lowest, but between Clusters 1 and 2, the former tended to have slightly higher LREs (with a significant difference only at 5-10 min). For the written word count (see Table 4), Cluster 2 performed better than both Cluster 1 and Cluster 3. For task attitudes and writing scores, there was no statistically significant difference between the groups. The following is a detailed description of the characteristics of each cluster.

Note. a Effect size of η² = .01 represents a small effect, η² = .06 a medium effect, and η² = .14 a large effect (Cohen, 1988). **p < .001 (with Bonferroni adjustment).
Note. a Effect size of η² = .01 represents a small effect, η² = .06 a medium effect, and η² = .14 a large effect (Cohen, 1988). **p < .005 (with Bonferroni adjustment).

Cluster 1
One of the features of Cluster 1 was the high number of LREs (Mtotal = 24.08). Looking at the breakdown of LREs, the effect of grouping (i.e., clusters) was evident for content.
As with the number of words/turns, the number of LREs decreased over time (see Table 1), but the learners were consistently interested in the content. This result indicates that pairs in Cluster 1 were engaged in the task, paying particular attention to the content aspect.
Turning to social engagement, the pairs in this cluster showed a general tendency to collaborate actively with each other. Among the pairs in Cluster 1, Taku (male; pseudonym, and the same for all participants) and Miho (female) were particularly collaborative in their interaction: their turn-taking was the highest in Cluster 1 (311 turns; Mtotal = 245.38 for Cluster 1 pairs) and the difference in the number of words uttered by the two was relatively small (429 words; Mtotal = 524.40 for Cluster 1 pairs). The following examples show their interesting exchanges regarding LRE content. They were frustrated at the beginning of the task. In Example 1, Taku wanted to name each family member, but Miho preferred to prioritize task completion; interaction between the two was awkward, and discussion within the pair did not deepen. As the task progressed, however, they started to compromise, especially Miho. In Examples 2 and 3, Miho began discussing family names with Taku. Namely, she showed an interest in Taku's opinion and compromised. By the end of the task, exchanges between the two had become much smoother.

Example 2
In this way, this pair seemed keen on co-constructing meaning through cooperation, compromise, and concession, even when opinions differed.
As Table 4 shows, the written word count in Cluster 1 (M = 59.00) was similar to that of Cluster 3 (M = 57.38) but was significantly lower than that of Cluster 2 (M = 87.11). The writing score of Cluster 1 was the highest among the three clusters.

Cluster 2
Cluster 2 consisted of pairs with a high degree of task engagement, as all indicators scored above the average. One of the characteristics of this cluster was the number of words/turns (see Table 3). The pairs recorded an average of 111.56 turns during the first five minutes of the activity, which was higher than that of Cluster 1 (M = 92.62) and Cluster 3 (M = 56.75). This result indicates that the learners started to work on the task immediately and exchanged opinions.
Two pairs in this cluster showed exceptionally high social engagement (i.e., large number of turns and small difference in the words uttered). The first pair was Yuna and Maiko (both females). The total number of turns taken by the pair was 375 (Mtotal = 305.78 for Cluster 2 pairs), and the difference in the number of words produced was 143 words (Mtotal = 390.62 for Cluster 2 pairs). Observing their interaction during the task, it was evident that the overall relationship between the two was good. They listened actively to each other and acknowledged their partner's remarks, often using compliments or statements of admiration such as "Sounds good" and "That's right." A significant feature of their interactions was that Yuna's attitude towards the task seemed to influence Maiko's attitude over time. Concretely, Yuna persisted with meeting standards for the length of writing rather than the content. She tried to add many nouns and adjectives that seemed redundant, saying, "Let's use as many words as possible" and "Let's increase the number of words." Maiko merely observed Yuna's behavior initially, but gradually began to come up with ideas to imitate her. The following demonstrates this effect, showing that it was Maiko who was willing to increase the number of words in the latter part of the task:

Example 4
1. Maiko: If we want to add a lot more, we can do a lot, like adding 'and a girl,' 'boy...'
2. Yuna: Yeah, let's do it. So, stop saying, 'There is a family.'
3. Maiko: 'There is…'
4. Yuna: 'A father, a mother…'
5. Maiko: Let's add as many words as possible.
6. Yuna: Let's add a lot. [laugh]
In the beginning, Maiko was just listening, but as the task progressed, she gradually began to cooperate with Yuna, as if pulled by her proactive attitude. This change over time may indicate that the task engagement of one learner in a pair has a significant impact on the other learner (Dörnyei, 2002).
The other pair was Kisuke and Taro (both males; 337 turns and 392 word difference). One of their main characteristics was that they could not help but address the spelling of the word "mosky" (mosquito) from the start of the activity. Soon after starting, when Kisuke asked, "mosky, what is it? How do you write it?" Taro replied, "Umm..., it looks like there is a 'th' [in mosquito]. No, maybe not." They worked on the task all the way through, with the spelling on their minds. The following presents the exchanges just before the task ended.
5. Kisuke: 'mosky…' 'mosky…'
6. Taro: Lend me your pen. I will write a lot.
Taro, who was more reserved at the beginning, suggested, "Let's do something about the spelling of mosky." At the very end of the task, he claimed, "It is interesting, like a quiz program," signifying that he was now fully immersed in the activity (i.e., emotionally engaged). Kisuke and Taro used the word "mosky" a total of 65 times throughout the activity (30 and 35 times, respectively).
As indicated in Table 5, Yuna and Maiko's engagement profile shows that each indicator of task engagement is closely related to performance.Furthermore, for the pair, each indicator seemed to mediate other indicators of engagement in a positive way.In contrast, Kisuke and Taro's engagement profile shows a somewhat different story-the number of words/turns was relatively high, questionnaire results were also positive, but their number of LREs was 17, which was less than the Cluster 2 average (M total = 22.33) or that of all other pairs (Mtotal = 20.93).This lesser LRE is evident when compared to the Cluster 2 pair of Yuna and Maiko.As the results in Table 5 suggest, Kisuke and Taro paid too much attention to a single aspect of the task and not enough attention to the others, resulting in more superficial engagement in the learning task (cognitively less engaged) than Yuna and Maiko.For Cluster 3, the number of words/turns/LREs was low.Notably, the number of words/turns was about half that of Cluster 2 (see Table 3).This result implies that there was more than a bit of silence in the interactions of the pairs.Focusing on social engagement, many of the students in this cluster tended not to play any leading role.For example, Hikaru (male) and Yu (female), whose number of turns was the lowest in Cluster 3 (108 turns; Mtotal = 145.88for Cluster 3 pairs) and who had a below-average difference in words uttered (547 words; Mtotal = 685.64 for Cluster 3 pairs), started their interaction smoothly.They cooperatively discussed the story structure, but their conversation did not deepen when writing, and their utterances were generally brief.It seemed that neither of them was confident in their English, so they were reluctant to express their ideas or to argue with each other's opinions.Another feature was that there was not much cognitive engagement-the number of LREs was 8; Mtotal = 14.25 for Cluster 3. 
Pairs in Cluster 1 (Mtotal = 24.08) and Cluster 2 (Mtotal = 22.33) tended to create a unique story together using their imagination, whereas pairs in Cluster 3 appeared to limit themselves to directly translating their ideas without consideration of style or flow.
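To make the comparisons above concrete, the following minimal Python sketch expresses each pair's reported count as a percentage of the relevant mean. The input numbers are the ones quoted in the text; the helper name `pct_of_mean` and the dictionary labels are illustrative assumptions and not part of the study's analysis.

```python
def pct_of_mean(value, mean):
    """Return `value` as a percentage of `mean`, rounded to one decimal place."""
    return round(100 * value / mean, 1)


# (pair's count, reference mean) as quoted in the text above.
comparisons = {
    "Kisuke-Taro LREs vs Cluster 2 mean": (17, 22.33),
    "Kisuke-Taro LREs vs all-other-pairs mean": (17, 20.93),
    "Hikaru-Yu turns vs Cluster 3 mean": (108, 145.88),
    "Hikaru-Yu LREs vs Cluster 3 mean": (8, 14.25),
}

# Percentage of the reference mean for each comparison.
results = {label: pct_of_mean(v, m) for label, (v, m) in comparisons.items()}

for label, pct in results.items():
    print(f"{label}: {pct}%")
```

Read this way, Kisuke and Taro's 17 LREs amount to roughly 76% of the Cluster 2 mean, while Hikaru and Yu's 8 LREs amount to roughly 56% of the Cluster 3 mean.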

Table 5. Differences in Task Engagement in two Pairs
For task performance, the results of the written word count indicated that the mean score for Cluster 3 (M = 57.38) was significantly lower than that of Cluster 2 (M = 87.11), but similar to that of Cluster 1 (M = 59.00). In addition, although the writing score of Cluster 3 (M = 5.69) was the lowest among the three clusters, this difference was not statistically significant. In short, although the task engagement of the pairs in Cluster 3 was lower than that of the other clusters, this lower engagement had an impact only on the quantitative aspect of writing performance (i.e., word count) and not necessarily on the qualitative aspect (i.e., writing score).

DISCUSSION
This study is a unique attempt not only to understand students' engagement in pair work from multiple perspectives but also to describe and analyze the changes in students' engagement from a temporal aspect. The results showed that the 30 pairs in the study fell into three clusters with different characteristics, that there was a clear difference in actual engagement between the clusters and over time, and that such differences had an impact on their task performance.

Overall Results
Previous studies on students' task engagement have focused on the entire target population (i.e., the classroom) or a small number of participants as units of analysis (Hiver et al., 2021b). In contrast, this study focused on pairs and demonstrated significant differences in each dimension of task engagement, depending on the pair. Storch (2002) identified four different patterns of dyadic interaction. The results of this study showed that many pairs in Clusters 1 and 2 can be categorized as Collaborative, and analyses of their social engagement suggested that these pairs tended to have a high degree of equality and mutuality. As a result, there was a substantial amount of mutual interaction (behavioral engagement), various discussions about language (cognitive engagement), and a high level of task satisfaction (emotional engagement).
Furthermore, there were pairs in which one of the participants led the activity, and the other cooperated without being passive, similar to the Expert/Novice relationship. Some of these pairs worked like this from beginning to end, while others seemed to change as they worked (e.g., the pairs of Taku-Miho and Yuna-Maiko). It is possible that they observed their partners actively contributing to the task and felt compelled to imitate this behavior. Bandura's (1978, 1986) observational learning theory clearly illustrates this phenomenon, where learning is established not only by taking action directly and receiving responses but also by observing or modeling the behavior of others. Unknowingly, as observers, we are affected by our partner's behavior through "vicarious experience" (Bandura, 1978), just as students in this study began to engage in the task after observing their partner's hard work and enthusiasm; the partner as a contagious facilitator can impact the co-participant through their attitude and behavior.
However, the impact of having a task partner was not all positive. In Cluster 3, pairs that included a student who would lead the activity (the Expert or Dominant type student in Storch's (2002) framework) did not emerge. Their emotional engagement showed that there was no negative attitude toward the task or their partners. If anything, it seemed that both students took a passive attitude toward the task. King (2013) also found that Japanese students rarely self-initiated communication in English or even Japanese, suggesting that they are not good at speaking on their own, although they try to respond to others when they are spoken to. If either partner in Cluster 3 pairs had shown a pattern other than Passive, their task engagement might have been different. These results indicate that, for pair work to be effective, at least one student needs to take the lead (Northouse, 2009). It has recently become common for students to work in pairs or small groups in L2 classrooms, but some Japanese students, used to teacher-centered classes, remain passive in the classroom. For such students, it is necessary to explain the importance of "co-constructing" activities in pairs and enhance their awareness of the significant benefits of pair work.
In relation to dyadic interaction patterns, Storch's (2002) framework discussed only four types, but the results here show that such patterns may include a fifth type, i.e., Passive/Passive. Although Passive/Passive pairs are initially motivated to tackle a task, they are less assertive and perform the task without clearly demonstrating their ideas. Further research can determine whether this type of pairing is unique to Japanese students, or whether it is also common in students of other cultures and ages.

Changes in Each Dimension over Time
Regarding RQ1, the results indicated that student engagement likely changes as the pair work progresses. For example, the number of words/turns/LREs generally decreased. This tendency shows that interactions do not proceed in the same way from start to finish. Since the task was collaborative writing, the time spent on writing increased as the task progressed, resulting in fewer words/turns/LREs.
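As an illustration of how such temporal changes can be tallied, the sketch below bins timestamped turns into three five-minute segments (0-5, 5-10, and 10-15 min) and counts turns and words in each. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function name `count_by_segment`, the `(start_seconds, utterance)` input format, whitespace-based word counting, and the toy utterances are all hypothetical.

```python
SEGMENT_SECONDS = 300  # five-minute windows: 0-5, 5-10, 10-15 min


def count_by_segment(turns, n_segments=3):
    """Count turns and words per five-minute segment.

    `turns` is a list of (start_time_in_seconds, utterance) tuples.
    Words are counted by splitting on whitespace (an assumption).
    Turns starting after the last window are assigned to the final segment.
    """
    turn_counts = [0] * n_segments
    word_counts = [0] * n_segments
    for start, utterance in turns:
        idx = min(int(start // SEGMENT_SECONDS), n_segments - 1)
        turn_counts[idx] += 1
        word_counts[idx] += len(utterance.split())
    return turn_counts, word_counts


# Invented toy data for illustration only (not study data).
toy_turns = [
    (12, "what should we write first"),
    (20, "maybe the first picture"),
    (310, "okay"),
    (620, "let us check the spelling again"),
]

turn_counts, word_counts = count_by_segment(toy_turns)
print(turn_counts)  # turns per segment
print(word_counts)  # words per segment
```

A declining count across the three segments would correspond to the general decrease in words and turns described above.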
Past research (e.g., Chen, 2018; Dobao, 2012; Storch, 2002) reported that pair work became more productive when the two learners adopted a collaborative attitude and were willing to co-construct ideas. Findings in this study are in line with this claim and show that when both learners adopt a collaborative attitude, or at least when either of them plays a leading role (like many Cluster 1 and 2 pairs), the interaction between the two tends to become productive; students pay more attention to language, leading to frequent LREs. In contrast, when neither leads the activity, or both adopt a passive attitude (like many Cluster 3 pairs), peer interaction tends to become less active, and consequently, the pair produces fewer words/turns/LREs. This study includes another important finding: the relationship within a pair may not remain the same; it can change during the task. For example, the pair consisting of Taku and Miho disagreed at first, but as the task progressed, Miho began to respect Taku's opinion, and their interactions became much smoother. Additionally, under the influence of Yuna's positive attitude, Maiko imitated her goal of producing more words. These examples suggest that partners influence each other as they engage in an activity, which may result in a change in their relationship. However, as not all pairs in this study showed the changes described above, further verification of this point is necessary.

Interactions of Each Dimension
Concerning RQ2, the results showed an overall close and positive relationship between the four dimensions of task engagement. As Figure 3 shows, if one indicator was high, the other was also high (like Cluster 2) and vice versa (like Cluster 3). In other words, each dimension of task engagement appeared to have a mutual relationship with the others that activated or strengthened them, which supports the results of previous studies (Baralt et al., 2016; Lambert et al., 2017). Thus, teachers may be able to work on one dimension of task engagement (e.g., emotional dimension) to positively influence the other dimensions (e.g., behavioral/cognitive dimensions) and thus effectively promote the desired student engagement.
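The idea that two indicators rise and fall together can be quantified with a correlation coefficient. The sketch below computes Pearson's r between two indicators measured across pairs. This is an assumption for illustration: the text here does not state which coefficient was used, and the function name `pearson_r` and the toy values are hypothetical, not study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Invented toy values for five pairs (not study data):
# number of turns (behavioral) and number of LREs (cognitive).
toy_turns = [100, 120, 140, 160, 180]
toy_lres = [10, 14, 13, 19, 22]

r = pearson_r(toy_turns, toy_lres)
print(f"r = {r:.2f}")
```

A value of r close to +1 would correspond to the pattern described above, in which pairs high on one indicator tend to be high on another; a value close to -1 would indicate an inverse relationship.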
Notably, there seemed to be a negative relationship between behavioral and cognitive engagement, as evidenced by the number of LREs. Cluster 2 pairs would have been expected to have the highest number of LREs, but this did not happen. One possible explanation for this result is that students might focus too much on one aspect of a task and neglect others (e.g., Kisuke and Taro, who were obsessed with the spelling of "mosquito"). Depending on the situation, each dimension of task engagement could deactivate or inhibit the others. This possibility suggests that teachers need to be mindful of the overall balance when thinking about students' task engagement.
These results indicate the importance of viewing learners' task engagement from a holistic perspective. While it may appear that students express their opinions with each other and actively engage in pair work (i.e., they appear behaviorally engaged), in reality, they may just be chatting and, thus, cognitively less engaged, or they may only care about their partners and not be at all interested in the activity (i.e., they are emotionally less engaged). Therefore, in judging students' task engagement, teachers and researchers alike should evaluate their efforts from multiple viewpoints, not just easily observable ones like behavioral dimensions, and consider the relationship between engagement and actual learning outcomes.
Finally, addressing RQ3, the results revealed that there was a clear impact on the quantitative aspect of task performance (i.e., written word count). Cluster 2, in which all of the indicators of task engagement scored above the average, wrote comparatively long English passages. The results demonstrate that the learners' task engagement strongly influences actual task performance, supporting previous research findings on student engagement (Hiver et al., 2021b; Oga-Baldwin, 2019). Therefore, paying attention to the state of students' engagement can be helpful for teachers wishing to improve their students' task performance.
No statistically significant differences were found between the clusters in the qualitative aspects (i.e., writing score). Based on Dobao (2012) and Kim (2008), it was expected that the more students talk about language (i.e., more LREs), the higher the writing achievement would be. However, the results only marginally confirm such a trend. There are several possible reasons. For example, in this study, picture description was used as a task. Because the learners did not have to create a story from scratch, this was a relatively undemanding task and it may have been easy for them to work on; it is also possible, however, that this reduced the differences in performance. Furthermore, the fact that the learners' engagement was high did not necessarily mean that it was immediately evident in the outcomes and quality of language produced by the students. Therefore, it is necessary to examine the impact of task engagement through longitudinal experimental studies with a temporal design.

CONCLUSION
Before concluding remarks, a few limitations of the study need to be acknowledged. First, to avoid adverse effects on participants' affective states during the task, the questionnaire survey was administered only once after the task was completed. In order to examine students' emotional engagement more precisely, it is necessary to measure their affective state multiple times during the task and examine its changes in detail. Second, a variety of task factors and conditions (e.g., purpose, difficulty, duration) have been recognized to have a critical impact on students' task engagement. Therefore, it would be worthwhile to examine similar research questions in different settings using different tasks. The third notable limitation is related to the pairing effect (Hiromori et al., in press). Students' engagement is likely to be influenced by the motivation and language proficiency of their partners. Future research should experimentally set up various types of pairs (e.g., pairs with high or low proficiency levels or pairs with different proficiency levels) and compare their task engagement and task performance, to further explore the influence of pairing partners.
Despite these limitations, the present study is a significant step toward examining the combined effects of the behavioral, cognitive, emotional, and social dimensions of students' task engagement on collaborative writing in pairs. While there was generally a strong interdependent relationship between the dimensions, the results provide some preliminary evidence that one dimension of engagement might mediate the effect of others positively or negatively depending on the situation. Furthermore, these findings reveal a dynamic change in each dimension as the pair work progresses. Clarifying the conditions under which learners change their task engagement and showing the processes that facilitate such engagement increase the possibility of educational interventions with deliberate manipulation. It is hoped that the findings presented in the study will serve as a foundation for future studies on student engagement.

Figure 1. Temporal Changes in the Number of Words and Turns

Figure 3. Cluster Composition for all Pairs

1. [Taku]: Here, maybe it's better to explain the family. 'The family...'
2. Miho: Let's do it last.
3. Taku: Why?
4. Miho: Do you want to do it now?
5. Taku: Let's give a family member a name.
6. Miho: OK, but first, let's finish the task.

Table 2. Correlations Between Each Dimension of Engagement on Each Time Axis
Note. The values of the correlation coefficients indicate, from left to right, the first five minutes, the next five minutes, and the last five minutes (i.e., 0-5 min/5-10 min/10-15 min). *p < .05. **p < .01.

Table 3. Task Engagement in Each Cluster

Table 4. Task Performance in Each Cluster