The initiative was well received: the 339 students organised themselves within a few days into 86 working teams, most with four members. All had access to information on the tasks to be carried out and to the planning questionnaire. In addition, there were 59 entries in the Virtual Campus forum and 1909 views: 27 questions on PPT/video content, 13 on tools and resources external to Moodle, 9 on Moodle tools, 7 on timing and deadlines, and 2 on general work guidelines.
A total of 318 students (91.4%) attended the first face-to-face feedback session. After being shown how to provide quality feedback and how to evaluate the PPT documents correctly, each student performed the peer review, checking compliance with the more formal aspects, namely those related to the first three quality criteria. In this way, the students were able to get to know each other personally and establish a dialogue to exchange views on these format requirements.
During the second face-to-face feedback session attended by 314 students (90.2%), peer review was focused on information content A, B, C and D. Triangular exchanges were established between the assessor students, the assessed students and teachers aimed at reviewing the depth and thoroughness of the contents of the PPT, according to quality criteria 4, 5, 6 and 7.
Concerning RQ1 (What is the focus of the comments and/or proposals for improvement made by the students in a peer assessment process of a task?), a large majority of the students carried out the peer assessment, first of the PPT in Workshop 1 (92.9%) and then of the video created from the improved PPT (98.2%). The analysis of the comments allowed the aspects that, according to the students, should be improved to be categorised against the assessment criteria of the work. The suggestions for improvement affected all the quality indicators, but to varying degrees, as shown in Table 3.
Table 3
Percentage of responses per task and assessment criteria
Criteria | Task | Responses, N (%)
Criterion 1: Level of preparation, technical quality of the product and format of bibliographical references | Initial PPT format | 266 (81.6%)
 | 1st video version | 215 (62.0%)
Criterion 2: Attracting interest, structuring information, and length of presentation | Initial PPT format | 133 (40.8%)
 | 1st video version | 150 (43.2%)
Criterion 3: Academic style and terminology | Initial PPT format | 124 (38.0%)
 | 1st video version | 64 (18.4%)
Criteria 4 to 7: Content A, B, C and D | Initial PPT format | 185 (56.7%)
 | 1st video version | 131 (37.8%)
No comments for the improvement of the task | Initial PPT format | 22 (6.7%)
 | 1st video version | 62 (17.9%)
With regard to the peer assessment of the initial version of the PPT, the comments pointed to improvements not only in the format and structuring of the deliverables (criteria 1, 2 and 3), particularly the aspects covered by Criterion 1, but also in the information content (criteria 4 to 7).
Regarding the quality of the videos that were subsequently designed, namely the comments from Workshop 2, the same points were highlighted, although in general there were fewer of them, which could indicate that the suggested corrective actions had been undertaken.
Based on the comments on the quality of the products created, the different aspects to be improved within each established quality criterion were categorised. Figure 3 shows the results of the analysis concerning the initial PPTs and the corresponding videos, respectively.
Regarding the technical quality and format of the PPT (Criterion 1), most of the comments for improvement mentioned deficiencies in the format of the cover page (56.1%) and in the bibliographical references (53.4%). Regarding Criterion 2, reviewers noted that the structuring of the information needed improvement in 17.5% of the PPTs and that the number of slides had to be adjusted to comply with the required timing in 22.1% of cases. Spelling, grammatical and typographical errors were mentioned in 35.9% of the documents, and inappropriate or erroneous terms were detected in 5.5% of the deliverables (Criterion 3). Finally, the constructive criticism of the content (criteria 4 to 7) indicated that students had assimilated many of the concepts and exercised their learning skills. However, for a quarter of the PPTs there were comments calling for greater depth and thoroughness of the information to aid understanding.
Regarding the videos generated after the improvement of the PPTs, most of the comments on their technical quality (Criterion 1) pointed to the need to improve the sound and/or the voice-over in 46.2% of the products. Regarding Criterion 2, the need to adjust the voice-over timing and/or the number of slides was noted in 32.9% of the cases. The other quality aspects of the videos were largely improved compared with the first peer review of the PPT formats, thanks to the earlier constructive criticism. In particular, the percentage of comments concerning the format decreased markedly (from 60.0% to 8.1%). Similarly, there were fewer suggested improvements to the bibliographical references (from 55.5% to 18.2%), to the academic style (from 35.9% to 14.1%) and to all the information content (A, B, C and D).
As for RQ2 (What is the students’ perception and assessment of the evaluations given and received in a peer-feedback process?), each student was asked to write a reflection on the feedback in each of the two workshops and to grade, on a scale of 1 to 5 (1 being very low quality and 5 high quality), both the feedback given as assessor and the feedback received as the one assessed. NR (no response) was recorded when a student had not acted as assessor/assessed. A total of 326 students provided both ratings. Figures 4 and 5 show the ratings of the quality of this feedback. In general, the comments from the PPT revision were rated highly, and the scores for feedback given and received were more balanced in the second peer assessment.
The difference between the mark for the feedback given and for the feedback received was also determined separately for each student (Fig. 6). In both peer assessments, around 60% of the students thought that the quality of the feedback given and received was similar. In contrast, in the first assessment 26.0% thought they had provided poorer-quality feedback than their peers, a view shared by only 14.3% of those peers, whereas in the second assessment opinions converged on this point, at around 18.6%–18.9% of the cases.
Responding to RQ3 (Are there differences between teachers’ and peers’ assessments?), the scores with which each student rated the two products created by the peer group were analysed, using the A to D marking scale already mentioned. Table 4 shows the frequency distribution of the scores for the PPTs and for the corresponding videos. It also includes the teachers’ rating of the final versions of the videos on the same scale, but derived from scores from 0 to 10.
In the two peer assessments the products were predominantly rated between “good” and “excellent”, with more than half rated B+ (“rather high”). No product was rated “insufficient”. In the first assessment 3.4% of the students, and in the second 0.9%, preferred not to evaluate their peers.
Table 4
Frequency distribution of the products’ assessments.
| NC Not assessed | D Insufficient (< 5.0) | C Sufficient (5.0–6.4) | B- Rather low (6.5–7.4) | B Good (7.5–8.4) | B+ Rather high (8.5–8.9) | A Excellent (≥ 9.0) |
First peer assessment | 3.4% | 0% | 0.6% | 3.7% | 19.0% | 52.1% | 21.2% |
Second peer assessment | 0.9% | 0% | 0.3% | 2.0% | 20.5% | 51.3% | 25.1%
Teachers’ assessment | 0% | 0% | 0% | 0% | 33.3% | 29.2% | 37.6% |
Overall, Table 4 shows that the scores of the products improved thanks to the corrections of the previously delivered versions, to the benefit of the quality of the final product. Thus, none of the 86 final versions of the videos formally assessed by the teachers was rated below “good” (7.5 out of 10). The mean score was 8.7 ± 0.7, with partial scores of 8.7 ± 0.8, 8.9 ± 0.7, 8.8 ± 0.8 and 8.7 ± 0.7 for quality criteria 1, 2, 3 and 4 to 7, respectively. The most notable remaining deficiencies in the final versions were the incorrect use of specific terminology (41.9%), errors in galenic concepts (39.5%) and speech/locution defects (25.6%), as shown in Fig. 7.
On the other hand, after the formal assessment, 107 student assessors (31.6%), the so-called OVR (overrated) group, were found to have over-assessed the quality of the first versions of their peers’ videos by one or two grades on the A to D scale. Tables 5 and 6 compare the academic performance of the assessors/assessed of the OVR group (n = 107 students) with that of the rest of the students (REF, reference group; n = 232 students), differentiating within each population the subgroup of assessors and the subgroup of those they assessed. The tables show the results of the comparison of means for unpaired data using a parametric treatment (Student’s t-test), carried out to detect possible significant differences between the scores obtained by the students in the four subgroups; Cohen’s d was calculated to measure the effect size. This analysis did not provide conclusive results relating the academic performance of the OVR group’s assessors to their overrating of the videos they assessed.
Indeed, according to the teachers’ assessment, the quality of the final versions of the videos submitted by the assessed of the OVR group (mean mark of 8.25 ± 0.57) was significantly lower (p4 and p1 < 0.05; d4 and d1 > 0.8) than that of the products created by the assessors of the same group (8.75 ± 0.59) and by the assessed of the reference group (mean mark of 8.95 ± 0.60). These lower scores were related to serious errors in terminology and in fundamental concepts of galenic pharmacy that had not been corrected, even though these deficiencies had been mentioned in the different rounds of feedback. It is worth noting that no significant differences were found between the final marks for the subject of the assessors and the assessed in the OVR and REF groups, the average mark of the four subgroups being very similar, at around 7.1 (p5, p6, p7 and p8 > 0.05; d < 0.5).
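For readers who wish to check the effect sizes, both statistics can be recomputed from the summary data alone. The following Python sketch assumes the standard pooled-SD form of Cohen’s d and the equal-variance Student’s t statistic; applied to the fourth comparison of Table 5 (students assessed by REF vs. by OVR), it lands close to, though not exactly at, the reported d4 = 1.199, presumably because the published value was computed from unrounded data.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two independent samples."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d for two independent groups (pooled-SD form)."""
    return (m1 - m2) / pooled_sd(s1, n1, s2, n2)

def students_t(m1, s1, n1, m2, s2, n2):
    """Unpaired Student's t statistic (equal-variance form)."""
    sp = pooled_sd(s1, n1, s2, n2)
    return (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))

# Summary statistics from Table 5, fourth comparison:
# assessed by REF (8.95 +/- 0.60, n = 232) vs. assessed by OVR (8.25 +/- 0.57, n = 107)
d = cohens_d(8.95, 0.60, 232, 8.25, 0.57, 107)
t = students_t(8.95, 0.60, 232, 8.25, 0.57, 107)
print(f"d = {d:.3f}, t = {t:.2f}")  # large effect (d > 0.8); |t| far above the 5% critical value
```

With these rounded inputs, d comes out at about 1.18 and t at about 10, i.e., a large effect that is clearly significant at p < 0.05 for 337 degrees of freedom.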
Table 5
Differences in the videos’ formal scores (given by the teachers) between the assessors and assessed of the OVR and REF groups.
Comparison of two populations | Average of the formal scores of the videos (mean ± SD) | Student’s t (p value) | Cohen’s d |
Peer assessors of the group OVR | 8.75 ± 0.59 | p1 < 0.05 | d1 = 0.861 |
Students assessed by the group OVR | 8.25 ± 0.57 |
Peer assessors of the group REF | 8.74 ± 0.69 | p2 < 0.05 | d2 = 0.323 |
Students assessed by the group REF | 8.95 ± 0.60 |
Peer assessors of the group OVR | 8.75 ± 0.59 | p3 = 0.897 | d3 = 0.016 |
Peer assessors of the group REF | 8.74 ± 0.69 |
Students assessed by the group OVR | 8.25 ± 0.57 | p4 < 0.05 | d4 = 1.199 |
Students assessed by the group REF | 8.95 ± 0.60 |
Table 6
Differences in the final mark for the subject (given by the teachers) between the assessors and assessed of the OVR and REF groups.
Comparison of two populations | Average of the final mark for the subject (mean ± SD) | Student’s t (p value) | Cohen’s d |
Peer assessors of the group OVR | 7.10 ± 1.08 | p5 = 0.732 | d5 = 0.047 |
Students assessed by the group OVR | 7.15 ± 1.05 |
Peer assessors of the group REF | 7.00 ± 1.07 | p6 = 0.686 | d6 = 0.038 |
Students assessed by the group REF | 7.04 ± 1.06 |
Peer assessors of the group OVR | 7.10 ± 1.08 | p7 = 0.426 | d7 = 0.093 |
Peer assessors of the group REF | 7.00 ± 1.07 |
Students assessed by the group OVR | 7.15 ± 1.05 | p8 = 0.374 | d8 = 0.104 |
Students assessed by the group REF | 7.04 ± 1.06 |
In relation to RQ4 (What competencies do students feel they develop from a peer assessment experience?), a total of 138 students responded to the survey made available on the Virtual Campus at the end of the activities of the didactic sequence. The mean and standard deviation of the scores assigned by the students were determined from the responses obtained. The tables below show the values obtained from the role of assessor (Table 7) and from the role of the assessed (Table 8).
Table 7
Scores attributed to the experience from the role of assessor.
The assessment of the tasks of my colleagues (being an assessor) has allowed me to... | Mean | SD |
Rethink the objectives of the assessed task | 4.07 | 0.91 |
Have a more critical view of the work I have done | 4.24 | 0.71 |
Involve myself more in my learning process | 4.01 | 0.94 |
Realize the processes that I need to improve in my learning process | 4.23 | 0.88 |
Realize the processes that I must maintain and enhance in my learning process | 4.17 | 0.97 |
Contribute to the development of the “learning to learn” competence | 3.99 | 0.95 |
Learn how to give feedback | 4.21 | 0.84 |
Understand the evaluation criteria of the assessed task | 4.27 | 0.79 |
Global rate | 4.15 | 0.88 |
Note. The rating scale was from 1 to 5, where 1 = strongly disagree and 5 = strongly agree.
Table 8
Scores attributed to the experience from the role of assessed.
Receiving the opinions, evaluations and advice of my colleagues (being the one assessed) has allowed me to... | Mean | SD |
Rethink the objectives of the assessed task | 4.21 | 0.87 |
Have a more critical view of the work I have done | 4.30 | 0.86 |
Improve my own work based on the opinions/assessments and advice of my colleagues | 4.42 | 0.78 |
Realize the processes that I need to improve in my learning process | 4.12 | 0.90 |
Realize the processes that I must maintain and enhance in my learning process | 4.12 | 0.98 |
Contribute to the development of the “learning to learn” competence | 4.04 | 0.93 |
Learn how to give feedback | 4.08 | 0.94 |
Involve myself more in the learning process | 4.14 | 0.92 |
Global rate | 4.18 | 0.90 |
Note. The rating scale was from 1 to 5, where 1 = strongly disagree and 5 = strongly agree.
The responses of the participating students were also analysed in relation to their perception of the overall peer assessment experience. As shown in Table 9, the global mean was of the same order as in the results by assessor/assessed role (4.16 ± 1.07). Nevertheless, a notably higher score stands out for the item “I am able to self-assess the quality of my work” (4.50 ± 0.87) and a lower one (3.74 ± 1.30) for the item “I discovered strategies, competencies or skills that I could apply in other contexts”.
Table 9
Scores attributed to the peer assessment experience.
Through the peer assessment experience... | Mean | SD |
I discovered strategies, competencies or skills that I could apply in other contexts | 3.74 | 1.30 |
I have become aware of the actions and processes that can allow me to improve my learning with more autonomy, efficiency and understanding in future tasks | 4.00 | 1.25 |
I am able to represent the objectives, the evaluation criteria and the processes to plan and carry out a quality task | 4.26 | 1.06 |
I am able to self-assess the quality of my work | 4.50 | 0.87 |
Global rate | 4.16 | 1.07 |
Note. The rating scale was from 1 to 5, where 1 = strongly disagree and 5 = strongly agree.
Finally, in relation to RQ5 (What is the students’ perception of the benefits and difficulties that a peer assessment process may have, before and after participating in a peer-feedback-based experience?), the responses of the participating students were analysed at two different times: i) before starting the experience (PRE): What benefits and difficulties do you think the peer assessment process may have?; and ii) at the end of the experience (POST): Now that you have participated in a peer assessment process, what benefits and difficulties do you think this type of process may have? As shown in Table 10, the following verbs were extracted from the first quartile of responses (N = 308) to the PRE question: learn, assess, have, improve, do, believe and see. It is worth mentioning that the verb “to have” is used in many of the responses to express the benefits or difficulties that the experience may entail; it appears linked to ideas such as “…learn to have an assessment criterion”, “have a better internalisation of information” and “have an external view of the work”, among others. The verb “believe”, by contrast, should be disregarded, given that it is used in all cases simply to introduce an opinion or perception about the experience (“I believe that”).
In the first quartile of the responses (N = 164) to the POST question, the following verbs were extracted: improve, believe, learn, see, do, have and know. For the same reasons as at the PRE moment, the verb “believe” should be disregarded. The verb “to know” is new to this list; it is linked to ideas such as “to know how to identify weak points”, “to know the opinion of my colleagues”, “to know the assessment criteria of the task”, “to know how to admit mistakes in one’s own work”, “to know how to evaluate other people’s work in a tactful way” and “to know how to rectify and improve”, among others.
As for the verb “to see”, at both the initial (PRE) and final (POST) moments it is used to express ideas linked mainly to the ability to consider other points of view and to identify errors.
Table 10
First quartile of verbs present in the responses before the start of the experience
Verb | Frequency | Percentage |
learn | 71 | 5.80 |
assess | 66 | 5.39 |
have | 43 | 3.51 |
improve | 42 | 3.43 |
do | 32 | 2.61 |
believe | 29 | 2.37 |
see | 26 | 2.12 |
Comparing Tables 10 and 11, the verb “to improve” has a greater presence after the experience. In this respect, one student responded at the POST moment:
"Thanks to peer assessment, it can allow one to know how to identify weak points in order to work to improve them. Some difficulties that may arise is knowing how to identify the mistakes made.”[1]
Before starting the experience, this same student had responded: “Knowing how to value the work done by one’s companions.” [2]
In the case of another student, this change is also evident. At the POST moment, he/she mentioned:
“As benefits I can highlight the improvement of critical thinking, the self-assessment of the work done, accepting and increasing the external work proposals of one’s companions. Although sometimes being objective and "hard" is difficult.”[3]
While at the PRE moment, this same student had mentioned: “Difficulties: Not having a good criterion to evaluate or being subjective. Benefits: Being more permissive and scoring better.” [4]
Table 11
First quartile of verbs present in the responses at the end of the experience
Verb | Frequency | Percentage |
improve | 53 | 5.38 |
believe | 33 | 3.35 |
assess | 33 | 3.35 |
learn | 30 | 3.04 |
see | 29 | 2.94 |
do | 27 | 2.74 |
have | 27 | 2.74 |
know | 20 | 2.03 |
Another skill considered relevant to the development of critical thinking is the ability to “be aware” and the capacity to consider other points of view. We therefore searched for the word “aware” and found that it did not appear at all in the responses at the PRE moment, whereas three such responses were found at the POST moment. Regarding this capacity to “be aware” and to consider other points of view, the following response from one of the participating students after the end of the experience illustrates how the peer-feedback process contributed to these aspects:
These types of processes are very beneficial, as they allow you to gain new insight and a fresh perspective on the work. I believe that this allows us to perfect the work done as well as improve our abilities when doing an assignment, since we can identify small details that perhaps we would never think of looking at. On the other hand, I also think that it is very important to evaluate our companions because it allows us to look at the work done with different eyes. In other words, before evaluating my peers, I wasn't fully aware of what was being evaluated with this work. However, after looking at his work and checking that all the requirements have been met properly, I have been able to identify some errors in my own work. However, while I have been able to observe the multiple benefits related to peer review, I consider it to be a rather tedious process and can be very cumbersome at times. That is why I think it should be a shorter process, rather than such an elaborate one.[5]
[1] The quote has been translated from the following original: “Gràcies a la avaluació entre iguals et pot permetre saber identificar els punts febles per tal de treballar per millorar-los. Algunes dificultats que es poden presentar és saber identificar els errors comesos.”
[2] The quote has been translated from the following original: “Saber valorar el treball realitzat pels teus companys.”
[3] The quote has been translated from the following original: “Com a beneficis puc destacar la millora de l’esperít crític, l’autoavaluació de la feina realitzada, aceptar i incrementar al treball propostes externes d’altres companys… tot i que de vegades ser objectiu i “dur” és difícil.”
[4] The quote has been translated from the following original: “Dificultats: No tenir un bon criteri per avaluar o ser subjectiu. Beneficis: Ser més permissius i puntuar millor.”
[5] The quote has been translated from the following original: “Aquest tipus de processos són molt beneficiosos, ja que permeten obtenir una nova visió i una nova perspectiva del treball. Crec que això ens permet perfeccionar el treball realitzat, així com millorar les nostres capacitats quan fem un treball, ja que podem identificar petits detalls que potser mai pensaríem a mirar. D'altra banda, també crec que és molt important avaluar als nostres col·legues perquè ens permet mirar el treball amb diferents ulls. En altres paraules, abans d'avaluar els meus companys, no era plenament conscient del que s'estava avaluant amb aquest treball. No obstant això, després d'examinar el seu treball i comprovar que s'han complert degudament tots els requisits, he pogut identificar alguns errors en el meu treball. No obstant això, encara que he pogut observar els múltiples beneficis relacionats amb la revisió per homòlegs, considero que es tracta d'un procés bastant tediós i a vegades pot resultar molt enutjós. Per això crec que hauria de ser un procés més curt, en lloc d'un procés tan elaborat.”