A case study: using social tagging to engage students in learning

In exploring new ways of teaching students how to use Medical Subject Headings (MeSH), librarians at Boston University's Alumni Medical Library (AML) integrated social tagging into their instruction. These activities were incorporated into the two-credit graduate course, "GMS MS 640: Introduction to Biomedical Information," required for all students in the graduate medical science program. Hands-on assignments and in-class exercises enabled librarians to present MeSH and the concept of a controlled vocabulary in a familiar and relevant context for the course's Generation Y student population and provided students the opportunity to actively participate in creating their education. At the conclusion of these activities, students were surveyed regarding the clarity of the presentation of the MeSH vocabulary. Analysis of survey responses indicated that 46% found the concept of MeSH to be the clearest concept presented in the in-class intervention.

STATEMENT OF CASE
The National Institutes of Health defines controlled vocabulary as: "A system of terms, involving, e.g., definitions, hierarchical structure, and cross-references, that is used to index and retrieve a body of literature in a bibliographic, factual, or other database" [1]. Perhaps the best-known controlled vocabulary is the National Library of Medicine's (NLM's) Medical Subject Headings (MeSH), which is used to index the premier biomedical database, MEDLINE. The use of MeSH is essential for health care professionals when they search the biomedical literature [2]. Furthermore, failure to use MeSH can be a key reason that a search fails [3]. Unfortunately, the concept of a controlled vocabulary, including MeSH, is challenging to teach and difficult to master [2].
At Boston University Medical Center's Alumni Medical Library (AML), librarians teach the concept of MeSH and its utility to more than 4,000 patrons annually through the library's information literacy program. During these education sessions, librarians introduce and demonstrate the use of controlled vocabulary, specifically MeSH, in the context of searching MEDLINE. Librarians explain the structure of the MeSH hierarchy and the indexing processes as they perform a search. In the majority of these sessions, students follow along at their own computers, which reinforces the instruction with hands-on practice. Nonetheless, librarians have struggled to explain MeSH without resorting to library jargon, and patrons often have had difficulty understanding and applying the complicated concept of a controlled vocabulary when searching the biomedical literature.
The importance of controlled vocabulary in searching the biomedical literature led the library's education team to revise its teaching strategy by creating teaching methods to more effectively convey the MeSH vocabulary. A majority of the participants in the library's education program are in their twenties, and individuals in this age group commonly use Web 2.0 innovations, including social tagging [4]. Librarians explored how social tagging could supplement instruction by requiring students to actively participate in the instruction. The librarians chose social tagging technology as a model because it has been playing an increasingly large role in health-related professional and educational services [5] and it has been found to provide useful tools for positively impacting students' information literacy and librarians' connection with students [6].
Web 2.0 is broadly defined; Giustini explains that it is an Internet technology trend that encourages "the spirit of open sharing and collaboration" among users [7]. After discussing various Web 2.0 technologies such as wikis, blogs, and mashups, the librarians felt that instruction on controlled vocabularies could be improved by incorporating elements of natural language or social tagging. Tagging is "the process of creating labels for online content" [4] utilized by web services such as Flickr and Del.icio.us. Twenty-eight percent of Internet users have tagged online content, and those who are most likely to tag content are between the ages of eighteen and twenty-nine [4]. Therefore, the librarians chose a social tagging-based exercise to provide a familiar context for students to learn the MeSH vocabulary. Other connections between tagging and controlled vocabulary are present in the library literature [8], and it has been proposed that tagging may help engage users in information management [6,9].

BACKGROUND
In the 2007/08 academic year, librarians at AML developed and taught the course, "MS 640: Introduction to Biomedical Information." MS 640 is a 2-credit, letter-graded course required of all students in the master's of arts in medical sciences degree program offered by Boston University School of Medicine's Division of Graduate Medical Sciences. This course was designed by librarians to teach students how to locate, manage, and add to the biomedical literature and to prepare students for further education and for health care careers. Spanning 14 weeks, the course was delivered to 186 students through a combination of small group and large lecture sections led by 5 librarian instructors.
Significant consideration was given to tailoring the course to the students' age group. One theme that was revisited throughout this curriculum planning was the desire to meet students "where they are, so that libraries and librarians are seen as relevant and become part of their experience" [8]. With an average age of twenty-three, the majority of the students were recent college graduates and Generation Y members. Based on student demographics and data reported by the Pew Internet & American Life Project [4], the course designers assumed that many of the students were likely Web 2.0 users. Librarians hypothesized that social tagging would better enable students to understand controlled vocabularies. To test this hypothesis, librarians designed teaching plans and assignments using natural language tagging to teach MeSH.

Pre-class exercise
After the first session, students were required to complete the homework assignment: "What would you call it? An exercise in tagging." This assignment took students approximately twenty minutes to complete and was due before the following in-class session. Presented completely online via interactive forms designed and maintained by the library's web coordinator, this assignment presented students with an image, short movie clip, and a MEDLINE article. These digital objects were stripped of identifying information to prevent biasing students with external information. In this study, the librarians focused their research efforts on tracking the students' progress in tagging just the article. However, students were required to tag all three digital objects for their homework assignment. Furthermore, as this exercise was a component of course activities, students were unaware that this pre-class exercise would be followed up with a post-class evaluation as a component of a research project. This research project was reviewed by the Boston University Institutional Review Board (IRB) and deemed in compliance.
The article selected for the assignment was a humorous piece from the Canadian Medical Association Journal (CMAJ) that explored how the stomach seems to expand to make room for dessert during the holidays [10]. This article was selected for its humor and accessibility to students at all levels. Students were required to supply 5 natural language tags to describe each digital object. This assignment accounted for 5% of the students' grades.
The primary purpose of this assignment was to encourage students to think about the many ways that digital objects can be described. The submitted tags provided user-generated data that could be used to illustrate inconsistencies and disadvantages of natural language description. In this way, the librarians sought to "actively involve learners in their own construction of knowledge" [5], which is a powerful teaching tool.
The assignment was available to the students through a hypertext markup language (HTML) form. Once students submitted their tags using this form, the data were sent to a table in the library's MySQL database. Instructors graded the submissions for completion. Next, all identifying information in the table was stripped to anonymize the tags submitted by the students.
Using Macromedia ColdFusion 8, the web coordinator queried tags submitted by students in the table. Duplicate tags were grouped together and counted to create a weighted list of terms. Lastly, using cascading style sheets (CSS), the web coordinator assigned larger fonts and darker colors to tags that had high counts or frequencies in the table and smaller fonts and lighter colors to those that were few in number. This information was then displayed as a tag cloud. Tag clouds are visual representations of tags, allowing users to view the various terms used to describe a particular information object. The number of times an individual tag was submitted is represented by the size and color of the tag in the cloud display. For example, one tag cloud that was generated from the submitted tags describing the article is depicted in Figure 1.
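The grouping, counting, and styling steps described above can be sketched in a few lines. The library's actual implementation used MySQL, ColdFusion, and CSS and is not reproduced in this article; the Python below is only an illustrative stand-in. The function names, the pixel and lightness ranges, and the choice to fold case and trim whitespace before counting are all assumptions made for the example.

```python
import html
from collections import Counter


def build_tag_cloud(tags, min_px=12, max_px=36):
    """Group duplicate tags, count them, and map each count to a font
    size (larger = more frequent) and a gray lightness (darker = more
    frequent). All numeric ranges here are illustrative assumptions."""
    # Assumption: only case and surrounding whitespace are normalized,
    # so misspellings such as "desert" remain separate tags.
    counts = Counter(t.strip().lower() for t in tags if t.strip())
    if not counts:
        return []
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    cloud = []
    for tag, n in counts.most_common():
        weight = (n - lo) / span if span else 1.0
        font_px = round(min_px + weight * (max_px - min_px))
        lightness = round(70 - weight * 50)  # 70% = light gray, 20% = dark gray
        cloud.append((tag, n, font_px, lightness))
    return cloud


def render_html(cloud):
    """Render the weighted list as inline-styled HTML spans."""
    return " ".join(
        f'<span style="font-size:{px}px;color:hsl(0,0%,{light}%)">'
        f"{html.escape(tag)}</span>"
        for tag, _count, px, light in cloud
    )


if __name__ == "__main__":
    # Invented sample input, not the students' actual submissions.
    sample = ["dessert", "Dessert", "dessert", "desert", "humor"]
    print(render_html(build_tag_cloud(sample)))
```

Scaling linearly between the smallest and largest counts keeps the most frequent tag at the maximum size regardless of class size; other schemes (for example, logarithmic scaling) would work equally well for the teaching purpose described here.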

Intervention
One week after completing the online assignment, students attended a session that introduced MEDLINE using the Ovid interface. Major topics included: NLM indexing, MeSH, Boolean operators, limits, and search revision.
At the beginning of the MEDLINE session, tag clouds corresponding to the three information objects presented in the assignment were displayed to begin a discussion about the advantages and disadvantages of using natural language tags. Students were asked to think about how accurately the large-sized tags described the article along with possible problems associated with this type of description. Three major pitfalls of relying on natural language tags for searching were highlighted: synonymy, spelling mistakes and variations, and specificity [8,11]. During the discussion, these problems were contrasted with traditional indexing using a controlled vocabulary such as MeSH.
• Synonymy: As illustrated by the tag clouds, students submitted a wide variety of synonymous tags. For example, "funny," "humorous," "humor," and "joke" were all tags submitted to describe the article. Librarians used this example to demonstrate that many words can describe the same concept and that these inconsistencies can complicate searching.
• Spelling mistakes and variations: "Dessert" was one of the most common tags used to describe the article. However, the spelling mistake "desert" was submitted by more than fifteen students. By highlighting this common mistake, the librarians were able to demonstrate the likelihood of misspelling words using natural language tags, which makes searching more difficult. Variations in US and British spellings were also addressed as potential problems.
• Specificity: The tag cloud demonstration also allowed librarians to point out that there was great variation in the specificity level of submitted tags. For example, many students selected broad tags such as "dessert" to describe the article, whereas several students submitted narrower tags such as "blueberry pie," which could prove difficult when attempting to locate an article.
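The three pitfalls can be made concrete with a small sketch that contrasts exact-match searching over free-text tags with searching through a mapping to preferred terms. Everything in the example is hypothetical: the item identifiers, the tag sets, and the mapping table are invented for illustration and are not drawn from the students' submissions or from MeSH. Collapsing the narrower tag "blueberry pie" into "dessert" is also a simplification; a real controlled vocabulary would keep the narrower term and relate it hierarchically.

```python
# Hypothetical free-text tags attached to three items (invented data).
ITEM_TAGS = {
    "item-1": {"funny", "dessert"},
    "item-2": {"humorous", "desert"},      # synonym + misspelling
    "item-3": {"humor", "blueberry pie"},  # synonym + narrower term
}

# Hypothetical table mapping variant tags to one preferred term,
# standing in for the role a controlled vocabulary plays.
PREFERRED = {
    "funny": "humor",
    "humorous": "humor",
    "joke": "humor",
    "humor": "humor",
    "desert": "dessert",
    "blueberry pie": "dessert",
    "dessert": "dessert",
}


def search_exact(query):
    """Return items carrying exactly the queried tag."""
    return sorted(item for item, tags in ITEM_TAGS.items() if query in tags)


def search_controlled(query):
    """Map the query and every tag to a preferred term before matching."""
    target = PREFERRED.get(query, query)
    return sorted(
        item
        for item, tags in ITEM_TAGS.items()
        if any(PREFERRED.get(tag, tag) == target for tag in tags)
    )


if __name__ == "__main__":
    print(search_exact("humor"))        # finds only the item tagged "humor"
    print(search_controlled("funny"))   # finds all three items
```

With exact matching, each synonym, misspelling, or narrower tag retrieves a different subset of items; once every variant resolves to one preferred term, a single query retrieves them all.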
The visual presentation and related discussions of the tag clouds took approximately twenty minutes of the MEDLINE session, although the tag clouds and the tags themselves were referenced throughout the session. As depicted in Figure 2, the natural language description provided the librarians with a framework that was familiar to students in order to describe the application of controlled vocabularies to MEDLINE. A major benefit of this strategy was that it allowed the librarians to avoid potentially alienating library science terms, something that has been viewed as an ongoing barrier in bibliographic instruction over the past fifty years [12]. Social tagging tools and vocabulary were specifically utilized to explain four major MEDLINE concepts that had previously been difficult for the librarians to teach and for students to understand:
• PubMed in-process citations: Librarians used Web 2.0 vocabulary to describe PubMed in-process citations as citations that were waiting to be tagged before being included in MEDLINE.
• MeSH Mapping Tool: Using Web 2.0 vocabulary, librarians demonstrated how natural language keywords entered into the search interface were mapped to MeSH terms.
• Scope Note: As the librarians showed students the MeSH Scope Notes, they pointed out that words under the heading "Used For" were like natural language tags. Librarians explained that the "Used For" terms represent the variation in natural language for the MeSH concept and can be entered into the search box to retrieve appropriate MeSH terms.
• Full Record and Explode: The full-record display was described as NLM's structured tag cloud because it provided a visual representation of all the "tags" that were selected to describe the article.
The librarians also took advantage of this visual representation to explain that the "major" or "focus" terms were akin to the tags that would be larger in size in a more traditional tag cloud, as illustrated in Figure 3.
[Figure: Tag cloud. Notice that "dessert," one of the most popular tags, is displayed in a larger font and darker color because of the high number of times that it was submitted by the students.]

Post-class evaluation
Following the hands-on demonstration of MEDLINE, students completed an online evaluation. In this exercise, students retagged the original article. The article was again stripped of any identifying citation information to prevent students from locating the article's citation and using its related MeSH. In contrast to the pre-class exercise where students entered natural language tags, students submitted five MeSH terms that described the article. To complete the post-class exercise, students were encouraged to use Ovid's MeSH mapping tool. The instructors allotted fifteen minutes of class time for this activity. As with the pre-class exercise, the post-class evaluation was an interactive online form that sent the student's tags to the library's database.
After completing the post-class evaluation, students had the option of completing an anonymous online survey that asked them to identify the concepts that they felt were explained most and least clearly. Because this MEDLINE training was part of a fourteen-week course, the instructors used this information in subsequent sessions to target aspects of the search process that students still found difficult.

EVALUATION
Following the pre-class exercise, intervention, and post-class evaluation, librarians retrieved the tags that students submitted for both the pre- and post-class activities. Using MySQL and Macromedia ColdFusion, the web coordinator queried the database for all the tags that were submitted to describe the article. In compliance with IRB regulations, tags for the article from both the pre-class exercise and post-class evaluation were anonymized and stored in a new table in the database.
Following the anonymization process, the librarians used the MySQL COUNT() function to tabulate the number and frequency of natural language tags submitted in the pre-class exercise that were valid MeSH terms. To confirm whether or not tags were valid MeSH terms, each tag was checked using the Ovid MeSH mapping feature. This same process was repeated to determine the number of MeSH terms submitted in the post-class evaluation. Librarians then compared the number of MeSH terms submitted in the pre-class exercise to the number of MeSH terms collected in the post-class evaluation. This comparison was used to assess the value of the instructional intervention. The librarians also examined student tags to determine how many were in agreement with the MeSH terms attached to the article's citation by NLM indexers. These data have been summarized in Table 1.
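The tabulation described above amounts to two counts over the submitted tags: how many are valid MeSH terms, and how many agree with the indexer-assigned terms. The following is a minimal sketch of that arithmetic, not the study's procedure: in the study, validity was checked by hand through the Ovid MeSH mapping feature, whereas here a hypothetical set of headings stands in for that check. The function name and all sample values are invented for illustration.

```python
def tabulate_mesh(tags, valid_mesh, indexer_terms):
    """Count submitted tags that are valid headings and tags that match
    the indexer-assigned headings, with percentages of all submissions."""

    def norm(term):
        # Assumption: comparison ignores case and surrounding whitespace.
        return term.strip().lower()

    valid = {norm(t) for t in valid_mesh}
    assigned = {norm(t) for t in indexer_terms}
    total = len(tags)
    n_valid = sum(1 for t in tags if norm(t) in valid)
    n_agree = sum(1 for t in tags if norm(t) in assigned)

    def pct(n):
        return round(100 * n / total, 1) if total else 0.0

    return {
        "total": total,
        "valid_mesh": n_valid,
        "valid_pct": pct(n_valid),
        "indexer_agreement": n_agree,
        "agreement_pct": pct(n_agree),
    }


if __name__ == "__main__":
    # Invented sample data; not the study's tags or the article's indexing.
    submitted = ["Stomach", "dessert", "Humans", "blueberry pie"]
    headings = {"Stomach", "Humans", "Eating"}   # stand-in for the validity check
    assigned = {"Humans"}                        # stand-in for indexer terms
    print(tabulate_mesh(submitted, headings, assigned))
```

Running the same function on the pre-class and post-class submissions would yield the two percentages that the comparison relies on.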
The data from the optional survey were also collected and analyzed. Results from the survey were separated into 3 major categories: responses that identified the MeSH controlled vocabulary and indexing process as the clearest concept, those that identified MeSH as the least clear concept, and those that did not mention MeSH in their responses. One hundred seventy-one students completed the optional online survey. Seventy-eight (46%) students specifically mentioned MeSH as the "clearest" concept presented in the in-class session. However, 20 students (12%) identified MeSH as the "muddiest" concept in the session. Seventy-three (43%) students did not specify MeSH as either clear or muddy.
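The reported percentages can be checked directly from the counts given above. The short sketch below uses only the three counts stated in the text; the category labels and the function name are chosen for the example.

```python
def summarize(responses):
    """Return {label: (count, whole-number percent)} for survey categories."""
    total = sum(responses.values())
    if total == 0:
        return {label: (n, 0) for label, n in responses.items()}
    return {label: (n, round(100 * n / total)) for label, n in responses.items()}


# Counts as reported in the text: 78 + 20 + 73 = 171 respondents.
survey = {
    "MeSH clearest": 78,
    "MeSH muddiest": 20,
    "MeSH not mentioned": 73,
}

if __name__ == "__main__":
    for label, (count, percent) in summarize(survey).items():
        print(f"{label}: {count} ({percent}%)")
```

The three counts sum to the 171 respondents, and the rounded shares come out to 46%, 12%, and 43%. These total 101% only because each figure is rounded to a whole number independently.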

OUTCOMES
A comparison of the pre-class, post-class, and survey data provided useful information for evaluating the hypothesis, which theorized that integrating social tagging could be used to convey the complex concept of controlled vocabulary in relation to searching the biomedical literature. The pre-class and post-class data demonstrated an increase from 9.2% to 78.2% in the students' ability to recognize and select MeSH terms related to a specified MEDLINE article.
While it is true that students were asked to supply natural language tags for the pre-class exercise (as opposed to MeSH terms in the post-class evaluation), this comparison of the results still has some merit. Students were initially asked to supply natural language tags in the pre-class exercise simply because they had yet to be introduced to MEDLINE and MeSH. When librarians discussed the results of this exercise with students by means of the tag cloud, the fact that students managed to select MeSH terms only 9.2% of the time was used to illustrate the shortcomings of natural language tagging and to generate in-class discussion. In the pre-class exercise, students were asked to "enter five terms or tags that you think best describe the article." The fact that the terms they considered "best" were only 9.2% accurate compared to the professional standard generated a great deal of discussion: If what students thought was "best" turned out to be wrong, how could they conduct better searches? The answer is use of a controlled vocabulary, specifically MeSH. As Lowe and Barnett point out, the ability to utilize MeSH when searching the biomedical literature is crucial [2]. This intervention taught students how to select valid MeSH terms and the importance of a controlled vocabulary. Survey data provided from 171 students indicated that after the in-class intervention and post-class evaluation, 46% of survey respondents (n=78) found MeSH to be a "clear concept," while only 12% (n=20) found MeSH to be a "muddy concept." Instructors had several opportunities throughout the semester to reinforce the concepts that the students identified as "muddy," including controlled vocabularies, and to reuse the Web 2.0 vocabulary.
[Figure: The full record display was described as the National Library of Medicine's structured tag cloud because it provides a visual representation of all the "tags" that were selected to describe the article.]
The use of social tagging provided the librarians with a teaching method that allowed them to "connect to the real world of our client population" [13] and to overcome barriers in library instruction, such as the use of library jargon [12]. The use of social tagging also enabled the students to actively participate in their own learning by supplying the "tags" that fueled subsequent discussions. This student participation enabled librarians to present this complex concept using concrete examples that were familiar to the students. The use of tag clouds provided a means to show, not tell, students about the pitfalls of natural language tagging and the benefits of a controlled vocabulary. Lastly, the social tagging concepts were easily incorporated into the instruction and no outside resources were needed.
A primary limitation of this study was that the collected data did not allow a direct comparison between the students' ability to identify MeSH terms and their search abilities. Students were asked to supply natural language tags, but they were not explicitly told not to use MeSH terms, so it was possible that some students used Ovid's MeSH browser to complete the assignment. Students were also not surveyed to determine any level of familiarity with controlled vocabularies in general or MeSH in particular. Many of the students worked in laboratories or other academic settings before enrolling at Boston University and might have had similar training before attending these library sessions. In the future, an objective evaluation such as a pretest/posttest evaluation of the students' search skills would provide valuable data that could be applied widely.
Upon reviewing the post-class evaluations, the librarians were initially concerned that 12% of students still found the concept of MeSH "muddy." This might be attributed to the fact that librarians only had 10 minutes to introduce the highly complex concept of controlled vocabulary. Also, each individual librarian had freedom in delivering this session, which might have led to some variation in instruction. In addition, the term "muddy" is rather vague and was never clearly defined. Students could choose not to complete the survey without penalty, and some students elected not to share their opinions about the instruction. In the future, all instructors will be asked to follow a script to ensure standardized delivery of instruction. Furthermore, the word "muddy" will be replaced by a well-defined concept and the survey will be required.
The population tested in this research project is also a limitation. The 186 students were all students in a graduate medical sciences program with an average age of 23. Thus, the collected data may or may not be applicable to other student populations, such as medical or public health students. Additionally, it is unclear whether this teaching method would be effective with other groups, such as faculty members whose average ages are higher and who may therefore not be familiar with Web 2.0 technologies.
An additional shortcoming of this study related to evaluation is that students were not restricted to the MeSH terms assigned by NLM for a term to be "counted." In some cases, this meant that students submitted MeSH terms that were unrelated to the article's content. For example, although the majority of MeSH terms selected by the students were applicable, some, such as "Douglas' Pouch," an anatomical term for a recess of the pelvic peritoneum, were also selected. The librarians felt that it was unnecessary to hold students to the standard of the professional indexers. Similarly, some submitted tags were still appropriate to the article, although they were not selected by NLM indexers.

CONCLUSIONS
Starting with the hypothesis that a mutual familiarity with social tagging concepts would enable librarians to use this current technology as a tool to more effectively teach students, the librarians embarked on this project. After reviewing the data, the librarians found that this exercise (including the pre-class activity, intervention, and post-class evaluation) helped clarify the concept of MeSH, which can potentially impact the students' ability to search the biomedical literature. In this exercise, librarians learned the utility of using social tagging technologies in engaging students in creating and applying their own knowledge and the importance of presenting complex concepts in a framework that was familiar to students. This lesson, namely that couching unfamiliar concepts in the context of popular technologies can lead to more effective teaching, is vitally important and will remain long after social tagging technology becomes passé. As new tools are created, they too will be used in information literacy education.