Leave the Browser Behind: Placing Discovery Within the User’s Workflow


Introduction
Since the mid-1990s, the Web has been a center for information discovery, retrieval, and creation. Seemingly overnight, the Web changed how library users found and used information. Where once there was a physical door to the library, there is now an electronic door in the form of the library's web site, the gateway to library subscription databases and other discovery tools. Yet as the Web (and information technology in general) has matured, finding and retrieving information has become more fragmented. Not so long ago, users had two choices: get it from the shelf or find it electronically via a database. There are now far more options for finding information, including globally available search tools like Google Scholar and academic communities such as Academia.edu and ResearchGate.
There are also more options for storing and sharing information. Users may adopt citation management software, such as Mendeley, Zotero, or Endnote, to store, annotate, and cite works. Broadly speaking, citation management software has the capacity to change how users manage their scholarly information collections. It provides a central place for several activities in the scholarly workflow: storing, annotating, citing, and sharing. At present, discovery of new materials is available within these tools, but user adoption of the discovery options is not high, and the options come with a learning curve.
This article explores the concept of embedding discovery within citation management software. First, we look at the most frequently adopted citation management software in current use. We review efforts to place information retrieval within citation management software, and explore related literature on situating discovery more intuitively for researchers. Building upon this prior research, we describe the results of usability testing and post-usability testing interviews focused on new enhancements added to the citation management software Zotero. The enhancements center in part on embedding discovery (finding new information) within the Zotero interface. This research follows the findings from a qualitative, ethnographic study funded by the Andrew W. Mellon Foundation, analyzing how users at Penn State University, University Park, find, store, annotate, cite, and share information resources. 3 The findings from this study indicated that two pervasive areas of disconnect exist within the scholarly workflow: discovering and saving new materials, and archiving self-authored work. We share our recent usability testing results, which confirm that faculty users welcome access to library resources within software significant to their individual research workflow. Our usability testing also found a desire for more automated services (anticipatory completion of citations, analysis of existing bibliographies), as well as an enthusiasm for commercial services such as Google Scholar and Academia.edu.
As discovery shifts away from web-based platforms and fully situates within software and other platforms, libraries must begin locating discovery services within these tools as well. This choice will help users more intuitively find relevant information sources while connecting those resources to other critical phases in the scholarly workflow, thereby maximizing productivity and minimizing information loss.

Review of Discovery Tools Within Citation Management Software
Like the Web, electronic citation management software became more readily available in the 1990s, and has experienced increased user adoption since that time. Originally envisioned primarily as a tool for citing sources and building bibliographies, citation management software has recently begun to increase its scope. Scholarly publishers, such as Elsevier and Nature, have already begun to see the utility of embedded discovery within citation management software, and have developed discovery options within the native interface.

Endnote
One of the oldest reference managers, Endnote has been in existence since the 1990s, and remains the most traditional software program of its kind. Endnote is produced by Clarivate Analytics, and as such, is embedded within another company product, Web of Science. 4
interface. These feeds can be for a search, a journal title (and affiliated new articles) or other relevant syndicated content. The user views the feeds within the Zotero interface, and selects specific items from the feed to add to their Zotero library. There is a complexity to this feature, in that a user must know how to create and put to use an RSS feed. As this is a very new service (still in beta testing) within Zotero, its utility and level of usage remain to be seen.

ReadCube
ReadCube, software developed in 2011 by Labtiva and Digital Science, is a reference management program with search at the center of the interface. New users are encouraged to deposit their PDF collections within ReadCube, which then processes the articles by DOI and other metadata, and (in the paid version) allows users to automatically connect with and retrieve articles citing and cited by articles in the user's collections. In other words, ReadCube places discovery of new materials entirely within the user interface. ReadCube also has agreements with several large publishers, including Nature Publishing Group, Frontiers, and Wiley Publishing, to feature their journal articles (including the option to purchase access to articles) within its interface. The user has the option to search several different catalogs from within the ReadCube interface, including Google Scholar, PubMed, and the ReadCube catalog. The option to authenticate with an institution-specific SFX proxy to aid in full text discovery is also available. ReadCube also recently acquired Papers, another reference management tool featuring embedded search options. 7

Other Citation Managers
A wide range of reference management systems exist for users with specific needs. ProQuest-owned Refworks is an older tool, created in 2001 and entirely Web-based. Marketed heavily to libraries, access to Refworks can be embedded within library databases. Refworks also takes institutional affiliation and employs it within the interface as an institution-specific link to full text articles. 8 Sente reference management software is Mac only, and features an embedded browser directly within the interface. The user does not need to leave the Sente environment to search for and retrieve resources. 9 Sente's 'targeted browsing' features allow users to search for information on supported websites, seeing automatically which articles are already in their library. Within Sente, options also include the ability to automate regular searches of selected databases (PubMed, Web of Science, Z39.50 tools). Papers 3, also Mac only software, features an internal browser as well. 10 While discovery is more separate in Sente and Papers 3, it is integrated within the software in a manner that allows users to find, store, annotate, and cite from one tool. Perhaps because they are Mac only, Sente and Papers 3 have not experienced the adoption levels of Mendeley, Zotero, and Endnote. Other tools with smaller user bases (not studied in this article) include BibDesk, JabRef, Citavi, CiteULike, BibTeX, and Connotea.

Analysis of Discovery within Citation Management Software
Integration of library discovery services within citation management software remains limited. Of the tools detailed in this article, few offer options to link directly from the reference management interface to library databases or other services. ReadCube provides perhaps the best access, asking users explicitly for their institutional affiliation and authentication information. ReadCube also prompts the user to log in for authentication as the article retrieval process automatically begins within the interface. Endnote provides the opportunity to enter an authentication URL and an SFX resolver in preferences to aid in connecting with article PDFs via 'Find Full Text'. Endnote's Z39.50 search of library catalogs and other resources is rudimentary and perhaps one of the first examples of a search tool within citation software. Endnote also supports searching selected subscription databases (although this feature typically only works via a private subscription rather than an institutional one).
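To make the resolver-based linking described above more concrete, the following is a minimal sketch of how a citation manager might hand a stored citation to an institutional link resolver using OpenURL (ANSI/NISO Z39.88-2004) key/value parameters. The resolver address, the function name, and the sample DOI are hypothetical placeholders for illustration; this is not taken from how Endnote or ReadCube actually implement the feature.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, doi=None, article_title=None, journal_title=None):
    """Build an OpenURL 1.0 (Z39.88-2004) query for a journal article.

    resolver_base is the institution's link resolver address; the value
    used in the example below is a made-up placeholder.
    """
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
    ]
    if doi:
        params.append(("rft_id", f"info:doi/{doi}"))
    if article_title:
        params.append(("rft.atitle", article_title))
    if journal_title:
        params.append(("rft.jtitle", journal_title))
    # urlencode percent-encodes reserved characters in the values.
    return f"{resolver_base}?{urlencode(params)}"


# Hypothetical resolver address and citation data.
url = build_openurl(
    "https://resolver.example.edu/openurl",
    doi="10.1000/example",
    article_title="An Example Article",
)
print(url)
```

The appeal of this design is that the citation manager only needs one institution-specific setting (the resolver address); the resolver itself decides which licensed copy the user is entitled to.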
Zotero features RSS feeds for discovery, which could be generated from library subscription databases. Mendeley offers the option (on their web interface) to enter an authentication URL and connect through the library to subscription articles. While this option was changed to only provide a DOI search to a journal provider, Mendeley pledged to change this back in the near future. 11 (As of March 2017, this feature has not yet been added back into Mendeley.)
Mendeley also features a newer service, Mendeley Suggest, which analyzes the user's library and offers recommended citations based on the user's library data. ReadCube offers a similar service as well, and their enhanced PDF optimization makes references within an individual article clickable, simplifying retrieval of related works. If anything, the current development trajectory for reference management software indicates that emphasis on the journal provider, rather than the user's academic institution, will remain at the forefront in the near future. Work and advocacy are needed from academic libraries to ensure greater recognition of the role of library services in reference management software use.

Literature on discovery and what users need.
Ithaka S&R has been a major source for research on faculty members' research behavior and discovery practices. The most recent report, the 'Ithaka S&R Faculty Survey 2015', looks once again at where researchers begin their research, a question that has been explored in this series since 2003. 12 In previous years, faculty were more likely to use a discipline-specific, electronic subscription database (such as Web of Science) than they were to use a broader search tool (such as Google or Google Scholar). The 2015 report notes a shift towards the broader search tools, with faculty now equally likely to use a broader search engine or a subscription database to serve their research needs. However, the report also discusses the rise in use of the library web site as a discovery portal (a trend which has been on the rise since 2012, perhaps coinciding with the library web development trend towards creating a front page that functions primarily as a search interface). 13
In "Meeting researchers where they start: Streamlining access to scholarly resources", Roger Schonfeld outlines the six areas of failure relative to discovery and libraries, including the difficulty of off campus access and the shrinking profile of the library web site as a singular search destination. He notes that "Mechanisms for content access succeed only when they conform to Lorcan Dempsey's observation that 'discovery happens elsewhere.' Authentication and authorization to licensed e-resources must work effectively without regard to the researcher's starting point." 14 Further, Schonfeld argues that "To understand researcher practices, user experience specialists both in a library and a content provider setting should examine the researchers' actual practices. Rather than trying to focus on specific tasks related to the system that their current project covers, as is all too often the approach taken, a more holistic, ethnographic perspective is vital." 15
Reference management software.
Research articles on citation management software (from within a library-focused lens) have focused primarily on the library's provision of support for citation management software users. Less common are articles looking at the software itself, and opportunities for embedding library services, including discovery options.
"As the process of citation management changed, the social dimension became more important, as citations are 'social objects' around which connections can be made. Today, citation management programs not only provide a repository in which you can store your work, but also allow you to share your work and to search the work of others." 16 Dempsey notes the significance of citation management software for academic libraries beyond simply supporting use of these products: "As some of these researcher-facing "productivity" services are repackaged as licensed institutional offers, libraries will face important decisions about sourcing and procurement of workflow support services." 17
Discovery happening elsewhere.
In "Thirteen Ways of Looking at Libraries, Discovery, and the Catalog: Scale, Workflow, Attention", Dempsey discusses the movement of user search behaviors from the local library catalog to the "network scale", for example, searching Worldcat.org, a consolidated catalog, for a locally owned book. In this environment, he notes that "syndication and leveraging strategies" are needed, including connecting to networked resources, such as link resolver recognition within Mendeley or Google Scholar. 18 In this respect, the local catalog remains the data source, but the user accesses the data via a more globally available resource. Dempsey states, "The use and mobilization of bibliographic data and services outside the library catalog is an increasingly important part of library activity. This is especially important as "discovery increasingly happens elsewhere" -in other environments than in the library." 19 "Thinking the Unthinkable: A Library Without a Catalogue? Reconsidering the Future of Discovery Tools for Utrecht University Library" describes the challenges facing libraries with regard to search and the significance of a destination web presence. 20 Citing user study trends, the Utrecht University Library decided to implement several approaches with regard to discovery: the decision not to implement a large discovery search tool; a commitment to embed library collection-related metadata in global initiatives; and re-envisioning the local online public catalog as a tool primarily for known item searches.
Grant described discovery within context-specific software as a "knowledge creation platform" (KCP). 21 He highlighted the components of this platform as discovery, social networking, ready access to library expertise and services, and, perhaps most significant, "integrated tools for creating new knowledge." Grant further specified that the knowledge creation tools "should cleanly integrate within the interface of the KCP so that again, the enduser does not need to step out of the interface in order to actively work on their research or assignment." 22

Methodology and Prior Study Results
In 2012, the author received a grant from the Andrew W. Mellon Foundation to conduct research on faculty management of information within the scholarly workflow, including discovery and self-archiving of significant works. 23 The results of the 2012 study on faculty scholarly workflow were shared in the article, "Personal Library Curation: An Ethnographic Study of Scholars' Information Practices." 25 The article presents the results of a 2012 web-based survey of scholars (n=196), as well as the analysis of ethnographic interviews with 23 Penn State faculty members during the same time period. The survey and interviews indicated, across faculty, a preference for electronic searches for information sources, more often using commercial sources (such as Google or Google Scholar) than local, library-based resources (although Humanities researchers were more likely to start with library databases). Faculty also relied heavily on their own personal collections of article PDFs and data. With regard to citation management software, the 2012 study found limited use: slightly more than 50% of surveyed Penn State faculty in the Sciences, and 30% in the Humanities, used the software. Faculty noted dissatisfaction with citation management software as a reason for non-adoption. When queried, a majority of survey respondents indicated that the responsibility for education on workflow management resided with the scholar, and not the library or campus librarians. With the results of the survey and interviews combined, the study found overall that faculty experience a pervasive disconnect between the activities of finding information (typically in a web-based, commercial service) and annotating and citing the information (within Microsoft Word or citation management software).
Similarly, the act of archiving was also disconnected from the research process, with a majority of respondents indicating that they had lost important files or data. With this portrait of a disconnect existing within the scholarly workflow, particularly within the areas of discovery and self-archiving, the 2014 study was created to begin to explore and address this need.

Study on Software Optimizations and Impact on the Scholarly Workflow
In the second stage of our study, we conducted usability testing of software optimizations. This usability testing was focused on new enhancements in the areas of discovery and archiving added to the citation management software Zotero. The author worked with Zotero software developers to embed new functionality within Zotero, based on the findings regarding the disconnectedness of the faculty scholarly workflow within the first phase of this study. Two specific enhancements were added to Zotero (and as of 2016 are publicly available to all users) as a result of the initial study. 26 The first, addition of RSS feeds, addressed issues with the disconnected nature of discovery in relation to citation management software. The capability to add RSS feeds of any kind (including those pointing to journal level table of contents or targeted article or database keyword searches) was embedded within the Zotero interface. The user
Camtasia recordings were created of the respondents' paths on the computer during usability testing, and audio recordings and transcripts were created for the sessions as well.
In general, usability testing subjects were not able to independently navigate the new enhancements within the Zotero interface. The option to embed RSS feeds is located under a very small 'box' icon within the Firefox Zotero interface under the unrelated ability to create new groups within a Zotero library. In other words, the user must hunt for this option. Once they have found it, the user must manually input an RSS feed URL; there is no automation to help users identify relevant feeds or automatically receive and select URLs. This also proved, across the board, to be a challenge for subjects. In order to provide a feed URL, users had to leave the Zotero interface, navigate to a relevant information source (such as a journal homepage or a scholarly database), and then intuit how to retrieve an RSS feed. Once a feed was found and put into Zotero, a majority of subjects were pleased with the results, noting how useful it was to receive new articles from within the Zotero interface.
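For readers unfamiliar with what a feed delivers once its URL has been entered, the sketch below shows, under simplifying assumptions, how the items in an RSS 2.0 feed can be read into title and link pairs that a user could then choose to save. The sample feed, its URLs, and the function name are invented for illustration, and the code does not reflect Zotero's actual implementation.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for a journal's table-of-contents feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal: Latest Articles</title>
    <item>
      <title>First example article</title>
      <link>https://example.org/articles/1</link>
    </item>
    <item>
      <title>Second example article</title>
      <link>https://example.org/articles/2</link>
    </item>
  </channel>
</rss>"""


def parse_feed_items(feed_xml):
    """Return a list of {'title', 'link'} dicts, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items


items = parse_feed_items(SAMPLE_FEED)
for entry in items:
    print(entry["title"], "->", entry["link"])
```

In a real client the feed text would be fetched from the URL the user supplied, which is exactly the step the test subjects found difficult to complete on their own.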
Similarly, it was very hard for subjects to connect the Zotero 'My Publications' folder with ScholarSphere (Penn State's institutional repository). The architecture of this connection required a Zotero user to navigate to ScholarSphere and then (from within ScholarSphere, on their individual account profile page) connect their ScholarSphere profile to Zotero. There is currently no mechanism to initiate the connection from within Zotero itself.
While a majority of usability subjects were pleased with the new Zotero 'My Publications' folder optimization, most subjects were not sure why they would want to also connect their profile with a local Penn State service such as ScholarSphere.

Post-usability interview findings
In post-usability interviews, a majority of subjects referenced the disconnectedness of their scholarly workflow, and indicated the need for increased discovery options within the citation management software interface. General trends that emerged across the interviews and testing addressed 'smarter' functionality within software (citation management software and word processing software), the value of commercial, broadly available scholarly services (such as Google Scholar, Academia.edu, and ResearchGate), and a perceived lack of value for local storage and social networking services, including the institutional repository. While there are detailed findings in all of these areas, we will focus in this article on the findings related primarily to discovery within the Zotero interface and within the workflow as a whole.

Automating and Connecting the Scholarly Workflow
In general, multiple respondents desired greater automation of the scholarly research process. Several subjects mentioned the ability for citation management software to automatically 'complete' incomplete citations (according to individual citation style needs) without intervention or direction from the user. Another subject mentioned 'anticipatory' automation, where (within Microsoft Word) the citation management software would automatically complete a citation based on the references discussed in a specific paragraph. A tenured faculty member shared a 'wish list' of optimizations that he termed "customizable automation": these included natural language searching, validation of localized services from within the tool, and notifications when new citations are found so that the user can validate entries.

Discovery Feeds
Four of the subjects interviewed in the study indicated that they already receive new content alerts (for new journal articles, etc.) in their email accounts. All of the interview subjects were positive about the new discovery feeds within Zotero. The graduate students in our study preferred to have email alerts continue, in addition to receiving new citation feeds within Zotero. One graduate student noted that this dual notification would be a good reminder to go back into Zotero and engage with new sources. A tenured faculty member said of the utility of the feeds, "I think the main role is obviously discovering a new work. Since I do some variations on that, I would probably use it. It's interesting now that I really think about it. I used to use RSS feeds all the time." This faculty member also noted that "It's much random now than it used to be. It's much more pull rather push." New research (for whatever reason) had stopped flowing naturally to his workflow, and he welcomed an option that might change this.
Another faculty member stated that he saw the feeds as a positive enhancement, yet would not use them within Zotero. He preferred to have his alerts continue to arrive in his email, where he could search his entire email collection to find and retrieve specific items.

Embedded discovery services
All of the usability subjects were enthusiastic about multiple services (including library authentication and content) within the citation management software interface. In one participant's words, it would give her the ability to "multitask on one screen." Another was positive about this, and stipulated that institution-specific authentication must also be integrated into the interface in order for this enhancement to be useful. A faculty member expressed a desire for natural language searching within the citation software, as well as the ability to automatically receive new relevant citations within the interface, with an alert for the user to validate and accept citations. Another faculty member had a more intricate idea for embedded services: analysis of existing bibliographies, combined with embedded discovery and authentication to bring relevant new works automatically to the user. In essence, the idea is that the citation management software looks at publications the user has written (and deposited in their My Publications folder). It extracts the data from each publication's bibliography, and retrieves cited works that are not currently in the user's library, for the user to accept. It also analyzes relevant works (based on the cited works in the bibliography) and asks the user to accept those citations as well.
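The first half of this faculty member's idea (finding works the user has cited but not yet saved) can be sketched as a simple comparison of identifiers. The example below is an illustration only: it assumes each reference already carries a DOI, and the data structures, function names, and sample DOIs are hypothetical rather than drawn from any citation manager.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def missing_cited_works(my_publications, library_dois):
    """List references cited in the user's own publications that are not
    yet in their library, each reported once, in first-seen order."""
    owned = {normalize_doi(d) for d in library_dois}
    seen = set()
    missing = []
    for publication in my_publications:
        for ref in publication["references"]:
            doi = normalize_doi(ref["doi"])
            if doi not in owned and doi not in seen:
                seen.add(doi)
                missing.append(ref)
    return missing


# Hypothetical data: two self-authored papers and the user's current library.
my_publications = [
    {"title": "Paper A", "references": [
        {"title": "Cited work 1", "doi": "10.1000/aaa"},
        {"title": "Cited work 2", "doi": "https://doi.org/10.1000/BBB"},
    ]},
    {"title": "Paper B", "references": [
        {"title": "Cited work 2", "doi": "10.1000/bbb"},
        {"title": "Cited work 3", "doi": "doi:10.1000/ccc"},
    ]},
]
library_dois = ["10.1000/aaa"]

suggestions = missing_cited_works(my_publications, library_dois)
print([ref["title"] for ref in suggestions])
```

The harder parts of the proposal, extracting references from a manuscript and recommending related works beyond those cited, are not addressed by this sketch.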

Discussion and conclusion:
Our post-usability interviews show continued frustration and unmet needs across multiple phases of the scholarly workflow. The participants were uniformly focused on primarily using commercial search and software tools (Mendeley, Zotero, ResearchGate, Google Scholar), and saw the benefits of accessing and utilizing resources on platforms that are not primarily locally developed. They were open to embedded discovery services, and to using these services within citation management software. The biggest user barrier appeared to be learning how to use citation management software and integrate it into one's workflow. Once that was achieved (as was the case with a majority of our participants), the idea of adding on additional services seemed natural and realistic.
What do these findings mean for citation management software designers? For the Zotero designers, there are several clear outcomes to share. Our users as a whole liked the addition of RSS feeds into the Zotero environment. They were positive about the ability to discover new content within the Zotero interface. Despite this enthusiasm, they were largely unable to independently navigate the steps to activate RSS feeds in Zotero. Zotero needs to think about where they situate the RSS feeds within the interface. Our testing shows that it does not work to situate the service underneath an icon also used to create groups. Zotero also needs to look at the mechanism in place to create feeds. Currently the onus is on the user to go out, find a journal or site of significance, navigate to a feed, and bring the feed back to Zotero, populating it in the feed URL window. Perhaps there is a way to automate this service for users, so that a keyword search brings up a range of options from which the user can select different feeds (such as occurrence of keywords in journal title(s), subscription databases of relevance, or other search tools offering relevant results via feed)? Our users expressed a desire for better automation of tasks, across the board. One starting point would be to guide the user through RSS feed creation. Zotero should also consider the primacy of email as an information collection (and as an alert / reminder service) for users, offering the option of emailing users when new citations are found by the feed. Several of our users valued their email collections, and looked to their email as a reminder to return to workflows in more disconnected applications, such as Zotero.
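As a rough illustration of the keyword-driven feed selection proposed above, the sketch below matches a user's keyword against journal titles in a catalog of known feeds and returns candidates for the user to pick from. The catalog, its entries, and the function name are hypothetical; a production version would presumably draw on a maintained directory of journal and database feeds.

```python
def suggest_feeds(keyword, feed_catalog):
    """Return catalog entries whose journal title contains the keyword
    (case-insensitive), so the user can select a feed instead of
    hunting for its URL."""
    needle = keyword.casefold()
    return [entry for entry in feed_catalog
            if needle in entry["journal_title"].casefold()]


# Hypothetical catalog of journal feeds.
feed_catalog = [
    {"journal_title": "Journal of Example Ecology",
     "feed_url": "https://example.org/ecology/rss"},
    {"journal_title": "Example Library Quarterly",
     "feed_url": "https://example.org/library/rss"},
    {"journal_title": "Ecology Letters of Nowhere",
     "feed_url": "https://example.org/ecology-letters/rss"},
]

matches = suggest_feeds("ecology", feed_catalog)
for entry in matches:
    print(entry["journal_title"], "->", entry["feed_url"])
```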
In addition to further enhancements to the RSS feed option, Zotero should also consider ways to mine users' existing citation collections for additional recommended citations. As we previously mentioned, other citation managers have begun this service, including Mendeley Suggest and ReadCube Recommendations. It makes sense to mine the data that the researcher has already deposited to increase the utility of Zotero within the scholarly workflow. The idea suggested by one of our subjects, to pay special attention to importing works already cited in the researcher's publications, would likely be a significant value-added service for users. Customization, automation, and predictive (i.e., smart) services are what the users in our study clearly wanted, and while Zotero is making gains with their new customizations, there is still more work to be done.
There are also recommendations for how Zotero continues to develop integration within word processing software. A majority of our participants indicated that they wanted better integration with word processing software, including infusing discovery into the word processing environment. How could this work? It might mean embedding semantic web capabilities within word processing software, for uses such as predicting or completing citations for the user or finding works attached to quotes used in text. This is a new area of development for citation management software, and one that should be taken seriously.
Like the conclusions leading away from local tool provision in Kortekaas' 'Thinking the Unthinkable', these findings are a clarion call for academic libraries' discovery, storage, and instructional strategies. The significant 'critical mass' of other researchers that one of our subjects referenced is not present on local tools, such as the online catalog or the institutional repository. The general focus of the usability study and post-usability interviews was to determine the utility of discovery and archiving within the Zotero interface. The findings were unanimous among our subjects that localized content and services are welcomed within citation management software. The challenge now is for software providers, publishers, and academic libraries to begin embedding content where our users are rather than where we want them to go (library web sites, publisher web sites, subscription databases). This requires cross-institutional work on the part of academic libraries; large developments like this can't occur on a campus by campus basis. It also means that academic libraries must begin to give up local development of services that are not heavily or intuitively used by their core user groups. It also may be time to move away from local discovery initiatives, including metasearch tools, and embrace search interfaces that are already embedded within the typical user workflow. From an ego, vanity, or branding perspective, this will be difficult for academic libraries. Yet, if it means that the resources a user needs are directly (to borrow a phrase from Lorcan Dempsey) "in the flow" when and where they need them, haven't the library's goals as a content provider been met? 28 With a continued focus on how users find information outside the library web site, and beyond that, outside the web browser, software providers and libraries can begin to close the gap and bring resources to users more easily from directly within their research workflow.