Lessons Learned in Partnerships and Practice: Adopting Open Source Institutional Repository Software

INTRODUCTION After the establishment of the University Archives at the University of Arkansas, Fayetteville, it became apparent that processes needed to be established for collecting, preserving, and providing access to born-digital materials. The University Archivist established partnerships across multiple departments within the Libraries and with faculty and staff of colleges, schools, and administrative units across campus to test open source repository software and develop collections to fulfill this need.
DESCRIPTION OF PROGRAM This case study examines three specific projects and workflows providing access to digital undergraduate honors theses, university serials, and music concert recordings. Lessons learned during the project include successful strategies for partnership formation along with the identification of project processes that need improvement, such as promotion and long-term preservation.
NEXT STEPS AND CONCLUSIONS The campus has transitioned to a proprietary system for the official institutional repository. However, the pilot projects examined in this study filled intermediate needs: providing a group of files and metadata for the official institutional repository and helping the Libraries to evaluate the sustainability of open source platforms. Staff gained experience and identified areas where improvement was needed. The most successful aspect of the project, however, was establishing partnerships that will carry over to the new repository.


INTRODUCTION
Managing digital materials is an everyday task for many librarians and archivists. The University Archives was established at the University of Arkansas in 2010 with one full-time archivist (later a part-time student was added). A collection policy for the archives was developed in 2010, and many of the desired items on the collection policy were only available in born digital format. It was clear that digital materials would play a role in preserving the institutional memory of the University.
This case study will examine the set-up of a platform to support Arkansas's digital archives collecting, using open source software commonly used for institutional repositories. However, the article does not focus on software development but rather on the many other aspects involved in making a repository successful, especially the diverse skills and partnerships that were so important during this project. While these initial pilots did not produce the campus-wide institutional repository, they did provide a foundation and valuable lessons, both in identifying successful practices and in identifying aspects that needed improvement. During this initial phase, the most important lesson learned was the role that partnerships played. Partners assisted with technical development, establishing metadata standards, collecting content, and establishing workflows across every college and school on campus.

LITERATURE REVIEW
There have been many articles over the years discussing different aspects of institutional repositories. This review highlights a few that focus on the soft, non-technical skills found useful during the set-up process and on the partnerships formed during the implementation of an institutional repository (IR) to accomplish a specific purpose.
Institutional repositories are comprised of more than just software. Lagzian, Abrizah, and Wee (2015) identified six factors important to an institutional repository: management, services, technology, self-archive practices, people, and resources. Their survey results showed that "The people and resources factors received the highest mean value in terms of importance" (p. 201). Likewise, Krevit and Crays (2007) list some of the nontechnical aspects they found important in addition to the technical skills as they set up an IR: understanding the political climate of the institution, partnering with political or influential allies, establishing clearly defined goals, and delivering a consistent message (p. 122).
Technical and non-technical expertise can be found in a variety of places by finding interested partners. Bruns, Knight-Davis, Corrigan, and Brantley (2014) discuss participation and cooperation across library departments including "resources from Library Administration, personnel from the Circulation Department and Library Technology Services, and content from the University Archives" (p. 246). University archives often have digital content ready to contribute as many archives are already collecting faculty and student scholarship and as "major reports, publications, departmental newsletters, and key announcements increasingly appear only in electronic format" (Bicknese, 2004, p. 89). Bicknese goes on to suggest "Archivists' experience in selecting records of enduring value should be of interest to committees trying to decide what is appropriate to allow into the on-line repository" (p. 88).
Other articles discuss how partners were identified outside of the library. Krevit and Crays (2007) discuss a partnership between the library and the School of Nursing (SON) where the "Library supplied digital archiving and cataloging expertise and initial funding; the SON supplied faculty participation, staff support, and the test data" (p. 119). SON became a partner to further goals of the school.
To address critical nursing shortages and extend their reach, faculty, researchers, and clinicians at the SON are creating a new pedagogical environment that includes e-learning, archived streaming video, and other educational technologies. The concept of the institutional repository fits this new paradigm very well. (2007, p. 119)
Tosaka, Weng, and Beh (2013) discuss their process of hiring students to provide the programming skills needed to set up the repository.
the content recruited relied heavily on archival materials including those traditionally collected by university archives. It is clear from these articles that librarians, archivists, and technology professionals make good partners when it comes to institutional repositories.

Purpose
Possible goals for an institutional repository discussed at the University of Arkansas included increasing access to the institution's scholarship, fulfilling external funding requirements, archiving internal records, and preserving materials. The main purpose for this initial project was increasing access to the institution's scholarship and providing access to digital university records. Librarians and archivists wanted to provide access to one set of scholarship, undergraduate honors theses, which had only limited access through individual colleges. Also, as more and more items from the University Archives collection policy were only available in digital format, it was essential to the University Archives to have a method to manage these internal records. Previously, the University of Arkansas Libraries used multiple avenues to create digital exhibits of digitized items through the creation of web pages and the use of CONTENTdm. Since there were almost 100 different cataloged titles of university records along with other groups of uncataloged born-digital materials that needed to be managed, using CONTENTdm would have required either setting up individual collections for each title or putting unrelated titles together in the same project. The University Archivist therefore explored alternative options for maintaining and providing access to born-digital content.

Forming Partnerships
Since the inception of the University Archives, faculty and staff members, both from the library and from across campus, have approached the University Archives seeking assistance with managing born-digital content. The University Archives also partnered with library- and campus-based faculty and staff during the creation of a digital repository. Each partner played a distinctive role in the process, from providing content to technical support to assistance with content management.
One of the first partners was the Engineering and Mathematics Librarian, who had been collecting undergraduate honors theses in digital format from the College of Engineering for several years prior to the start of the University Archives. Electronic theses and dissertations are a common place to start and usually comprise a large portion of IRs. As Yakel et al. (2008) note, "they are 'low-hanging fruit' for IRs" (p. 337). Among institutions surveyed by these authors "theses and dissertations (along with other student work) account for 45% of all documents in all IRs" (p. 338). While graduate theses and dissertations had always been collected by the Libraries, undergraduate honors theses had previously been maintained by the individual colleges. Librarians determined that collecting and providing access to undergraduate theses could both help future students write their own theses and share the scholarship of our students with other scholars outside the university.
About the same time, the head of the Libraries' Systems Department approached the University Archivist with a proposal to test a DSpace platform. The University Archivist saw this as the perfect opportunity to evaluate this platform for managing born-digital content. The parties involved agreed that undergraduate theses would be the first test group. While the Libraries' Systems staff set up the DSpace platform, the University Archivist and the assistant to the Engineering and Mathematics Librarian formatted the initial digital files. Most of the files were PDF documents. Word documents were converted to PDF. Fewer than half a dozen of the theses were in paper format, and they were scanned into PDF form. An OCR layer was added using ABBYY FineReader to make the documents full-text searchable. After consulting with other librarians, the University Archivist worked with Systems staff to customize the DSpace interface and to set up communities, sub-communities, and collections to group the files by originating department.
The University Archives next partnered with librarians in the Technical Services Department for the creation of metadata. DSpace primarily uses Dublin Core, but both the Archives and Technical Services wanted the metadata to coordinate with the Libraries' existing online catalog and with the master's theses catalog records. To achieve this, librarians and the archivist co-wrote guidelines for creation of Dublin Core and the mapping of fields to MARC 21 for the individual records.
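A field mapping of this kind can be sketched in a few lines of Python. The guidelines themselves are not reproduced in this article, so the field pairs below are assumptions drawn from common Dublin Core to MARC 21 practice, and the function name `crosswalk` is hypothetical.

```python
# Illustrative crosswalk only: field choices follow common Dublin Core to
# MARC 21 practice, not the Libraries' actual (unpublished) guidelines.
DC_TO_MARC = {
    "dc.contributor.author": ("100", "a"),    # main entry, personal name
    "dc.title": ("245", "a"),                 # title statement
    "dc.date.issued": ("260", "c"),           # date of publication
    "dc.description.abstract": ("520", "a"),  # summary note
    "dc.subject": ("653", "a"),               # uncontrolled index term
    "dc.identifier.uri": ("856", "u"),        # electronic location (URL)
}


def crosswalk(record):
    """Return (tag, subfield code, value) triples sorted by MARC tag.

    `record` maps Dublin Core field names to lists of string values.
    Fields without a mapping are skipped rather than guessed at.
    """
    fields = []
    for dc_name, values in record.items():
        target = DC_TO_MARC.get(dc_name)
        if target is None:
            continue
        tag, code = target
        for value in values:
            fields.append((tag, code, value))
    # sorted() is stable, so repeated fields keep their original order.
    return sorted(fields, key=lambda field: field[0])
```

Keeping the mapping in a single table, separate from the conversion logic, would let catalogers review and revise the field choices without touching the code.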
In the case of the previously collected digital theses, the catalog record and the metadata both had to be created from scratch. These metadata guidelines, along with the organization of communities and sub-communities, were altered slightly after the implementation of Vireo (discussed in the following section), software designed to aid in thesis submission. Vireo uses metadata standards from the Networked Digital Library of Theses and Dissertations (NDLTD). This metadata schema is a version of Dublin Core with additional non-Dublin Core elements, details of which can be found on the organization's website (NDLTD, 2010).
Only a few test files were loaded into the first DSpace instance. The full backlog of undergraduate theses would not be uploaded until after the addition of key partnerships from outside the Libraries (discussed in the following section). After the backlog was uploaded into the second DSpace instance, the University Archivist formed another key partnership with the Libraries' Web Services Department. Staff in the Web Services Department were able to format the Vireo and DSpace user interface to match other University of Arkansas Libraries web pages.

Partnerships outside the Libraries
The University Archivist also established many partnerships outside the Libraries to assist with various tasks. The most important partnership formed as another partnership was dissolving. Early in the project, before the backlog of undergraduate theses was loaded into DSpace, the two members of the Libraries' Systems Department who were working on this project left the University at nearly the same time. Before leaving, the head of the Systems Department made it a point to introduce staff from the campuswide University Information Technology Services (UITS) and staff from the Center for Advanced Spatial Technologies (CAST) to the University Archivist. These two other campus units also were interested in creating a digital repository. Rather than building on the previously set-up installation, UITS quickly set up a new installation of DSpace with a handle system to assign persistent digital identifiers, a component which had not previously been achieved. Past experience with the previous DSpace installation allowed for an easy set-up of collections in the new instance.
Members of UITS met with the University Archivist to discuss the Libraries' goals and the types of electronic records involved. Members at UITS agreed to manage the software and hardware required while members in the Libraries would provide content and metadata and determine the organization of the content. Members from both UITS and the Libraries would work together on the functionality of the user interface. Content would consist of faculty and student scholarship and university records. After learning about our project to provide access to student theses, members of UITS suggested using Vireo software for uploading the undergraduate theses. Vireo is a submission and management system for electronic theses and dissertations "which addresses all steps of the ETD process, from submission to approval by the graduate office to publication in one or more institutional repositories" (Texas Digital Library, 2016). After weighing the available import options, it was decided that the backlog of engineering theses would be loaded directly into DSpace, and, going forward, students would submit theses online using Vireo software. Organization of files was a challenging issue to resolve. Because the original version of Vireo funneled all of the theses into one collection in DSpace, sorting the theses by college was abandoned. The University Archivist still wanted users to be able to view theses by department. To compensate for this, a member of UITS was able to create a custom search screen in DSpace for theses. As a result, patrons were able to search theses by author, advisor, committee member, department, major, and graduation date, in addition to full text.
Making this venture successful required the University Archivist to recruit participation from additional campus partners. One of Vireo's features enables administrators to customize lists and licenses. The Office of the Registrar was able to provide a list of all of the current departments, majors, and degrees on campus. This information was displayed in three separate pick lists for students. In addition, the campus General Counsel became involved by writing a non-exclusive distribution license, which each student has to sign before his or her thesis can be published in the repository.
The service was opened up to all honors students; therefore, partnerships had to be formed with each of the six colleges and schools that administer an undergraduate honors program on campus in addition to the Honors College. While graduate theses all go through the Graduate School, undergraduate theses are administered by each individual college; therefore, the details for each program differ slightly within each college. The Honors College was able to help the University Archives in promoting this new program and connecting the University Archivist with the honors program directors for each college and school on campus. Each honors program director appointed an administrator who was responsible for approving and publishing the theses for the students within that college or school. Administrators within the college have knowledge of the individual requirements for a thesis to be approved by that college or school. Participation from the students is optional in most of the colleges and schools on campus.
The University Archivist conducted training sessions and wrote instructions for the administrators with steps for approving a thesis in Vireo. Group training sessions were held with a few individual sessions for those who could not attend the group sessions. Notably, the individual sessions were more productive than the larger group sessions because each college already had an established procedure for approving theses, and questions had to be addressed with each individual college about how best to integrate this new system with existing practices. By conducting training at an administrator's personal computer, the individual could immediately configure settings in Vireo on his or her own computer to make the process easier.

Workflow, Honors Theses
Two different groups of electronic honors theses had to be addressed during this process. First, there was the group of theses which had been collected by the Engineering and Mathematics Librarian before the electronic submission system was established. Second, there were the theses which would come through Vireo.
The older theses were addressed first. The batch import process was found to be time consuming to format, so a student was employed to upload each thesis digital file into DSpace with title as the only metadata field. The titles and corresponding URLs were exported from DSpace into a spreadsheet which was then formatted to contain all of the fields required for the metadata. Members in the Technical Services Department then filled in the remaining fields on the spreadsheet. The spreadsheet was sent back to the University Archivist, and the complete metadata was imported into DSpace. Members of the Technical Services Department then added additional fields and uploaded the records into the catalog.
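The spreadsheet round trip described above amounts to joining two tables on a shared identifier. The sketch below illustrates that step only; the column names (`id`, `dc.title`, and so on) and the function `merge_metadata` are assumptions for illustration, not the actual spreadsheet layout used in the project.

```python
import csv
import io


def merge_metadata(exported_csv, cataloged_csv, key="id"):
    """Merge cataloger-supplied columns into the exported rows.

    Both arguments are CSV text. Rows are matched on `key`; a value from
    the cataloged sheet replaces the exported one only when it is non-empty.
    Returns merged CSV text with the union of both sheets' columns.
    """
    exported = list(csv.DictReader(io.StringIO(exported_csv)))
    cataloged = {
        row[key]: row for row in csv.DictReader(io.StringIO(cataloged_csv))
    }

    # Preserve the exported column order, then append any new columns.
    columns = list(exported[0].keys()) if exported else [key]
    for row in cataloged.values():
        for name in row:
            if name not in columns:
                columns.append(name)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in exported:
        extra = cataloged.get(row[key], {})
        merged = dict(row)
        merged.update({k: v for k, v in extra.items() if v})
        writer.writerow(merged)
    return out.getvalue()
```

Rows that the catalogers have not yet completed pass through with empty cells, so a partially finished sheet can be merged and checked before the final import.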
In contrast, the workflow for uploading new theses through Vireo was less time consuming for the Libraries. However, the entire process from start to finish involved the participation of students, faculty, and administrators across campus. Each undergraduate honors student who chose to participate navigated to the online submission system, where he or she filled in an online form, which would become the metadata, and then uploaded his or her thesis. In an effort to keep the metadata uniform, the metadata fields requiring controlled vocabulary used pick lists within Vireo so that students chose from a list for fields such as college, department, major, and degree. Library staff also created a training video which walked through each step of the process. Once the student finished the submission, the system automatically sent an email to the student's thesis advisor with a link to approve the version submitted. Administrators within the colleges collected paper non-exclusive distribution licenses from the students. The thesis was published to DSpace by the college administrator and was then publicly available. The administrators sent the paper licenses to the University Archives to maintain. The University Archivist reviewed the theses which were under an embargo through the Vireo administrative interface and published these once the embargos expired. When the theses were approved for a particular semester, the University Archivist exported the metadata from DSpace and sent a spreadsheet to the Technical Services Department, where a librarian transformed the metadata into MARC records, which were then imported into the online catalog.
Since procedures differ slightly in colleges and schools across campus, each college and school is responsible for contacting their honors students who will be graduating each semester with instructions for submitting an honors thesis. University Archives provided a consistent web page with instructions and an instructional video.

Workflow, Digital Serials
When it was discovered that DSpace could not accommodate the advanced file types for geospatial data used by CAST, UITS began to look for software to accommodate these needs. They suggested testing Islandora, which is a Fedora repository with a Drupal interface. After reviewing the new platform, the University Archivist discovered several additional features Islandora offered which were unavailable with DSpace, including an easy batch upload process, the ability to use different metadata schemas, automatic conversion of audio and video to compressed formats for access purposes with the ability to stream audio and video, and the ability to handle advanced file types and specialized metadata of geospatial images. The University Archivist worked with UITS to customize the interface, and a second repository test began. CAST then directly uploaded a series of geospatial images.
Islandora was found to be useful for additional projects, including digital serials. The Serials unit within the Libraries had collected more than 1,000 files across approximately 100 different serial titles dating back to 2003. The University Archivist chose a small group of files from three different titles to test. Again, the University Archivist worked with librarians from Technical Services to determine metadata standards based on the collection level catalog record for each title.
Islandora has a relatively easy batch import process. A template metadata record was created by the University Archivist for each individual title. Then a student was able to create metadata records for additional issues using the individual title templates. The digital objects and metadata records were then uploaded in batches into Islandora. Once the workflow was established with the test batches, the University Archivist and a student assistant uploaded the backlog of files. The XML metadata template for each title was saved, and then each year new titles may be added using this template. By 2016, there were more than 1,400 separate issues across approximately 90 separate titles available online.
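The template approach can be illustrated with a short Python sketch that copies a title-level XML record and fills in the issue-specific values. The article does not reproduce the serials templates, so the use of MODS here, the element choices, and the sample title and issue values are all assumptions.

```python
import copy
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical title-level template; the real templates are not reproduced
# in the article, so element choices here are assumptions.
TEMPLATE = f"""
<mods:mods xmlns:mods="{MODS_NS}">
  <mods:titleInfo>
    <mods:title>Example Campus Newsletter</mods:title>
    <mods:partNumber/>
  </mods:titleInfo>
  <mods:originInfo>
    <mods:dateIssued/>
  </mods:originInfo>
</mods:mods>
"""


def issue_record(template_root, part_number, date_issued):
    """Copy the title template and fill in the issue-specific elements."""
    record = copy.deepcopy(template_root)
    ns = {"mods": MODS_NS}
    record.find("mods:titleInfo/mods:partNumber", ns).text = part_number
    record.find("mods:originInfo/mods:dateIssued", ns).text = date_issued
    return record


template = ET.fromstring(TEMPLATE.strip())
issues = [("Vol. 1, no. 1", "2003-09"), ("Vol. 1, no. 2", "2003-10")]
records = [issue_record(template, part, date) for part, date in issues]
xml_strings = [ET.tostring(record, encoding="unicode") for record in records]
```

Because each issue record is a deep copy, the saved template stays unchanged and can be reused when later issues arrive.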

Workflow, Concert Recordings
While the serials project was still underway, yet another large project needed to be addressed. The Music Department resolved that they would no longer provide the Libraries with CDs of student and faculty concerts on campus but would only be providing digital files that the Music Department wanted available online. After uploading test records of the recordings in both DSpace and Islandora, it was established that Islandora would be the most appropriate platform based on several features. A key feature of the software for this project was that Islandora created an MP3 file from the WAV file automatically upon ingest. Also, the files could be streamed through the Islandora interface, and the straightforward batch uploading process was an asset.
An informal working group was formed among the University Archivist, librarians and staff from the Performing Arts and Media Library, and librarians from the Technical Services Department to create a workflow for ingestion. As previous concerts on physical media were already being cataloged, the MARC fields already in use were mapped to MODS to create a metadata template. Since each song was recorded as a separate file, each concert was grouped together in an online folder containing all the song tracks along with a PDF of the concert program. Once the workflow was established and the metadata template created, staff in the Performing Arts and Media Library would create the metadata and then work with Technical Services to transform the metadata to MARC and load the records into the catalog.

Importance of Partnerships
These projects could not have been completed without the assistance of many partners using their own areas of expertise. With no dedicated staff for assistance, it was important for the University Archivist to identify others who could assist with either technical developments, collecting, or ingestion of materials. Set-up of the repositories was the most time-consuming set of tasks: working with UITS to design the user interface, learning the software, determining the capabilities of the different programs, establishing organizational schemes, and creating metadata standards. It was important to identify individuals who could provide needed expertise or share the responsibility for important tasks.
For collections that would be added to regularly, it was important to find others to assist with ingestion of files. The Vireo software worked ideally for undergraduate honors theses, allowing students to upload their own files and provide information used for metadata. Then administrators from each college were given access to the administrative side of Vireo. This not only assisted with uploading of materials, but it gave the colleges a certain amount of ownership to the project and allowed each college to continue with customized requirements and workflows. In the case of concert recordings, the Performing Arts and Media Library had already been accepting and processing CDs of campus concert recordings. Staff was able to switch over to the new workflow of processing digital files. Assistance from the Technical Services Department aided with metadata creation and turning that metadata into MARC records to create another access point through the Libraries' online catalog.

Promoting the Repository
One less successful aspect during this phase of the project was promotion and marketing to increase use of the repositories. The weight of effort was spent on setting up and building collections because specific people asked for assistance with providing access to digital materials; however, these groups were only a small portion of the campus. Despite the fact that by 2016 there were more than 1,000 records and objects in the DSpace instance and more than 4,000 objects in the Islandora instance, few knew about the repositories beyond the patrons that requested the services originally. As the University Archives and the digital repositories were both in beginning phases at the same time, the repositories also could have been used to promote digital archiving of university resources and therefore to attempt to recruit additional materials related to the University Archives collection policy.
The Honors College did assist with promotion, publishing a notice in an Honors College publication when the DSpace system was first released in December 2011. The repository saw an increase in views, mainly from on campus, the following April. The views of the undergraduate honors theses have consistently increased around April and May, when the repository also sees the greatest number of submissions. Additional promotion could aid in providing information to students to recruit more submissions. In a study of Nigerian agriculture faculty, Bamigbola (2014) stated "one of the major challenges to the realization of its full potentials is content recruitment" (p. 506). Promotion can create more use, and more use of the repository helps to justify the allocation of resources for its maintenance.
The Islandora repository has never been officially promoted because the future of the repository's sustainability remains uncertain; it remains a relatively unknown resource, unfortunately. If this resource had been promoted, it might have been more widely used even while decisions were being made concerning long term sustainability. A digital repository is a living resource that will continually grow and change. Promotional efforts do not need to wait on a finished product because, as an evolving collection, it should never be finished.

Preservation
Another aspect that should have played a larger role from the inception of the project was digital preservation. In the initial phase, access was the main focus with only minimal steps taken toward preservation. Since access is the more visible and in-demand aspect, it has proven more difficult to implement digital preservation practices in a later phase, when additional tools and server space must be acquired for tasks which are primarily unseen. Ideally, the system should have been designed around the Open Archival Information System (OAIS) model described by Lavoie (2000). OAIS is a high-level model depicting the combination of tasks carried out by humans and technology for the purpose of digital archiving, including digital preservation. Preservation and access should have been developed together to function as a unified process.
Very basic steps toward preservation were taken from the beginning of this project. The archivist assessed the formats of all files which had already been transferred to the Libraries. Files which were not already in a preservation format were converted. In the case where new files were being created, such as with the concert recordings, the archivist requested that the files be transferred as WAV files for preservation purposes. Since the repository was being managed by the university's central IT department, it had the advantage of being on the same back-up system as other campus data. However, digital preservation requires more than just a back-up. "Preservation of electronic records requires a commitment to active preservation practices including migration, refreshing, and integrity and authenticity checks of stored digital records" (Peters, 2006, pp. 22-23). Only the files in DSpace were being monitored at the bit level for changes by periodically validating checksums. Williams and Berilla (2015) noted:
With limited funds and staff…practicality often trumps theory, and a middle ground of digital content management must be contemplated. Because of their own idiosyncrasies, institutions must cherry pick among best practices for what works for them. In essence, every institution must develop a unique plan. (p. 88)
A full digital preservation plan still is needed to work toward the OAIS model and to monitor the files over time. More importantly, preservation requires a commitment by the institution to maintain funding needed to preserve digital objects into the future. Depending on the preservation methods being used, funding could be needed for staff, equipment, storage space, and/or outsourced services.
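The bit-level monitoring mentioned above, periodic checksum validation, can be approximated for files held outside DSpace with a small script. The following is a minimal sketch using only the Python standard library; it is not the checker built into DSpace, and the choice of SHA-256 and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Record a checksum for every file under `root`, keyed by relative path."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Compare current files to a stored manifest.

    Returns (changed, missing): relative paths whose checksum differs and
    paths listed in the manifest that no longer exist.
    """
    root = Path(root)
    changed, missing = [], []
    for rel_path, expected in sorted(manifest.items()):
        target = root / rel_path
        if not target.is_file():
            missing.append(rel_path)
        elif sha256_of(target) != expected:
            changed.append(rel_path)
    return changed, missing
```

In practice the manifest would be saved at ingest (for example, as a JSON file stored apart from the content) and `verify` run on a schedule. A check of this kind detects change or loss but does not repair it, so it complements back-ups rather than replacing them.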

NEXT STEPS
While other smaller projects were completed in addition to those described in this article, such as adding some faculty scholarship to DSpace, the repository as a whole was not adopted as the institutional repository. The two developers for this project in UITS who were managing all technical aspects of the repository are no longer with the University. While UITS has continued to maintain these repositories, support was reduced from two full-time staff members to a single staff member who took on the work in addition to his regular duties. Nevertheless, support from UITS remains a valuable and essential part of the initiative. While the repositories continue to grow in content, the pace of software updates and new technological improvements has slowed with the decrease in staffing. The ability to sustain the open source platform with decreased support remains in question.
In late 2014, library administration decided that proprietary software might be a better fit for our campus, and the next year, steps were taken to begin the set-up of a Bepress repository. The University Archivist tested some of the content from the DSpace repository in the new Bepress repository. After a successful test, the University Archivist and another staff member migrated all content from the DSpace platform to the Bepress platform. In the spring semester of 2016, undergraduate honors students began uploading theses directly to Bepress rather than uploading in Vireo to publish in DSpace. Also in 2016, two staff members were hired for a new department, the Office of Scholarly Communications, to manage the institutional repository. This new department is physically located within the Libraries, and the head of that office reports to both the Dean of Libraries and to the Vice Provost for Research and Economic Development. The University Archivist continued to be involved during the setup of the new repository.
In addition to the materials transferred from DSpace, the staff managing the new Bepress repository are adding graduate theses and dissertations and expanding faculty and student scholarship. The DSpace and Vireo instances are being retired. The University Archivist extracted the authors' digital distribution licenses, file metadata, and submission logs from DSpace and Vireo. These files will be stored on a server for reference. It was determined that digital university records will remain separate at present, so these records have remained in the Islandora repository, with major decisions for the program still pending. The University Archivist continues to work with partners across campus in the changing environment.

CONCLUSION AND LESSONS LEARNED
While the open source DSpace and Islandora repositories adopted during this early phase did not become the campus-wide institutional repository, they did fulfill an immediate need, providing access to born-digital materials for five years. They also provided valuable experience in managing an institutional repository, allowed staff to form valuable connections both inside and outside the Libraries, and helped the campus evaluate the sustainability of using open source software.
While the campus was still making decisions on issues such as platform, management, and location of the repository in the organizational scheme, the University Archives was able to fill an immediate need for several groups on campus trying to manage digital material. Honors programs had a method to collect and provide access to digital honors theses for the first time. The Music Department was able to transition to digital recordings of concerts. Multiple faculty members were able to provide access to research outside of traditional publishing. Digital university serials that had only existed on a server became available worldwide. These materials demonstrated a need for this type of service and the multiple types of materials that would need to be accommodated. They also provided a base of files and metadata to seed the campus-wide institutional repository.
This initial phase provided valuable experience, revealing both areas where improvements could be made and areas that were successful. Procedures could be improved in promoting the repository and in digital preservation policies. One of the strongest outcomes of this project was the partnerships formed, which will carry over into the new repository. Forming partnerships and establishing communication channels between departments to create cross-campus workflows was a much bigger challenge, and a more significant long-term accomplishment, than learning to use software. Within the Libraries, the University Archivist found partners in Systems, Technical Services (Cataloging and Serials), the Performing Arts and Media Library, and the Engineering and Mathematics Librarian. Outside the Libraries, the University Archivist formed partnerships with University Information Technology Services, the Center for Advanced Spatial Technologies, the Music Department, the Registrar's Office, the Honors College, University Council, and administrators of the honors programs in every college and school on campus. These partnerships continue to grow as repository staff move into the next phase of the project.
These partnerships with content creators, technical support, catalogers, and university staff and administrators were by far the most valuable outcome of this initial phase of the project. People, not technologies, are the greatest asset in any project of this size and complexity. Identifying people with the skills needed to make a system work (metadata, cataloging, organization of information, promotion, marketing, and digital archiving) was key to this project's initial success.