Construction of an International Digital Sharing Platform of Dongba Manuscripts and Dongba Hieroglyphs

To protect, pass on, and share globally the Dongba manuscripts of the Chinese Naxi minority, whose memory and heritage are under threat, this paper proposes ideas and plans for building an international digital sharing platform using computer technology, information processing, online dissemination, multimedia display and other technologies. This platform provides digital resources comprising Dongba manuscripts and related literature, tools for deciphering Dongba manuscripts, an environment for undertaking and sharing research, and dynamic information on research findings and inheritance. The platform is also equipped with an interpretation material base, a database, and a knowledge base on Dongba manuscripts. It also provides access to UNESCO's Memory of the World (MOW) database and a stratified information sharing mechanism in accordance with international intellectual property laws. This platform provides: a channel for resource sharing and academic exchange among the many researchers working on a widely scattered collection of Dongba manuscripts; a means of digitally preserving Dongba manuscripts for posterity, passing down Dongba hieroglyphs, and sharing cooperative research and deliverables on Dongba culture; and a reference model for protecting and promoting cultural heritage at home and abroad.


INTRODUCTION
The Naxi are a Chinese ethnic group with their own pictographic script, the Dongba hieroglyphs, shown in Figure 1. It is the only script of its kind still in use today. The classic works of literature written in Dongba hieroglyphs are collectively referred to as the Dongba manuscripts.
There are more than 1,400 types of extant Dongba manuscripts, comprising over 30,000 volumes. These manuscripts have recorded the historical progress of mankind from the infant stage of human civilization to the highly [...]

An International Digital Sharing Platform of Dongba Manuscripts
In order to protect the Naxi Dongba culture and its manuscripts and pass them on to posterity, an international digital sharing platform of Dongba manuscripts was built through cross-disciplinary research and international cooperation. It is based on modern information technologies, including digital and network-based processing, retrieval, storage and communication. This platform can be used for picture recognition, voice recognition, content deciphering, and the handling and management of information related to the form, pronunciation and meaning of Dongba manuscript files. Moreover, these three types of information can be managed simultaneously. This platform provides a channel for resource sharing and academic exchange among research bodies dealing with a widely scattered collection of Dongba manuscripts, offering not only new channels for protecting and studying Dongba manuscripts, but also a universal reference for the digital protection and dissemination of cultural heritage at home and abroad. [3]

HOLISTIC
Computer, information processing, network communication and multimedia technologies are applied in building the international digital platform for the sharing of Dongba manuscripts. Figure 2 shows the overall structure of this platform, which consists of a digital literature system, a deciphering system for Dongba manuscripts written in hieroglyphs, and an information management system. The digital literature system stores digital resources such as Dongba manuscripts and related literature. The deciphering system provides computer-assisted identification and translation of Dongba manuscripts written in hieroglyphs. The information management system searches and manages large amounts of information on Dongba manuscripts. Through the Internet, this sharing platform establishes channels for resource sharing and academic exchange among research institutions and a widely scattered collection of manuscripts. It also provides access to UNESCO's Memory of the World (MOW) database. Moreover, this platform enables user interaction and content display.
computer systems science & engineering

RESEARCH INTO THE DIGITAL LITERATURE SYSTEM OF DONGBA MANUSCRIPTS
In order to provide local resource files for the database of this sharing platform, work was carried out to explore, collect, study, categorise and develop information-based processes for Dongba manuscripts and related material. This work is described below.

The Exploration and Digital Collection of Dongba Manuscripts and Related Literature
Following in-depth research into the geographic distribution and number of collections at home and abroad, Dongba manuscripts scattered around the world were digitally collected, that is, the paper literature was recorded as digital photo images and those images were converted to digital formats suitable for computer storage. A total of 3378 volumes (of which 1229 were from overseas) were collected from 11 overseas collection institutions, including the British Library, the National Library of France, the National Library of Germany, and the U.S. Library of Congress, and from ten domestic collection institutions, including the Lijiang Dongba Culture Institute and the National Library of China. Various valuable historical works and files on Dongba manuscripts were also digitally collected.

Translation, Deciphering and Audio and Video Recordings of Dongba Manuscripts
Senior Dongba priests were invited to decipher and translate the classic pieces among the collected Dongba manuscripts. In order to faithfully preserve the heritage, audio and video recordings of the whole process were made, using multiple cameras at different angles to simultaneously record facial expressions, tone of voice, content and setting. Documentary videos were made using the source materials, and numerous photos were taken depicting various aspects of Dongba culture (customs, art, traditional artefacts, sacrifices and rituals).

Compilation of Dongba Manuscripts
Various methods were proposed to review and revise the compiled information on Dongba manuscripts. A total of 4524 Dongba manuscript volumes from 18 domestic and overseas collection institutions were compiled. The compiled entries cover: abstract, application, penholder, collection institution and time of collection, historical background, source (region), and writing characteristics.
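The compiled fields listed above can be pictured as one record per volume. The following is only a minimal sketch: the class name, the attribute names, the added `title` field and all values are illustrative assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class ManuscriptRecord:
    """One compiled entry; field names are hypothetical renderings of the listed items."""
    title: str                    # assumed extra field, not in the source list
    abstract: str
    application: str              # assumed to mean the use or occasion of the manuscript
    penholder: str
    collection_institution: str
    time_of_collection: str
    historical_background: str
    source_region: str
    writing_characteristics: str


# Placeholder values only; no real catalogue data is reproduced here.
record = ManuscriptRecord(
    title="Sample volume",
    abstract="Placeholder abstract.",
    application="Placeholder application.",
    penholder="Unknown",
    collection_institution="Example Library",
    time_of_collection="Unknown",
    historical_background="Placeholder background.",
    source_region="Example region",
    writing_characteristics="Placeholder description.",
)
record_dict = asdict(record)  # plain dict, convenient for export or storage
```

Keeping every entry in one flat structure makes it straightforward to serialise the same fields to XML or a database row later.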
To sum up, this system stores abundant digital resources of Dongba manuscripts and related material, achieves long-lasting digital storage and international sharing, and has brought back to China, in digital form, large volumes of Dongba manuscripts held overseas.

RESEARCH INTO BUILDING THE DECIPHERING SYSTEM OF DONGBA HIEROGLYPHS
The senior Dongba priests who can accurately decipher Dongba manuscripts are generally in their 70s, and it is not feasible to rely on oral communication between generations of Dongba priests in order to decipher the manuscripts and pass on the culture. Research into deciphering Dongba hieroglyphs and into computer-assisted identification and translation systems will help to revive and safeguard these endangered "living hieroglyphs" in modern society.

The Building and Utilization of Dongba Characters Identification Database and Knowledge Base
Based on Naxi Pictograph by Fang Guoyu and Moso Hieroglyphs Dictionary by Li Lincan, and on existing systems such as the Naxi Dongba character computer language platform, codes for words, sentences, and event names were developed, providing a means of finding the English that correctly corresponds to the Chinese. Deciphering databases for Dongba manuscripts were proposed and built. These databases cover meanings (characters and words), sentences (verb and object) and events (the history and stories often contained in Dongba manuscripts). Entries were selected and compiled according to deciphering rules. [5] Smart supporting tools were provided for the deciphering database. A Dongba character deduction knowledge base was built around an expert system tied to the deciphering system. This knowledge base brings together the databases of words, sentences and events and manages their information integration rules using smart deduction engines. For example, rule extraction based on knowledge characteristics (Kautz theory), rule streamlining (Dynamic particle theory) and knowledge deduction (the Rete algorithm) were studied separately and implemented to achieve in-depth deduction and to search and access information in the database.
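To illustrate how a deduction engine can combine word-level facts into sentence-level and event-level readings, here is a minimal sketch. It uses naive forward chaining rather than the Rete algorithm mentioned above (Rete is an optimised way of performing the same kind of rule matching), and every fact and rule name is a hypothetical placeholder rather than an entry from the deciphering databases.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): it fires once all premises are known.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical rules for illustration only.
RULES: List[Rule] = [
    (frozenset({"word:W1", "word:W2"}), "sentence:S1"),
    (frozenset({"sentence:S1", "word:W3"}), "event:E1"),
]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain({"word:W1", "word:W2", "word:W3"}, RULES)
```

Here the sentence-level fact is derived first and then enables the event-level rule, which is the layered word, sentence and event deduction the knowledge base is described as supporting.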

Computer Identification Technology of Dongba Hieroglyphs
Dongba hieroglyphs are a primitive pictographic script, and existing English and Chinese identification and translation systems cannot be applied to their picture identification, content deciphering, or the deciphering of their form, pronunciation and meaning. The script is highly complex: one character may have multiple forms, pronunciations and meanings, and characters of different forms can have the same meaning. According to statistical research on Dongba hieroglyphs, the characters tend to depict concrete things with curves, emphasizing their pictographic nature. Given these characteristics, to extract the basic features of Dongba hieroglyphs, researchers have proposed an information processing method that applies topological characteristic processing following projection feature extraction. To further identify hieroglyphs, a character identification method that combines model (template) matching and neural network feedback was developed. [6]
Figure 3 shows the topological characteristics structure used when extracting the features of Dongba hieroglyphs. The left part is a Dongba hieroglyph that means "tomorrow morning", and the right part shows its topological characteristics. This hieroglyph has 5 end points, 22 cross points (including 15 three-way points, 4 four-way points, and 3 two-way points), 3 blocks, and 12 sections. In the figure, O marks end points, 2 marks cross points, → marks blocks, and × marks sections. These data assist with the identification of the corresponding Dongba characters. Because only a few Dongba hieroglyphs have the same topological features, the identification rate of the topological method is 80%. For the remainder that cannot be accurately identified, the projection method is applied. Research shows that, given the characteristics of Dongba hieroglyphs, the extraction method combining topological features and projection is feasible and withstands external interference, enabling a detailed, efficient and more accurate classification and identification of Dongba hieroglyphs.
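The two kinds of features discussed above can be sketched on a tiny binary image. This is an illustration under stated assumptions, not the authors' implementation: it assumes the glyph has already been thinned to a one-pixel-wide skeleton, it classifies pixels with the standard crossing-number test, and the plus-shaped test image is a made-up example rather than a Dongba character.

```python
from typing import Dict, List, Tuple

# 8-neighbourhood offsets in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def to_grid(rows: List[str]) -> List[List[int]]:
    """Convert rows of '#' (ink) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def crossing_number(grid: List[List[int]], r: int, c: int) -> int:
    """Count 0 -> 1 transitions while walking once around pixel (r, c)."""
    h, w = len(grid), len(grid[0])
    ring = [
        grid[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
        for dr, dc in RING
    ]
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def topological_features(grid: List[List[int]]) -> Dict[str, int]:
    """Count end points and three-way / four-way cross points on a skeleton."""
    feats = {"end_points": 0, "three_way": 0, "four_way": 0}
    for r, row in enumerate(grid):
        for c, ink in enumerate(row):
            if not ink:
                continue
            cn = crossing_number(grid, r, c)
            if cn == 1:
                feats["end_points"] += 1
            elif cn == 3:
                feats["three_way"] += 1
            elif cn >= 4:
                feats["four_way"] += 1
    return feats


def projection_features(grid: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Return the ink count of every row and of every column."""
    return [sum(row) for row in grid], [sum(col) for col in zip(*grid)]


# Hypothetical test shape: a plus sign with four arms.
plus = to_grid(["..#..", "..#..", "#####", "..#..", "..#.."])
topo = topological_features(plus)
row_profile, col_profile = projection_features(plus)
```

The topological counts are cheap and do not change when a stroke is stretched or bent, while the projection profiles separate shapes whose counts coincide, which matches the order in which the text applies the two methods.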
To identify hieroglyphs further, features are first extracted from each hieroglyph, and character identification is then carried out on the basis of those features. The model (template) matching method has a simple, fast algorithm with no complex iterative evaluation, and can be used to identify the characters in well-preserved ancient scripts. The neural network feedback method offers smart learning, a high accuracy rate, strong resistance to interference and great adaptability, and can be used to identify characters in less well preserved ancient scripts. A method combining model matching and neural network feedback was therefore developed, which allows the better-suited method to be selected according to the quality of the ancient scripts. Experiments on 1400 Dongba hieroglyphs show that this combined method can ensure both efficiency and accuracy. Even for poorly preserved ancient scripts with faded colors, partial loss and stains, the identification rate can be over 90%.
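The idea of choosing between a fast matcher and a more robust classifier can be sketched as follows. This is a sketch under assumptions: the selection criterion used here (the mismatch ratio of the best template) is an illustrative stand-in for the paper's quality-based selection, the neural network is represented only by a placeholder callable, and the glyph names and bitmaps are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Glyph = List[str]  # equally sized rows of '#' (ink) and '.' (background)


def hamming(a: Glyph, b: Glyph) -> int:
    """Number of pixels at which two same-sized glyph images differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def best_template(glyph: Glyph, templates: Dict[str, Glyph]) -> Tuple[str, float]:
    """Return the closest template label and its mismatch ratio in [0, 1]."""
    size = len(glyph) * len(glyph[0])
    label, dist = min(
        ((name, hamming(glyph, tpl)) for name, tpl in templates.items()),
        key=lambda pair: pair[1],
    )
    return label, dist / size


def identify(
    glyph: Glyph,
    templates: Dict[str, Glyph],
    fallback: Callable[[Glyph], str],
    max_mismatch: float = 0.1,  # assumed threshold, would need tuning
) -> Tuple[str, str]:
    """Use template matching for clean input, otherwise defer to the fallback."""
    label, mismatch = best_template(glyph, templates)
    if mismatch <= max_mismatch:
        return label, "template"
    return fallback(glyph), "fallback"


# Hypothetical 3x3 templates and inputs.
TEMPLATES = {"glyph_a": ["###", "#.#", "###"], "glyph_b": ["#..", "#..", "###"]}
clean = ["###", "#.#", "###"]
damaged = ["#.#", "...", "#.#"]


def placeholder_network(glyph: Glyph) -> str:
    """Stand-in for a trained neural network classifier."""
    return "glyph_a"


clean_result = identify(clean, TEMPLATES, placeholder_network)
damaged_result = identify(damaged, TEMPLATES, placeholder_network)
```

With this arrangement the cheap matcher handles well-preserved characters, and the costlier classifier is consulted only when no template fits closely.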

RESEARCH INTO BUILDING THE INFORMATION MANAGEMENT SYSTEM OF DONGBA HIEROGLYPHS
Through data mining and processing, search engine design, and interactive network information operations, the information management system of Dongba hieroglyphs makes the relevant literature usable in a systematic, visualized, and network-based way. This system also supports user access management, secure access, information disclosure, search and statistics, and character processing in Naxi, Geba, Chinese and English. [7]

The Metadata Rules System of Literature Information of Dongba Hieroglyphs
With reference to more than ten sets of metadata rules, including the Design Guide on Specific Metadata and the Metadata Rules on Ancient Scripts, and to the widely used Dublin Core element set, a metadata rules system for the literature information of Dongba hieroglyphs was developed. It proposes labeling methods for digital metadata together with practical cases, and provides information-sharing interfaces based on the metadata rules. The metadata rules are based on the Dongba manuscripts and cover four resource categories: paper, scanned copy, audio and video. Because source materials are labelled, the computer system can automatically identify the material and process its meaning. The design of the metadata is based on the China Minority Ancient Books Catalogue Summary (Naxi volume) (hereinafter referred to as the Catalogue Summary) and the Naxi Dongba Manuscript Translates Pouring Complete Works (Total 100 Volumes) (hereinafter referred to as the Complete Works), so that it can be applied universally in the research field of Dongba manuscripts. [8]
Figure 4 The Catalogue Summary after information extraction.
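A labelled record built on the Dublin Core element set might look like the sketch below. The choice of elements, the mapping of the four resource categories onto `dc:type`, the use of `dc:source` for the holding institution, and all values are assumptions made for illustration; the platform's actual metadata rules are not reproduced here.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# The four resource categories named in the text.
RESOURCE_CATEGORIES = {"paper", "scanned copy", "audio", "video"}


def make_record(identifier: str, title: str, category: str, holder: str) -> ET.Element:
    """Build one metadata record; element choices are illustrative assumptions."""
    if category not in RESOURCE_CATEGORIES:
        raise ValueError(f"unknown resource category: {category}")
    record = ET.Element("record")
    fields = (
        ("identifier", identifier),
        ("title", title),
        ("type", category),
        ("source", holder),
    )
    for tag, value in fields:
        ET.SubElement(record, f"{{{DC_NS}}}{tag}").text = value
    return record


# Placeholder values only.
record = make_record("dongba-0001", "Sample volume", "scanned copy", "Example Library")
xml_text = ET.tostring(record, encoding="unicode")
```

Because the category is validated against a fixed list, software reading the record can decide how to handle the resource from the label alone.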

Digital Processing Technology of Dongba Manuscripts
This paper proposes a means of digitally processing Dongba manuscripts and related literature, extracting information from them, and storing and searching the resulting unstructured information. It also covers the storage and search of the digital metadata produced during the data processing and information extraction of the Catalogue Summary and the Complete Works.
The information processing of Dongba manuscripts proceeds in four steps. First, a scanned copy of the paper version of the Catalogue Summary is made and saved as a PDF file. Second, the optical character recognition system of the non-contact laser digital scanner is used to produce text files and photo files. The text files contain the title in both Chinese and Dongba characters, and the contents of the manuscripts. The photo files contain Dongba characters. Third, the text files are checked and errors are removed. Finally, an information extraction method based on regular expression patterns converts the text files into semi-structured XML files. Figure 4 shows the Catalogue Summary after information extraction.
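The final step, turning checked text into semi-structured XML with regular expressions, can be sketched as follows. The entry layout (`No.`, `Title:`, `Abstract:` lines), the element names and the sample text are hypothetical; the real Catalogue Summary layout would need its own patterns.

```python
import re
import xml.etree.ElementTree as ET

# Assumed layout of one catalogue entry in the OCR text (hypothetical).
ENTRY_PATTERN = re.compile(
    r"No\.\s*(?P<number>\d+)\s*\n"
    r"Title:\s*(?P<title>.+?)\s*\n"
    r"Abstract:\s*(?P<abstract>.+?)\s*(?:\n|$)"
)

# Placeholder text standing in for a checked OCR text file.
SAMPLE_TEXT = (
    "No. 1\n"
    "Title: Sample title A\n"
    "Abstract: Placeholder abstract.\n"
    "No. 2\n"
    "Title: Sample title B\n"
    "Abstract: Another placeholder.\n"
)


def extract_entries(text: str) -> ET.Element:
    """Convert every matched entry into an <entry> element under <catalogue>."""
    root = ET.Element("catalogue")
    for match in ENTRY_PATTERN.finditer(text):
        entry = ET.SubElement(root, "entry", number=match.group("number"))
        ET.SubElement(entry, "title").text = match.group("title")
        ET.SubElement(entry, "abstract").text = match.group("abstract")
    return root


catalogue = extract_entries(SAMPLE_TEXT)
catalogue_xml = ET.tostring(catalogue, encoding="unicode")
```

Named groups keep the mapping from text pattern to XML element explicit, so a new field only requires one more group and one more sub-element.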

The Information Management Method of Dongba Manuscripts
Dongba manuscripts require both structured and unstructured information to be stored and searched: not only the structured content of the manuscript data, but also special content such as the Dongba characters and international phonetic symbols that appear frequently in that data. An information management system for storing ancient scripts was therefore built on eXist. eXist is an open-source native XML database management system that supports the standard XQuery query language. It provides automatic indexing and full-text search, and integrates closely with existing XML development tools, which helps in developing and maintaining the system. This information management system has three collections, which store the Catalogue Summary data, the Complete Works data, and the management system's user information. It can store different categories of Dongba manuscripts from various collections, thus providing categorized management and search of different manuscripts and improving management and search efficiency. [9,10]
The information management system of Dongba manuscripts also contains a sub-system for managing Dongba manuscripts and a sub-system for user management. The former focuses on operations performed on manuscripts, including uploading files, adding single entries, document search (simple and advanced), deleting and updating. The user management sub-system mainly manages the accounts of managers, which involves adding manager accounts, freezing accounts, and browsing the files uploaded by logged-in users. The system applies different controls to different users in order to maintain system security. The information management system for Dongba manuscripts also supports user access management, secure access, information disclosure, and search and statistics. It also supports multimedia content, and character processing of Dongba (including the Geba characters that originated from Dongba), Chinese and English.
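As an illustration of querying an eXist-based store with XQuery, the sketch below only constructs a request URL for eXist's REST interface, using its `_query` and `_howmany` parameters; it does not contact a server. The host, the collection path `/db/dongba/catalogue` and the `entry`/`title` element names are assumptions, not the platform's actual layout.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url: str, collection: str, keyword: str, how_many: int = 10) -> str:
    """Build a REST URL that runs a title search as an XQuery expression."""
    # Escape characters that are special inside an XQuery string literal.
    safe = keyword.replace("&", "&amp;").replace("'", "''")
    xquery = f"collection('{collection}')//entry[contains(title, '{safe}')]"
    params = urlencode({"_query": xquery, "_howmany": how_many})
    return f"{base_url.rstrip('/')}{collection}?{params}"


# Hypothetical server address and collection path.
url = build_query_url(
    "http://localhost:8080/exist/rest", "/db/dongba/catalogue", "ritual", how_many=5
)
sent_params = parse_qs(urlsplit(url).query)
```

Keeping the three collections under separate paths means the same helper can target the Catalogue Summary data or the Complete Works data simply by changing the `collection` argument.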

Network Information Processing and Network Protocol of Dongba Manuscripts
To enable the sharing of digital resources relevant to Dongba manuscripts, a network information processing system for Dongba manuscripts was built on a browser/server (B/S) architecture. It enables searching and browsing of digital resources of Dongba manuscripts, including pictures of Dongba manuscripts, videos of translation between Dongba and Chinese, and manuscript deciphering. The network information processing system provides three ways of searching: by key words, by multiple conditions, and by XQuery query, thereby meeting the different searching demands of users. An interoperability network protocol based on Web services was designed to achieve better sharing of resources. Users can access data on Dongba manuscripts by writing programs according to this protocol and by communicating with other digital resource bases. A network access protocol based on the OAI-PMH protocol was developed to solve metadata interoperability issues. Users can access digital resources without being limited by system platform, application or subject field.
The standard OAI-PMH protocol provides only six command verbs, whose parameters are few and simple, and it offers only limited retrieval options. Based on the features of the digital resources of Dongba manuscripts, and in order to achieve efficient and accurate search, the command verbs of the OAI-PMH protocol were expanded. For example, to help users find the records they need efficiently and accurately, XQuery search verbs were added to the OAI-PMH protocol, significantly enhancing its information access capability. The search results after expanding the OAI-PMH protocol are shown in Figure 5.
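The extension described above can be sketched as a request builder that accepts the six standard OAI-PMH verbs plus one extra verb. The verb name `XQuerySearch`, its `query` parameter and the endpoint URL are hypothetical; the source does not give the names actually used.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
STANDARD_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}

# Hypothetical extension verb; the real name is not given in the text.
EXTENDED_VERBS = {"XQuerySearch"}


def build_request(base_url: str, verb: str, **params: str) -> str:
    """Build a request URL, rejecting unknown verbs and incomplete extensions."""
    if verb not in STANDARD_VERBS | EXTENDED_VERBS:
        raise ValueError(f"unsupported verb: {verb}")
    if verb in EXTENDED_VERBS and "query" not in params:
        raise ValueError("the extension verb needs a 'query' parameter")
    return base_url + "?" + urlencode({"verb": verb, **params})


# Hypothetical endpoint.
harvest_url = build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
search_url = build_request("http://example.org/oai", "XQuerySearch", query="//entry[title]")
```

Standard harvesters keep working because the original verbs are untouched, while clients that know the extension gain expressive search.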

Information Mining and Visualization of Academic Resources of Dongba Manuscripts and Dongba Culture
In order to fully share the academic research outcomes related to Dongba manuscripts and Dongba culture, information about the major resources, including relevant web sites and academic papers (including those in CNKI), needed to be mined and disseminated. Through information mining and processing, search engine design, and network information interoperability, the relevant literature can be utilized in a systematic, visualized, and network-based way.
For the mining of network information, academic resources on Dongba manuscripts and Dongba culture were searched for and analysed, web information was crawled using Apache Nutch, and statistics were obtained using a categorized Lucene index. When mining information in academic papers, PDF files were converted into text, from which the title, key words, year of publication and publishing journal of each file were generated, and an additional index was established. For searching resource information, full-text search was provided, and multi-field searches and advanced searches for local papers were added.
To visualize resource information, a multifunctional search and display system for information resources on Dongba manuscripts and Dongba culture was built on top of the resource search. This system can display statistical data on various aspects of the searched resources, including authors, geographical areas, related scholars and hot spots.
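The multi-field search over extracted paper metadata can be illustrated with a small inverted index. This sketch shows the general indexing idea in plain Python and does not use the Lucene or Nutch APIs; the field names and the two sample documents are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class FieldIndex:
    """Inverted index keyed by (field, term), so each field is searchable."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, doc_id: int, fields: Dict[str, str]) -> None:
        for field, text in fields.items():
            for term in tokenize(text):
                self._postings[(field, term)].add(doc_id)

    def search(self, **criteria: str) -> List[int]:
        """Return ids of documents matching every term in every given field."""
        result: Optional[Set[int]] = None
        for field, text in criteria.items():
            for term in tokenize(text):
                ids = self._postings.get((field, term), set())
                result = ids if result is None else result & ids
        return sorted(result or [])


# Hypothetical paper metadata.
index = FieldIndex()
index.add(1, {"title": "Digital protection of Dongba manuscripts", "year": "2015"})
index.add(2, {"title": "Dongba culture and tourism", "year": "2012"})

all_dongba = index.search(title="dongba")
narrowed = index.search(title="Dongba manuscripts", year="2015")
```

Indexing each field separately is what makes a combined query such as title plus year possible without scanning every document.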
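The statistics behind such a display can be sketched as simple frequency counts over search results. The record layout, author names and key words below are placeholders, not data from the platform.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical search results.
RESULTS: List[dict] = [
    {"authors": ["Author A", "Author B"], "year": 2012, "keywords": ["digitization", "manuscripts"]},
    {"authors": ["Author A"], "year": 2015, "keywords": ["digitization", "recognition"]},
    {"authors": ["Author C"], "year": 2015, "keywords": ["digitization"]},
]


def summarize(records: List[dict]) -> Dict[str, Counter]:
    """Count results per author, per year and per key word."""
    return {
        "by_author": Counter(a for rec in records for a in rec["authors"]),
        "by_year": Counter(rec["year"] for rec in records),
        "by_keyword": Counter(k for rec in records for k in rec["keywords"]),
    }


stats = summarize(RESULTS)
top_keyword = stats["by_keyword"].most_common(1)
```

Counts of this kind are the raw material for charts of active scholars or frequently studied topics.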

RESEARCH ON THE CONSTRUCTION OF AN INTERNATIONAL DIGITAL SHARING PLATFORM OF DONGBA MANUSCRIPTS AND DONGBA CULTURE
The utilization of information on the sharing platform is shown in Figure 6.

Collection and Interaction of Digital Information
The international digital sharing platform of Dongba manuscripts is connected to the digital information collection and interaction unit through a shared interface for digital resources, enabling the connection and collection of digital resources of Dongba manuscripts. The digital information collected and exchanged on this shared platform comes from collection, research and protection institutions, Dongba culture, the UNESCO Memory of the World database, etc. This unit facilitates the remote interconnection and exchange of digital resources of Dongba manuscripts. Information is collected by means of a non-contact laser digital scanner for Dongba manuscripts, professional digital cameras, and professional digital video cameras. The sharing platform allows both local and remote access; remote access is based on the Hypertext Transfer Protocol, and resources can be accessed through web sites, web services, cloud services, etc.

Figure 6 The information utilization mode of an international digital sharing platform of Dongba manuscripts. (Diagram labels include: the interface for interconnecting and sharing digital environment experience; the digital information unit for collection and interaction, with local and remote access by website, web service, cloud service, etc.; the digital resources of individuals such as collectors, academics and tourists.)

The Digital Information Unit for Spreading and Displaying
The international digital sharing platform of Dongba manuscripts connects with the digital information dissemination and display unit through the interface of digital environment experience and sharing. The system integrates a display system for digitally visualized information, an interactive, international, open-platform environment, and a digital remote network. The types of information that can be transmitted and played include pictures, audio, texts, images, animation and so on. For example, video and audio of a manuscript being chanted by a Dongba priest can be presented dynamically, word by word and sentence by sentence.
The environments and means of displaying and disseminating information include the WWW, wireless mobile communication terminals, virtual reality, augmented reality, multimedia, streaming media, the Internet of Things with scene perception, and so on. [11]

The Interface for Interconnecting and Sharing Digital Resources
The interface for interconnecting and sharing digital resources enables remote interconnection between the international open platform and the collection institutions and individual collectors at home and abroad through cable, wireless communication and other communication means; with the help of UNESCO, the platform is linked to UNESCO's three databases (lost memory, endangered memory, and current activities) for Memory of the World (MOW).

The Interface of Digital Environment Experience and Sharing
This interface enables remote interconnection between the platform and the digital information dissemination and display unit, facilitating an all-round experience and the use of Dongba manuscripts, Dongba hieroglyphic pictures, pronunciation and content; it also promotes the dissemination and display of information related to the Dongba manuscripts and Dongba hieroglyphs including forms, pronunciations and meanings. The display and experience environment enables the users of the platform to understand and 'feel' the collections, exhibits, landscape, clothing, activities, rituals, and even scenes of the Naxi Dongba culture.

Virtual Exhibition and Re-Build of Dongba Manuscripts and Dongba Cultural Scenes
Some Dongba manuscripts and aspects of the Dongba culture have been passed down through historical events and stories. To show hitherto lost cultural scenes and the digital images of Dongba manuscripts, virtual scene modeling and visualization displays based on 2D and 3D technology were provided. An exhibition method based on virtual reality and augmented reality was proposed, and demonstration models showing historical figures and events in Naxi villages were designed. A virtual museum of Dongba manuscripts and Dongba culture based on the OpenGL ES mobile platform and a virtual art gallery of Dongba manuscripts and Dongba culture based on Unity3D technology were researched and designed, among other virtual exhibition environments on the sharing platform. [12,13]

CONCLUSION
To protect, share and bequeath to posterity the endangered Memory of the World heritage "Naxi Dongba manuscripts", and with the support of the major project of the National Social Science Foundation of China (12&ZD234), an international digital sharing platform for Dongba manuscripts was built following in-depth research and digital collection carried out through cross-disciplinary study and wide international cooperation. Prior to building this platform, abundant digital resources based on the Dongba manuscripts and their culture were compiled; these were then made available on the sharing platform. Various modern information technologies, including computer technology, information processing, online dissemination, and multimedia display, were studied and applied. Methods for the image processing of complex Dongba hieroglyphs and for assisted translation were studied and proposed. Channels for resource sharing and information exchange among widely scattered collections and several research institutions were established, and new ways of achieving the long-lasting digital protection and bequeathing of Dongba manuscripts and Dongba hieroglyphs were provided.
Many thanks go to the domestic and overseas collection and research institutions and to the relevant scholars and experts for their tremendous support during the process of collecting and studying Dongba manuscripts and related material.