Thesis


Applying Text Mining Techniques in Integration of Documentations and GIS -- A Case Study of Research Papers of YMS National Park

Advisor: 朱子豪


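In recent years, as the importance of the information explosion has become apparent, related research has expanded rapidly, and research in the GIS field has also matured. Both are clearly important, but the two lines of research have developed separately, and no integrated framework properly combines knowledge management with GIS. Users working with textual material often need spatial data as an aid, since spatial data give a clearer point of reference when textual knowledge is used. On the demand side, refining textual information gives users more complete and more diverse knowledge, and linking knowledge to related data is the key to using data well and generating new knowledge quickly. On the supply side, a further key issue is how to elicit GIS needs within the text-based environment familiar to users who are not accustomed to GIS. The aim of this study is to analyze the spatial information needs found in documents and relate them to the spatial locating and browsing functions of GIS, thereby integrating documents with a spatial database. Sentences are segmented into words and then analyzed linguistically for their meaning, with the focus on the relationships between words. The main material of this study is research reports, whose sentences are mostly plain description yet vary widely. The study therefore concentrates on the sentence patterns that occur most often, inductively identifies how words combine, examines the spatial implications of ordinary sentences, and infers the spatial semantics they may carry. From the number of spatial conditions and attribute conditions in the text, the study derives the needs users are likely to have in different situations and the corresponding GIS data items to provide. This study built data descriptions, a lexicon, a keyword database, and a place-name database, and developed an overall framework that combines documents with GIS, with the identification of attribute conditions and spatial conditions as the focus of development. The implemented system integrates the text system with GIS data and gives users a more convenient text-and-map reading system. Through the system, users can move from textual material to GIS data, so that information is presented more effectively. This helps general users satisfy the GIS needs that may arise while reading an article and supplies the GIS data related to each sentence, so that reading is supported and verified by objective data rather than by a mental map alone. Spatial knowledge no longer stays on paper but can be presented concretely, which marks a fundamental change in traditional text reading.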


In recent years, a growing number of researchers have focused on the phenomenon of “information explosion,” and studies of Geographical Information Systems (GIS) have also matured. However, the link between integrated information management and GIS has received little attention. Users reading text often need the aid of spatial information, which makes the knowledge in the text easier to convey. On the demand side, condensing text information provides users with rich and diverse knowledge, and new knowledge usually forms when existing knowledge is linked with related information. On the supply side, a key question is how to elicit the need for GIS while users are working in a text-based system. This study links GIS spatial locating and browsing functions to text by analyzing the spatial information requirements within text files; its main purpose is to combine spatial and text databases. The meanings of sentences are analyzed first, with priority given to the relationships between words. Research reports are the primary study material. Sentences in such reports are simple in meaning but highly varied. To evaluate the spatial meanings in sentences, this study first identified the most frequent sentence patterns and summarized the structural relationships between words. According to the number of spatial and attribute criteria in the text, the likely demands of users and the corresponding GIS data can be summarized. Databases of data descriptions, words, keywords, and place names were built, and the identification of spatial and attribute criteria was used to construct a framework combining text files and GIS data.
This combined framework of text files and GIS data was implemented for practical use, providing users with a more convenient “text-figure” reading system. Through this framework, users can link text files to GIS data and present information more efficiently. This should satisfy the potential need for GIS data that arises when users read text files. The related GIS data give users an objective reference rather than only a mental image. With the proposed framework, spatial knowledge can be presented concretely, and traditional text reading would change dramatically.
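As a rough illustration of how counting spatial and attribute criteria in a sentence could drive the choice of GIS data, a minimal Python sketch follows. Everything in it is an assumption made for illustration: the lexicon entries, the names `PLACE_NAMES`, `ATTRIBUTE_KEYWORDS`, `find_terms`, and `classify_sentence`, the substring matching (the thesis applies word segmentation to Chinese text), and the suggestion labels. None of these are taken from the thesis.

```python
# Illustrative lexicons only; the thesis builds its own place-name and keyword databases.
PLACE_NAMES = {"Qixing Mountain", "Xiaoyoukeng", "Zhuzihu"}
ATTRIBUTE_KEYWORDS = {"vegetation", "soil", "rainfall"}


def find_terms(sentence, lexicon):
    """Return the lexicon entries found in the sentence (case-insensitive substring match)."""
    lowered = sentence.lower()
    return sorted(term for term in lexicon if term.lower() in lowered)


def classify_sentence(sentence):
    """Collect spatial and attribute terms and pick a hypothetical GIS suggestion."""
    places = find_terms(sentence, PLACE_NAMES)
    attributes = find_terms(sentence, ATTRIBUTE_KEYWORDS)

    # Assumed decision rule: which kinds of criteria are present decides the suggestion.
    if places and attributes:
        suggestion = "thematic layer at location"
    elif places:
        suggestion = "locate on base map"
    elif attributes:
        suggestion = "thematic layer overview"
    else:
        suggestion = "none"

    return {"places": places, "attributes": attributes, "suggestion": suggestion}


if __name__ == "__main__":
    print(classify_sentence("Soil samples were collected near Xiaoyoukeng."))
```

A real system would replace the substring match with a word segmenter and a full gazetteer, and would likely distinguish more cases than the four shown here.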




