  • Degree thesis


The Research and Implementation of Semantic Based RDF Tagging and Webpage Searching Web Service

Advisor: 許輝煌 (Hui-Huang Hsu)


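In recent years, network and information technology has steadily advanced and matured, and people routinely search the Internet for the data or documents they need. Most of these text-based search methods rely on lexical matching (keyword matching) between the terms in the user's request and the terms in the target documents. A document is usually judged relevant only when the user's query shares one or more terms with it. Such term-based retrieval systems, however, remain some distance from the ideal, mainly because of problems of semantic discrimination and of the relatedness between terms.

Many organizations are now working on the Semantic Web, and the technology is gradually beginning to develop. The W3C (World Wide Web Consortium) has stated clearly that the future of the Web is semantic. Today's Web can freely generate, transmit, and present all kinds of information, but it is still only a "container" of information and can hardly reveal the content and characteristics of the information itself. By contrast, the future Semantic Web is a Web that understands the content of information, a true "information manager". A W3C working group proposed the RDF (Resource Description Framework) standard. Building on XML syntax, RDF specifies the storage structure of metadata and the related technical standards. With the birth and development of the Semantic Web, Web development technology is bound to undergo even greater change. All of this aims to let more people obtain more valuable information services, which is the ultimate goal of information sharing, and the Semantic Web is therefore gradually becoming the mainstream of the future Web.

This research therefore builds on our previously proposed "system for searching web pages semantically" and adopts the RDF resource description format, using the concept of the Synset (synonym set) in WordNet to annotate the semantic content of a web page's context, so that query results are expressed under a single standard. With RDF, the various characteristics of the information itself can be expressed in a unified, exchangeable format, making information on the Web easier to understand, exchange, and share. In addition, we study techniques for adding semantics to Web Services, and use N3 (Notation3) triples to mark up each web resource and its relationships to other resources, so that the desired data can be found through its query language. Finally, we build a Semantic Web Service mechanism to support semantic querying and sharing.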
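The annotation idea described above, marking a page with the synsets it is about and then querying the resulting triples, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the `http://example.org/...` URIs, the `hasConcept` property, the synset identifiers, and the helper names `annotate`, `to_n3`, and `match` are all hypothetical, and the tiny in-memory pattern matcher merely stands in for a real RDF store and query language.

```python
from typing import Iterable, List, Optional, Tuple

# A triple is (subject, predicate, object). All URIs below are made up for illustration.
Triple = Tuple[str, str, str]

EX = "http://example.org/terms#"     # hypothetical annotation vocabulary
SYN = "http://example.org/synset/"   # hypothetical synset identifiers


def annotate(page_url: str, synset_ids: Iterable[str]) -> List[Triple]:
    """Return one (page, hasConcept, synset) triple per synset id."""
    return [(page_url, EX + "hasConcept", SYN + sid) for sid in synset_ids]


def to_n3(triples: Iterable[Triple]) -> str:
    """Serialize triples as simple N3 statements, one per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def match(store: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Annotate three hypothetical pages with placeholder synset ids.
store = (annotate("http://example.org/a.html", ["car.n.01"])
         + annotate("http://example.org/b.html", ["car.n.01", "road.n.01"])
         + annotate("http://example.org/c.html", ["bank.n.01"]))

# Find every page annotated with the "car" concept.
pages = [t[0] for t in match(store, o=SYN + "car.n.01")]

print(to_n3(store))
print(pages)
```

Because the pages are linked to concepts rather than to surface words, the query selects pages by meaning; a production system would presumably use a proper RDF library and query language in place of the list comprehension.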


In recent years, network and information technology has matured, and we routinely search for data or files through the Internet. The resources on the World Wide Web increase day by day, and the volume of information exchanged and the speed of updates grow as well. Many retrieval methods depend on matching the vocabulary of the information request against that of the searched objects. For example, we usually combine keywords and Boolean operators to form a query that searches the Internet for information of interest. Unfortunately, users do not always choose proper words and operators when forming the query, so both related and unrelated information is retrieved. The retrieval method therefore becomes critical to obtaining more accurate results. To improve the search process, an RDF-based mechanism integrated with a word sense disambiguation technique is proposed to semantically index and retrieve web pages on the World Wide Web. The approach is to describe the resources with RDF(S) metadata and store them. The proposed method has the additional advantage that it can be further integrated into a Semantic Web Service.
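To make the limitation of pure vocabulary matching concrete, the following minimal sketch contrasts keyword search with concept-based search. It rests on loud assumptions: the `SYNSETS` table is a hand-written toy standing in for WordNet, the pages are invented, and each word is mapped to a single concept, so the word sense disambiguation step that the proposed mechanism relies on is not modelled here.

```python
from typing import Dict, List, Optional, Set

# Toy synonym sets standing in for WordNet synsets (hand-written, not real WordNet data).
SYNSETS: Dict[str, Set[str]] = {
    "vehicle_car": {"car", "automobile", "auto"},
    "river_bank": {"bank", "riverbank"},
}

# Invented page texts used only for this illustration.
PAGES: Dict[str, str] = {
    "page1": "used automobile prices",
    "page2": "car rental offers",
    "page3": "train timetable",
}


def keyword_search(query: str, pages: Dict[str, str]) -> List[str]:
    """Return pages whose text contains the query word verbatim."""
    return sorted(name for name, text in pages.items() if query in text.split())


def concept_of(word: str) -> Optional[str]:
    """Map a word to its toy concept id, or None if it is not covered."""
    for concept_id, words in SYNSETS.items():
        if word in words:
            return concept_id
    return None


def concept_search(query: str, pages: Dict[str, str]) -> List[str]:
    """Return pages containing any word that shares the query's concept."""
    target = concept_of(query)
    if target is None:
        # Fall back to plain keyword matching for words outside the toy table.
        return keyword_search(query, pages)
    return sorted(name for name, text in pages.items()
                  if any(concept_of(w) == target for w in text.split()))


print(keyword_search("car", PAGES))
print(concept_search("car", PAGES))
```

With this toy data, the keyword search for "car" finds only the page that literally contains the word, while the concept search also returns the page that says "automobile".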


Keywords: Semantic Web; WordNet; RDF; Web Service


