ABSTRACT
The credibility of information available on the Web has recently come to be regarded as an important issue. The sender name is one of the important indicators of the credibility of information. In this paper, we propose a new method for extracting sender names. The proposed method uses named entity recognition and, as preprocessing, reduces the DOM nodes using the Web page layout. Experimental results show that the proposed method can effectively extract sender names when the preprocessing is successful.
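The two stages named in the abstract (layout-based reduction of DOM nodes, then name recognition on what remains) can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of `header`, `footer`, and `address` as candidate layout regions, the names `RegionTextCollector` and `extract_sender_candidates`, and the copyright-line regular expression standing in for a real named entity recognizer are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Assumption (not from the paper): sender names tend to sit in these regions.
CANDIDATE_TAGS = {"header", "footer", "address"}


class RegionTextCollector(HTMLParser):
    """Keep only text inside candidate layout regions; ignore all other nodes."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside candidate regions
        self.chunks = []  # text fragments that survive the reduction

    def handle_starttag(self, tag, attrs):
        if tag in CANDIDATE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in CANDIDATE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.chunks.append(text)


# Toy stand-in for named entity recognition: the capitalized name that
# follows a "(c)" or "©" mark and an optional four-digit year.
COPYRIGHT_NAME = re.compile(
    r"(?:©|\(c\))\s*(?:\d{4}\s+)?([A-Z][A-Za-z&.\- ]*[A-Za-z.])"
)


def extract_sender_candidates(html):
    """Return sender-name candidates found in the reduced set of DOM nodes."""
    collector = RegionTextCollector()
    collector.feed(html)
    collector.close()
    names = []
    for chunk in collector.chunks:
        for match in COPYRIGHT_NAME.finditer(chunk):
            names.append(match.group(1).strip())
    return names


page = (
    "<html><body>"
    "<div>Body text (c) 2001 Not This</div>"
    "<footer>(c) 2008 Example Corp</footer>"
    "</body></html>"
)
print(extract_sender_candidates(page))
```

In this sketch the match inside the `div` is ignored because that node is discarded before recognition runs, which mirrors the abstract's observation that extraction quality depends on the preprocessing succeeding.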
Index Terms
- Using web page layout for extraction of sender names