DOI: 10.1145/1667780.1667818
research-article

Using web page layout for extraction of sender names

Published: 3 December 2009

ABSTRACT

Recently, the credibility of information available on the Web has come to be regarded as an important issue, and the sender name is one of the key indicators of that credibility. In this paper, we propose a new method for extracting sender names. The proposed method applies named entity recognition after a preprocessing step that uses the Web page layout to reduce the set of DOM nodes to be examined. Experimental results show that the proposed method extracts sender names effectively when the preprocessing succeeds.
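The two-stage idea in the abstract (layout-based reduction of DOM nodes, followed by named entity recognition) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which layout cues or which recognizer are used. The layout hints (`header`, `footer`, `copyright`, `address`), the regular expression standing in for a named entity recognizer, and the names `LayoutFilter` and `extract_sender_candidates` are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Assumption: sender names tend to sit in header- or footer-like regions.
LAYOUT_HINTS = ("header", "footer", "copyright", "address")

# Assumption: a toy regular expression standing in for a real named entity recognizer.
ORG_PATTERN = re.compile(
    r"\b(?:[A-Z][\w&.-]*\s)+(?:Inc\.|Corp\.|Ltd\.|University|Association)"
)

# Elements without a closing tag; they are ignored so the stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class LayoutFilter(HTMLParser):
    """Keep text only from elements whose tag, id, or class matches a layout hint."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one flag per open element: inside a kept region?
        self.kept_text = []  # text fragments that survive the reduction step

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        label = " ".join(
            [tag, attr_map.get("id") or "", attr_map.get("class") or ""]
        ).lower()
        inside = bool(self.stack and self.stack[-1])
        self.stack.append(inside or any(hint in label for hint in LAYOUT_HINTS))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.kept_text.append(data.strip())


def extract_sender_candidates(html):
    """Stage 1: drop nodes outside layout regions. Stage 2: run the toy recognizer."""
    parser = LayoutFilter()
    parser.feed(html)
    parser.close()
    candidates = []
    for text in parser.kept_text:
        candidates.extend(m.group(0) for m in ORG_PATTERN.finditer(text))
    return candidates


SAMPLE = """
<html><body>
<div id="main"><p>Review by Acme Widgets Inc. fans of the new gadget.</p></div>
<div class="footer"><p>Copyright 2009 Example Research University</p></div>
</body></html>
"""

print(extract_sender_candidates(SAMPLE))  # ['Example Research University']
```

In this sketch the organization mentioned in the main content is never passed to the recognizer, because its node is discarded during the reduction step. This also mirrors the caveat in the abstract: if the reduction step discards the region that holds the sender name, the recognizer has nothing to find.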


      Published in

      IUCS '09: Proceedings of the 3rd International Universal Communication Symposium
      December 2009
      404 pages
      ISBN: 9781605586410
      DOI: 10.1145/1667780
      General Chair: Kazumasa Enami

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States
