research-article

Using web page layout for extraction of sender names

Authors:
Rintaro Miyazaki, Ryo Momose, Hideyuki Shibuki, Tatsunori Mori

Yokohama National University, Hodogaya-ku, Yokohama, Japan

IUCS '09: Proceedings of the 3rd International Universal Communication Symposium, December 2009, Pages 181–186, https://doi.org/10.1145/1667780.1667818

Published: 03 December 2009

ABSTRACT

Recently, the credibility of information available on the Web has come to be regarded as an important issue, and the sender name is one of the important indicators of that credibility. In this paper, we propose a new method for extracting sender names. The proposed method applies named entity recognition and, as preprocessing, reduces the set of DOM nodes to be examined by using the Web page layout. Experimental results show that the proposed method can effectively extract sender names when the preprocessing is successful.
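The abstract does not specify how the layout-based reduction or the named entity recognizer is implemented. As a rough illustration of the two-step idea only (not the authors' method), the following minimal sketch first keeps text from a few layout regions and then runs a simple pattern over what remains. The choice of `header`, `footer`, and `address` as candidate regions, the copyright-line regular expression standing in for a real named entity recognizer, and all function and class names are assumptions made for this example.

```python
import re
from html.parser import HTMLParser

# Assumption: layout regions that often carry sender information.
CANDIDATE_TAGS = {"header", "footer", "address"}

# Assumption: a copyright-line pattern used as a stand-in for real NER.
SENDER_PATTERN = re.compile(r"©\s*(?:\d{4}\s+)?([A-Z][\w .&-]*)")


class LayoutFilter(HTMLParser):
    """Collects text only from nodes inside candidate layout regions."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside candidate regions
        self.chunks = []  # text kept after pruning

    def handle_starttag(self, tag, attrs):
        if tag in CANDIDATE_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in CANDIDATE_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.chunks.append(text)


def extract_sender_candidates(html):
    """Step 1: prune the page by layout. Step 2: match names in what is left."""
    parser = LayoutFilter()
    parser.feed(html)
    candidates = []
    for chunk in parser.chunks:
        match = SENDER_PATTERN.search(chunk)
        if match:
            candidates.append(match.group(1).strip())
    return candidates
```

In this sketch, text in the main body of a page is never passed to the recognizer, which mirrors the abstract's remark that extraction quality depends on the preprocessing succeeding: if the sender name lies outside the retained regions, it cannot be found.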

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published in
IUCS '09: Proceedings of the 3rd International Universal Communication Symposium
December 2009
404 pages
ISBN:9781605586410
DOI:10.1145/1667780
General Chair:
Kazumasa Enami
National Institute of Information and Communications Technology (NICT), Japan
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
information credibility
natural language processing
sender name
web page layout