Transactions of the Japanese Society for Artificial Intelligence
Online ISSN : 1346-8030
Print ISSN : 1346-0714
ISSN-L : 1346-0714
Paper
A Keyword Extraction Algorithm for Documents Based on Statistical Information on Word Co-occurrence
Yutaka Matsuo, Mitsuru Ishizuka

2002, Volume 17, Issue 3, pp. 217-223

Abstract

We present a new keyword extraction algorithm that applies to a single document without using a large corpus. Frequent terms are extracted first; then, for each term, the set of its co-occurrences with the frequent terms (that is, occurrences in the same sentence) is generated. The distribution of these co-occurrences indicates the importance of a term in the document: if the probability distribution of co-occurrence between a term a and the frequent terms is biased toward a particular subset of the frequent terms, then term a is likely to be a keyword. The degree of bias of the distribution is measured by the χ² measure. We show that our algorithm performs well for indexing technical papers.
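
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the expected counts here are assumed to be proportional to each frequent term's share of all co-occurrences, and the function name `chi_square_scores`, the parameter `num_frequent`, and the pre-tokenized input format are hypothetical choices made for illustration only.

```python
from collections import Counter, defaultdict


def chi_square_scores(sentences, num_frequent=10):
    """Score terms by how biased their co-occurrence with frequent terms is.

    `sentences` is a list of token lists (tokenization is assumed to be done).
    Simplified sketch: expected counts use each frequent term's share of all
    co-occurrences, which is an assumption, not a detail from the abstract.
    """
    # Step 1: pick the most frequent terms in the document.
    term_freq = Counter(term for sent in sentences for term in sent)
    frequent = [term for term, _ in term_freq.most_common(num_frequent)]
    frequent_set = set(frequent)

    # Step 2: count sentence-level co-occurrence of each term w
    # with each frequent term g (each sentence counted once per pair).
    cooc = defaultdict(Counter)
    for sent in sentences:
        terms = set(sent)
        for w in terms:
            for g in terms & frequent_set:
                if g != w:
                    cooc[w][g] += 1

    # Overall co-occurrence totals per frequent term, used for expectations.
    col_total = Counter()
    for counts in cooc.values():
        col_total.update(counts)
    grand_total = sum(col_total.values())

    # Step 3: chi-square statistic of observed vs. expected co-occurrence.
    scores = {}
    for w in term_freq:
        n_w = sum(cooc[w].values())
        score = 0.0
        for g in frequent:
            if g == w or n_w == 0 or col_total[g] == 0:
                continue
            expected = n_w * col_total[g] / grand_total
            score += (cooc[w][g] - expected) ** 2 / expected
        scores[w] = score
    return scores
```

Under these assumptions, a term that co-occurs with the frequent terms in the same proportions as the document as a whole receives a score near zero, while a term whose co-occurrences concentrate on a few frequent terms receives a higher score and is a stronger keyword candidate.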

© 2002 JSAI (The Japanese Society for Artificial Intelligence)