Abstract
We discuss Topic Detection, a sub-task of the Topic Detection and Tracking (TDT) Project, and present a system that uses domain-informed techniques to group news reports into clusters that capture the narrative of events in the news domain. We present an initial evaluation of this system and describe an application of these techniques to the clustering of live news feeds. We conclude that these approaches promise more coherent and useful clusters, and we suggest some areas for future work.
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Flynn, C., Dunnion, J. (2004). Event Clustering in the News Domain. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_9
DOI: https://doi.org/10.1007/978-3-540-30120-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive