Document Type

Conference Proceeding

Publication Date

6-2007

Abstract

This work addresses the determination of meaningful and terse labels for news document clusters. We analyze a number of alternatives for selecting headlines and/or sentences from the documents in a cluster (obtained as the result of an entity-event-duration query), and formalize an approach to extracting, from well-supported headlines/sentences of the cluster, a short phrase that can serve as the cluster label. Our technique maps a sentence to a set of significant stems that approximates its semantics, so that sentences can be compared. Finally, a cluster label is extracted from a selected headline/sentence as a contiguous sequence of words, restoring the word-order information lost in the formalization of semantic equivalence.
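
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword list, the suffix-stripping stemmer, the 0.5 support threshold, the scoring rule, and all function names (`significant_stems`, `well_supported_stems`, `extract_label`, `cluster_label`) are assumptions introduced here for illustration only. It shows the general idea of mapping headlines to stem sets, finding stems that are well supported across the cluster, and then reading a contiguous run of words back out of a selected headline.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to",
             "and", "is", "are", "as", "at", "by", "with"}


def crude_stem(word):
    """Rough suffix stripping; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def words_of(sentence):
    """Split a sentence into alphanumeric words, keeping original case."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def significant_stems(sentence):
    """Map a sentence to its set of non-stopword stems (word order is lost)."""
    return {crude_stem(w.lower()) for w in words_of(sentence)
            if w.lower() not in STOPWORDS}


def well_supported_stems(sentences, min_fraction=0.5):
    """Stems occurring in at least `min_fraction` of the sentences (assumed rule)."""
    counts = Counter(s for sent in sentences for s in significant_stems(sent))
    return {s for s, c in counts.items() if c / len(sentences) >= min_fraction}


def extract_label(sentence, core):
    """Contiguous words spanning the first to last well-supported stem."""
    words = words_of(sentence)
    hits = [i for i, w in enumerate(words)
            if w.lower() not in STOPWORDS and crude_stem(w.lower()) in core]
    if not hits:
        return ""
    return " ".join(words[hits[0]: hits[-1] + 1])


def cluster_label(sentences):
    """Pick the sentence whose stems are best supported, then extract a label."""
    core = well_supported_stems(sentences)

    def score(sentence):
        stems = significant_stems(sentence)
        return len(stems & core) / len(stems) if stems else 0.0

    return extract_label(max(sentences, key=score), core)
```

Calling `cluster_label` on a list of headline strings from one cluster returns a phrase taken verbatim from one of them, so the word order dropped by the stem-set comparison is recovered from the original text. A production version would likely substitute an established stemmer and a fuller stopword list for the toy ones assumed here.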

Comments

Presented at the 12th International Conference on Applications of Natural Language to Information Systems, Paris, France, June 27-29, 2007.

Attached is the unpublished, author's version of this work. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-540-73351-5_11.

Repository Citation

Thirunarayan, K., Immaneni, T., & Shaik, M. V. (2007). Selecting Labels for News Document Clusters. Lecture Notes in Computer Science, 4592, 119-130.
https://corescholar.libraries.wright.edu/knoesis/881

DOI

10.1007/978-3-540-73351-5_11

Included in

Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons
