Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Automatic text summarization and small-world networks

Balinsky, Helen, Balinsky, Alexander ORCID: https://orcid.org/0000-0002-8151-4462 and Simske, Steven 2011. Automatic text summarization and small-world networks. Presented at: 11th ACM symposium on Document engineering, Mountain View, CA, USA, 19-22 September 2011. Proceedings of the 11th ACM symposium on Document engineering. New York, NY: Association for Computing Machinery, pp. 175-184. 10.1145/2034691.2034731

Full text not available from this repository.

Abstract

Automatic text summarization is an important and challenging problem. Over the years, the amount of text available electronically has grown exponentially. This growth has created a huge demand for automatic methods and tools for text summarization. We can think of automatic summarization as a type of information compression. To achieve such compression, better modelling and understanding of document structures and internal relations is required. In this article, we develop a novel approach to extractive text summarization by modelling texts and documents as small-world networks. Based on our recent work on the detection of unusual behavior in text, we model a document as a one-parameter family of graphs with its sentences or paragraphs defining the vertex set and with edges defined by Helmholtz's principle. We demonstrate that for some range of the parameter, the resulting graph becomes a small-world network. Such a remarkable structure opens the possibility of applying many measures and tools from social network theory to the problem of extracting the most important sentences and structures from text documents. We hope that documents will also be a new and rich source of examples of complex networks.
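
The idea in the abstract (sentences as vertices, a parameter controlling which edges exist, then network measures to pick out important sentences) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not spell out the Helmholtz-principle edge rule, so the sketch swaps in a simple shared-keyword count as a stand-in, with the hypothetical parameter `min_shared` playing the role of the family parameter. The function names, the toy stopword list, and the degree-based ranking are all assumptions made for illustration. The two statistics computed, average clustering and average shortest-path length, are the quantities usually compared when checking whether a graph looks small-world (high clustering together with short paths).

```python
from collections import deque
from itertools import combinations
import re

# Toy stopword list (an assumption for this sketch, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "it", "as", "be", "can"}


def keywords(sentence):
    """Lowercased word set of a sentence, minus the toy stopword list."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def build_graph(sentences, min_shared):
    """Adjacency sets over sentence indices.

    Stand-in edge rule (NOT Helmholtz's principle): sentences i and j are
    linked when they share at least `min_shared` keywords.
    """
    words = [keywords(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if len(words[i] & words[j]) >= min_shared:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; vertices of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def average_path_length(adj):
    """Mean shortest-path length over connected ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def rank_by_degree(adj):
    """Sentence indices ordered by degree (highest first), ties by position."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


if __name__ == "__main__":
    doc = [
        "Text summarization compresses documents.",
        "Documents can be modelled as graphs.",
        "Graphs with short paths and high clustering are small-world networks.",
        "Summarization selects important sentences from documents.",
    ]
    # Sweep the parameter to see how the graph family changes.
    for min_shared in (1, 2):
        g = build_graph(doc, min_shared)
        print(min_shared, average_clustering(g), average_path_length(g), rank_by_degree(g))
```

Sweeping `min_shared` mimics the one-parameter family: low values give dense graphs, high values give sparse or empty ones, and any small-world regime would sit in between. Degree is only the simplest centrality choice; other measures from social network analysis could be substituted in `rank_by_degree`.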

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: Mathematics
Subjects: Q Science > QA Mathematics
Uncontrolled Keywords: Computing Methodologies; Document and text processing; Artificial Intelligence; Natural language processing; Pattern recognition; I.5.4 Applications
Publisher: Association for Computing Machinery
ISBN: 9781450308632
Last Modified: 19 Oct 2022 10:38
URI: https://orca.cardiff.ac.uk/id/eprint/25039

Citation Data

Cited 16 times in Scopus.