Balinsky, Alexander
Abstract
Keyword and feature extraction is a fundamental problem in data mining and document processing. A majority of applications depend directly on the quality and speed of the keyword and feature extraction performed during pre-processing. In this paper we present novel algorithms for feature extraction and change detection in unstructured data, primarily textual and sequential data. Our approach is based on ideas from image processing, especially the Helmholtz Principle from the Gestalt theory of human perception. The improvements due to the novel feature extraction technique are demonstrated on several key applications: classification for strengthening document security and storage optimization, and automatic summarization and segmentation for problems of information overload. The developed algorithms and applications are the result of a research collaboration between Cardiff University School of Mathematics and HP Laboratories.
Item Type: | Book Section |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Mathematics |
Subjects: | Q Science > QA Mathematics |
Publisher: | Springer |
ISBN: | 9783319254524 |
Last Modified: | 31 Oct 2022 10:50 |
URI: | https://orca.cardiff.ac.uk/id/eprint/86401 |