Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

On the Helmholtz Principle for data mining

Balinsky, Helen, Balinsky, Alexander and Simske, Steven 2010. On the Helmholtz Principle for data mining. Hewlett Packard Development Company.

Full text not available from this repository.


We present novel algorithms for feature extraction and change detection in unstructured data, primarily textual and sequential data. Keyword and feature extraction is a fundamental problem in text data mining and document processing. Most document processing applications depend directly on the quality and speed of keyword extraction algorithms. In this article, a novel approach to rapid change detection in data streams and documents is developed. It is based on ideas from image processing, and especially on the Helmholtz Principle from the Gestalt theory of human perception. Applied to the problem of keyword extraction, it delivers fast and effective tools for identifying meaningful keywords using parameter-free methods. We also define a level of meaningfulness for keywords, which can be used to adjust the set of keywords according to application needs.
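
The abstract does not state the formulas, so the following is only an illustrative sketch of how a Helmholtz-style meaningfulness test for keywords might look, not the authors' implementation. It assumes a formulation commonly used with the Helmholtz Principle: a word that occurs K times across N parts of a document and m times inside one part is given a "number of false alarms" of C(K, m) / N^(m-1), and is treated as meaningful in that part when this value falls below a threshold. The names nfa, meaningful_keywords, and epsilon are hypothetical, and the sketch ignores differences in part length.

```python
from collections import Counter
from math import comb


def nfa(m: int, K: int, N: int) -> float:
    """Assumed 'number of false alarms' for a word seen m times in one part,
    K times overall, across N parts: C(K, m) / N**(m - 1)."""
    return comb(K, m) / N ** (m - 1)


def meaningful_keywords(parts: list[list[str]], epsilon: float = 1.0) -> list[set[str]]:
    """For each part, return the words whose assumed NFA is below epsilon.

    epsilon = 1.0 is the natural threshold (fewer than one expected chance
    occurrence); lowering it keeps only more strongly concentrated words.
    """
    n_parts = len(parts)
    # Total occurrences of each word over the whole collection.
    totals = Counter(word for part in parts for word in part)
    result = []
    for part in parts:
        local = Counter(part)  # occurrences inside this part only
        result.append(
            {word for word, m in local.items() if nfa(m, totals[word], n_parts) < epsilon}
        )
    return result
```

Under these assumptions, a word that appears once in a part is never selected, because its value is K, which is at least 1. A word whose occurrences are concentrated in a single part gets a small value and is selected there. Lowering epsilon plays the role of raising the required level of meaningfulness.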

Item Type: Other
Date Type: Publication
Status: Published
Schools: Mathematics
Subjects: Q Science > QA Mathematics
Uncontrolled Keywords: extraction; feature extraction; unusual behavior detection; Helmholtz principle; mining textual; unstructured datasets
Publisher: Hewlett Packard Development Company
Last Modified: 19 Oct 2022 10:37
