Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Virtual unrolling and information recovery from scanned scrolled historical documents

Samko, Oksana, Lai, Yu-Kun ORCID: https://orcid.org/0000-0002-2094-5680, Marshall, David ORCID: https://orcid.org/0000-0003-2789-1395 and Rosin, Paul L. ORCID: https://orcid.org/0000-0002-4965-3884 2014. Virtual unrolling and information recovery from scanned scrolled historical documents. Pattern Recognition 47 (1), pp. 248-259. 10.1016/j.patcog.2013.06.015

PDF - Accepted Post-Print Version

Abstract

The objective of our work is to enable the reading of fragile scrolled historical parchments without the need to physically unravel them, thus providing valuable information to a wide range of scholarly disciplines. This problem has not yet been properly investigated by the computer vision community because it depends on suitable parchment scanning technology: standard X-ray equipment is not sufficient, as the parchment ink must be extracted in addition to the parchment's underlying structure. Effective data recovery is also compromised because the deterioration of the parchment makes content from historical scrolled documents inaccessible. We create a 3D volumetric model of a scrolled parchment's underlying geometry and perform digital unwrapping of the parchment, producing a readable image of the text as output. The proposed recovery framework consists of structure-preserving anisotropic filtering in combination with robust segmentation, surface modelling and ink projection. We demonstrate with real examples how our algorithm is able to recover the underlying text and to solve the major challenges of scrolled parchment analysis, namely segmenting connected layers and processing the data without user interaction.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Uncontrolled Keywords: Parchment restoration; Digital unwrapping; Document processing; Text retrieval; Volumetric scanning
Additional Information: PDF uploaded in accordance with publisher's policy at http://www.sherpa.ac.uk/romeo/issn/0031-3203/ (accessed 03/07/14).
Publisher: Elsevier
ISSN: 0031-3203
Funders: EPSRC
Date of First Compliant Deposit: 30 March 2016
Date of Acceptance: 11 June 2013
Last Modified: 20 Nov 2024 02:30
URI: https://orca.cardiff.ac.uk/id/eprint/58522

Citation Data

Cited 31 times in Scopus.
