Kimmig, Angelika
Abstract
We propose a probabilistic approach to the problem of schema mapping. Our approach is declarative, scalable, and extensible. It builds upon recent results in both schema mapping and probabilistic reasoning and contributes novel techniques in both fields. We introduce the problem of schema mapping selection, that is, choosing the best mapping from a space of potential mappings, given both metadata constraints and a data example. As selection has to reason holistically about the inputs and the dependencies between the chosen mappings, we define a new schema mapping optimization problem which captures interactions between mappings as well as inconsistencies and incompleteness in the input. We then introduce Collective Mapping Discovery (CMD), our solution to this problem using state-of-the-art probabilistic reasoning techniques. Our evaluation on a wide range of integration scenarios, including several real-world domains, demonstrates that CMD effectively combines data and metadata information to infer highly accurate mappings even with significant levels of noise.
| Item Type: | Article |
|---|---|
| Date Type: | Publication |
| Status: | Published |
| Schools: | Computer Science & Informatics |
| Publisher: | Institute of Electrical and Electronics Engineers (IEEE) |
| ISSN: | 1041-4347 |
| Date of First Compliant Deposit: | 20 August 2018 |
| Date of Acceptance: | 31 July 2018 |
| Last Modified: | 04 Dec 2024 10:30 |
| URI: | https://orca.cardiff.ac.uk/id/eprint/114258 |
Citation Data
Cited 5 times in Scopus.