Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 
WelshClear Cookie - decide language by browser settings

Subgraph pattern matching over uncertain graphs with identity linkage uncertainty

Moustafa, Walaa Eldin, Kimmig, Angelika ORCID: https://orcid.org/0000-0002-6742-4057, Deshpande, Amol and Getoor, Lise 2014. Subgraph pattern matching over uncertain graphs with identity linkage uncertainty. Presented at: IEEE International Conference on Data Engineering, Chicago, IL, USA, 31 March- 4 April 2014. 2014 IEEE 30th International Conference on Data Engineering. IEEE, 10.1109/ICDE.2014.6816710

[thumbnail of moustafa-icde14.pdf]
Preview
PDF - Accepted Post-Print Version
Download (1MB) | Preview

Abstract

There is a growing need for methods that can represent and query uncertain graphs. These uncertain graphs are often the result of an information extraction and integration system that attempts to extract an entity graph or a knowledge graph from multiple unstructured sources [25], [7]. Such an integration typically leads to identity uncertainty, as different data sources may use different references to the same underlying real-world entities. Integration usually also introduces additional uncertainty on node attributes and edge existence. In this paper, we propose the notion of a probabilistic entity graph (PEG), a formal model that uniformly and systematically addresses these three types of uncertainty. A PEG is a probabilistic graph model that defines a distribution over possible graphs at the entity level. We introduce a general framework for constructing a PEG given uncertain data at the reference level and develop efficient algorithms to answer subgraph pattern matching queries in this setting. Our algorithms are based on two novel ideas: context-aware path indexing and reduction by join-candidates, which drastically reduce the query search space. A comprehensive experimental evaluation shows that our approach outperforms baseline implementations by orders of magnitude.

Item Type: Conference or Workshop Item (Paper)
Date Type: Published Online
Status: Published
Schools: Computer Science & Informatics
Publisher: IEEE
ISBN: 9781479925551
Date of First Compliant Deposit: 5 August 2019
Date of Acceptance: 4 November 2013
Last Modified: 26 Oct 2022 07:23
URI: https://orca.cardiff.ac.uk/id/eprint/124717

Citation Data

Cited 22 times in Scopus. View in Scopus. Powered By Scopus® Data

Actions (repository staff only)

Edit Item Edit Item

Downloads

Downloads per month over past year

View more statistics