Top-k keyword search over probabilistic XML data
Date
2011
Authors
Li, J.
Liu, C.
Zhou, R.
Wang, W.
Type
Conference paper
Citation
Proceedings of the International Conference on Data Engineering, 2011, pp. 673-684
Statement of Responsibility
Jianxin Li, Chengfei Liu, Rui Zhou, Wei Wang
Conference Name
2011 IEEE 27th International Conference on Data Engineering (ICDE 2011) (11 Apr 2011 - 16 Apr 2011 : Hannover)
Abstract
Despite the proliferation of work on XML keyword queries, supporting keyword queries over probabilistic XML data remains an open problem. Compared with traditional keyword search, answering a keyword query over probabilistic XML data is far more expensive because possible world semantics must be taken into account. In this paper, we first define the new problem of top-k keyword search over probabilistic XML data, which is to retrieve the k SLCA (smallest lowest common ancestor) results with the k highest probabilities of existence. We then propose two efficient algorithms. The first algorithm, PrStack, finds the k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve efficiency, we propose a second algorithm, EagerTopK, based on a set of pruning properties that quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance through an analysis of extensive experimental results.
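The problem statement in the abstract can be illustrated with a brute-force sketch. This is not PrStack or EagerTopK, whose details are not given in this record. It assumes a simplified model in which every node exists independently with a given probability conditioned on its parent existing, and it reads "probability of existence" of an SLCA result as the total probability of the possible worlds in which that node is an SLCA. The `Node` class, the function names, and the sample tree are all hypothetical. The enumeration is exponential in the number of nodes, which hints at why possible world semantics makes the query expensive.

```python
from itertools import product


class Node:
    """Hypothetical probabilistic XML node; names must be unique."""

    def __init__(self, name, prob=1.0, text="", children=()):
        self.name = name
        self.prob = prob          # assumed: P(node exists | parent exists)
        self.text = text
        self.children = list(children)


def all_nodes(root):
    """Collect every node of the tree."""
    stack, out = [root], []
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out


def slcas_in_world(node, keywords, coins, found):
    """Return the keywords covered by node's subtree in one possible world.

    Appends node.name to `found` when the subtree covers all keywords and
    no child subtree already does (the SLCA condition).
    """
    if not coins[node.name]:
        return set()              # node absent, so its whole subtree is absent
    covered = {k for k in keywords if k in node.text.split()}
    child_has_all = False
    for child in node.children:
        sub = slcas_in_world(child, keywords, coins, found)
        if sub == keywords:
            child_has_all = True
        covered |= sub
    if covered == keywords and not child_has_all:
        found.append(node.name)
    return covered


def top_k_slca(root, keywords, k):
    """Enumerate all coin-flip assignments and rank nodes by SLCA probability."""
    keywords = set(keywords)
    nodes = all_nodes(root)
    scores = {}
    for flips in product([True, False], repeat=len(nodes)):
        weight = 1.0
        for node, flip in zip(nodes, flips):
            weight *= node.prob if flip else 1.0 - node.prob
        if weight == 0.0:
            continue              # impossible world
        coins = {node.name: flip for node, flip in zip(nodes, flips)}
        found = []
        slcas_in_world(root, keywords, coins, found)
        for name in found:
            scores[name] = scores.get(name, 0.0) + weight
    return sorted(scores.items(), key=lambda item: -item[1])[:k]


# Illustrative data only, not taken from the paper.
root = Node("root", 1.0, "", [
    Node("paper1", 0.9, "", [
        Node("title", 1.0, "xml"),
        Node("abstract", 0.5, "keyword"),
    ]),
    Node("paper2", 0.6, "xml keyword"),
])

for name, probability in top_k_slca(root, ["xml", "keyword"], 2):
    print(name, round(probability, 4))
```

In this sample tree, `paper2` is an SLCA whenever it exists, while `paper1` is an SLCA only when both of its children exist, so the ranking follows directly from the existence probabilities along each path.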
Rights
Copyright © 2011 IEEE.