Type: Conference paper
Title: CITPM: A cluster-based iterative topical phrase mining framework
Author: Li, B.
Wang, B.
Zhou, R.
Yang, X.
Liu, C.
Citation: Database Systems for Advanced Applications, 2016 / vol.9642, pp.197-213
Publisher: Springer
Issue Date: 2016
Series/Report no.: Lecture Notes in Computer Science
ISBN: 9783319320243
ISSN: 0302-9743
Conference Name: International Conference on Database Systems for Advanced Applications (DASFAA) (16 Apr 2016 - 19 Apr 2016 : Dallas, TX, USA)
Statement of Responsibility: Bing Li, Bin Wang, Rui Zhou, Xiaochun Yang, and Chengfei Liu
Abstract: A phrase is a natural, meaningful, essential semantic unit. In topic modeling, visualizing phrases for individual topics is an effective way to explore and understand unstructured text corpora. Unfortunately, existing approaches predominantly rely on the general distributional features between topics and phrases over an entire corpus, while ignoring the impact of domain-level topical distribution. This often leads to the loss of domain-specific terminology and, as a consequence, weakens the coherence of topical phrases. In this paper, we present a novel framework, CITPM, for topical phrase mining. Our framework views a corpus as a mixture of clusters (domains), where each cluster is characterized by documents sharing similar topical distributions. The CITPM framework iteratively performs phrase mining, topical inference, and cluster updating until a satisfactory final result is obtained. Empirical verification demonstrates that our framework outperforms state-of-the-art approaches in both interpretability and efficiency.
Keywords: Topical phrase; Phrase mining; Document clustering
Rights: © Springer International Publishing Switzerland 2016
RMID: 0030068762
DOI: 10.1007/978-3-319-32025-0_13
Grant ID:
Appears in Collections: Computer Science publications

Files in This Item:
File: RA_hdl_109414.pdf
Description: Restricted Access
Size: 984.06 kB
Format: Adobe PDF
