Please use this identifier to cite or link to this item: http://hdl.handle.net/2440/115616
Type: Conference paper
Title: A topic model based on Poisson decomposition
Author: Jiang, H.
Zhou, R.
Zhang, L.
Wang, H.
Zhang, Y.
Citation: Proceedings of the ACM Conference on Information and Knowledge Management (CIKM 2017), 2017 / vol.Part F131841, pp.1489-1498
Publisher: Association for Computing Machinery
Publisher Place: New York, NY, USA
Issue Date: 2017
ISBN: 9781450349185
Conference Name: ACM Conference on Information and Knowledge Management (CIKM 2017) (06 Nov 2017 - 10 Nov 2017 : Singapore, SINGAPORE)
Statement of Responsibility: Haixin Jiang, Rui Zhou, Limeng Zhang, Hua Wang, Yanchun Zhang
Abstract: Determining appropriate statistical distributions for modeling text corpora is important for accurate estimation of numerical characteristics. Based on a statistical test validating the claim that the data conform to a Poisson distribution, we propose the Poisson decomposition model (PDM), a statistical model for count data of text corpora that directly captures each document’s multidimensional numerical characteristics on topics. In PDM, each topic is represented as a parameter vector of a multidimensional Poisson distribution, which can be easily normalized to multinomial term probabilities, and each document is represented as measurements on topics and thereby reduced to a measurement vector on topics. We use gradient descent methods and a sampling algorithm for parameter estimation. We carry out extensive experiments on the topics produced by our models. The results demonstrate that our approach can extract more coherent topics and, using PDM-based features, is competitive in document clustering compared to PLSI and LDA.
Keywords: Topic model; Poisson decomposition; statistical testing; text classification; topic coherence
Description: Session 8B: Text Analysis
Rights: © 2017 Association for Computing Machinery.
RMID: 0030080714
DOI: 10.1145/3132847.3132942
Grant ID: http://purl.org/au-research/grants/arc/DP170104747
Appears in Collections: Computer Science publications
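
Illustrative sketch (not taken from the paper): the abstract describes topics as Poisson parameter vectors that can be normalized to multinomial term probabilities, and documents as measurement vectors on topics estimated with gradient descent. The Python code below is one possible reading of that description under stated assumptions. It assumes a document's term counts are Poisson with mean equal to a non-negative weighted sum of topic rate vectors, keeps the topics fixed, and fits only one document's weights by projected gradient descent. The function names, the step size, and the toy numbers are all hypothetical; the paper's actual model, objective, and sampling step may differ.

```python
import numpy as np


def topic_term_probabilities(rates):
    """Normalize each topic's Poisson rate vector (one row per topic)
    so that it sums to 1, giving multinomial term probabilities."""
    rates = np.asarray(rates, dtype=float)
    return rates / rates.sum(axis=1, keepdims=True)


def poisson_nll(counts, weights, rates, eps=1e-12):
    """Negative Poisson log-likelihood of the term counts, without the
    constant log(counts!) term. Assumed mean: weights @ rates."""
    mean = np.asarray(weights, dtype=float) @ np.asarray(rates, dtype=float) + eps
    return float(np.sum(mean - np.asarray(counts, dtype=float) * np.log(mean)))


def fit_document_weights(counts, rates, steps=2000, lr=0.01, eps=1e-12):
    """Estimate a document's measurement vector on topics by projected
    gradient descent on the Poisson negative log-likelihood.
    Topics (rates) are held fixed; this is a simplifying assumption."""
    counts = np.asarray(counts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    weights = np.ones(rates.shape[0])
    for _ in range(steps):
        mean = weights @ rates + eps
        # d/dw_k of sum_v (mean_v - counts_v * log(mean_v))
        grad = rates @ (1.0 - counts / mean)
        # Gradient step, then projection onto the non-negative orthant.
        weights = np.maximum(weights - lr * grad, eps)
    return weights


# Toy example: two topics over a three-term vocabulary (made-up numbers).
toy_rates = np.array([[4.0, 1.0, 0.5],
                      [0.5, 1.0, 4.0]])
# Counts constructed as exactly 2 * topic 0 + 1 * topic 1.
toy_counts = 2.0 * toy_rates[0] + 1.0 * toy_rates[1]

toy_probs = topic_term_probabilities(toy_rates)
toy_weights = fit_document_weights(toy_counts, toy_rates)
print(toy_probs)
print(toy_weights)
```

In this toy setup the counts are an exact non-negative combination of the two topic vectors, so the fitted weights should approach the combination used to build them. Real term counts are integers and noisy, so a fit on real data would only approximate the underlying weights.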

Files in This Item:
There are no files associated with this item.