Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/115616
Type: | Conference paper |
Title: | A topic model based on Poisson decomposition |
Author: | Jiang, H.; Zhou, R.; Zhang, L.; Wang, H.; Zhang, Y. |
Citation: | Proceedings of the ACM Conference on Information and Knowledge Management (CIKM 2017), 2017, vol.Part F131841, pp.1489-1498 |
Publisher: | Association for Computing Machinery |
Publisher Place: | New York, NY, USA |
Issue Date: | 2017 |
ISBN: | 9781450349185 |
Conference Name: | ACM Conference on Information and Knowledge Management (CIKM 2017) (6 Nov 2017 - 10 Nov 2017 : Singapore, SINGAPORE) |
Statement of Responsibility: | Haixin Jiang, Rui Zhou, Limeng Zhang, Hua Wang, Yanchun Zhang |
Abstract: | Determining appropriate statistical distributions for modeling text corpora is important for accurate estimation of numerical characteristics. Based on the validity of the test on a claim that the data conforms to Poisson distribution we propose Poisson decomposition model (PDM), a statistical model for modeling count data of text corpora, which can straightly capture each document’s multidimensional numerical characteristics on topics. In PDM, each topic is represented as a parameter vector with multidimensional Poisson distribution, which can be easily normalized to multinomial term probabilities and each document is represented as measurements on topics and thereby reduced to a measurement vector on topics. We use gradient descent methods and sampling algorithm for parameter estimation. We carry out extensive experiments on the topics produced by our models. The results demonstrate our approach can extract more coherent topics and is competitive in document clustering by using the PDM-based features, compared to PLSI and LDA. |
Keywords: | Topic model; Poisson decomposition; statistical testing; text classification; topic coherence |
Description: | Session 8B: Text Analysis |
Rights: | © 2017 Association for Computing Machinery. |
DOI: | 10.1145/3132847.3132942 |
Grant ID: | http://purl.org/au-research/grants/arc/DP170104747 |
Published version: | http://dx.doi.org/10.1145/3132847.3132942 |
Appears in Collections: | Aurora harvest 3 Computer Science publications |
Files in This Item:
There are no files associated with this item.