Please use this identifier to cite or link to this item: http://hdl.handle.net/2440/108612
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhang, W. | en
dc.contributor.author | Sheng, Q. | en
dc.contributor.author | Abebe, E. | en
dc.contributor.author | Ali Babar, M. | en
dc.contributor.author | Zhou, A. | en
dc.date.issued | 2016 | en
dc.identifier.citation | Advanced Data Mining and Applications, 2016 / vol.10086 LNAI, pp.664-676 | en
dc.identifier.isbn | 9783319495859 | en
dc.identifier.issn | 0302-9743 | en
dc.identifier.issn | 1611-3349 | en
dc.identifier.uri | http://hdl.handle.net/2440/108612 | -
dc.description | LNCS, volume 10086 | en
dc.description.abstract | Developers nowadays can leverage existing systems to build their own applications. However, a lack of documentation hinders the process of software system reuse. We examine the problem of mining topics (i.e., topic extraction) from source code, which can facilitate the comprehension of the software systems. We propose a topic extraction method, Embedded Topic Extraction (EmbTE), that considers word semantics, which are never considered in mining topics from source code, by leveraging word embedding techniques. We also adopt Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) to extract topics from source code. Moreover, an automated term selection algorithm is proposed to identify the most contributory terms from source code for the topic extraction task. The empirical studies on Github (https://github.com/) Java projects show that EmbTE outperforms other methods in terms of providing more coherent topics. The results also indicate that method name, method comments, class names and class comments are the most contributory types of terms to source code topic extraction. | en
dc.description.statementofresponsibility | Wei Emma Zhang, Quan Z. Sheng, Ermyas Abebe, M. Ali Babar, and Andi Zhou | en
dc.language.iso | en | en
dc.publisher | Springer | en
dc.rights | © Springer International Publishing AG 2016 | en
dc.subject | Source code mining; Topic model; Word embedding | en
dc.title | Mining source code topics through topic model and words embedding | en
dc.type | Conference paper | en
dc.identifier.rmid | 0030059762 | en
dc.contributor.conference | International Conference on Advanced Data Mining and Applications (ADMA) (12 Dec 2016 - 15 Dec 2016 : Gold Coast, Qld) | en
dc.identifier.doi | 10.1007/978-3-319-49586-6_47 | en
dc.identifier.pubid | 281156 | -
pubs.library.collection | Computer Science publications | en
pubs.library.team | DS07 | en
pubs.verification-status | Verified | en
pubs.publication-status | Published | en
dc.identifier.orcid | Zhang, W. [0000-0002-0406-5974] | en
Appears in Collections: Computer Science publications

Files in This Item:
File | Description | Size | Format
RA_hdl_108612.pdf | Restricted Access | 473.27 kB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.