Fast top-k similarity search in large dynamic attributed networks

dc.contributor.authorMeng, Z.
dc.contributor.authorShen, H.
dc.date.issued2019
dc.description.abstractIn this paper, we study the problem of retrieving top-k nodes that are similar to a given query node in large dynamic attributed networks. To tackle this problem, we propose a fast Attribute augmented Single-source Path similarity algorithm (ASP). Our ASP constructs an attribute augmented network that integrates both node structure and attribute similarities through similarity scores computed by an efficient single-source path sampling scheme. It also contains simple and effective updating schemes to maintain similarity scores for dynamic edge insertions and deletions. We provide an upper bound of the sampling size of ASP for obtaining an ϵ-approximation estimation of similarity scores with probability at least . We theoretically prove that the sampling size and computational complexity of ASP are significantly lower than that of deploying the currently known most effective method Panther. Implementation results from extensive experiments in both synthetic networks and real-world social networks show that our ASP achieves better convergence performance, runs faster than the baselines, and effectively captures both structure and attribute similarities of the nodes. Specifically, we show that our algorithm can return top-k similar nodes for any query node in the real-world network at least 44x faster than the state-of-the-art methods, and update similarity scores in about 1 second for a batch of network modifications in a network of size 1,000,000.
dc.description.statementofresponsibilityZaiqiao Meng, Hong Shen
dc.identifier.citationInformation Processing and Management, 2019; 56(6):102074-1-102074-18
dc.identifier.doi10.1016/j.ipm.2019.102074
dc.identifier.issn0306-4573
dc.identifier.issn1873-5371
dc.identifier.orcidShen, H. [0000-0002-3663-6591] [0000-0003-0649-0648]
dc.identifier.urihttp://hdl.handle.net/2440/122578
dc.language.isoen
dc.publisherElsevier
dc.relation.granthttp://purl.org/au-research/grants/arc/DP150104871
dc.rights© 2019 Elsevier Ltd. All rights reserved.
dc.source.urihttp://dx.doi.org/10.1016/j.ipm.2019.102074
dc.subjectSimilarity search; top-k search; attributed network; path sampling; random walk
dc.titleFast top-k similarity search in large dynamic attributed networks
dc.typeJournal article
pubs.publication-statusPublished

Files