Bridging causal relevance and pattern discriminability: mining emerging patterns from high-dimensional data

dc.contributor.author: Yu, K.
dc.contributor.author: Ding, W.
dc.contributor.author: Wang, H.
dc.contributor.author: Wu, X.
dc.date.issued: 2013
dc.description: Link to a related website: http://www.cs.umb.edu/%7Eding/papers/tkde2012.pdf, Open Access via Unpaywall
dc.description.abstract: It is a nontrivial task to build an accurate emerging pattern (EP) classifier from high-dimensional data because we inevitably face two challenges 1) how to efficiently extract a minimal set of strongly predictive EPs from an explosive number of candidate patterns, and 2) how to handle the highly sensitive choice of the minimal support threshold. To address these two challenges, we bridge causal relevance and EP discriminability (the predictive ability of emerging patterns) to facilitate EP mining and propose a new framework of mining EPs from high-dimensional data. In this framework, we study the relationships between causal relevance in a causal Bayesian network and EP discriminability in EP mining, and then reduce the pattern space of EP mining to direct causes and direct effects, or the Markov blanket (MB) of the class attribute in a causal Bayesian network. The proposed framework is instantiated by two EPs-based classifiers, CE-EP and MB-EP, where CE stands for direct Causes and direct Effects, and MB for Markov Blanket. Extensive experiments on a broad range of data sets validate the effectiveness of the CE-EP and MB-EP classifiers against other well-established methods, in terms of predictive accuracy, pattern numbers, running time, and sensitivity analysis.
dc.identifier.citation: IEEE Transactions on Knowledge and Data Engineering, 2013; 25(12):2721-2739
dc.identifier.doi: 10.1109/TKDE.2012.218
dc.identifier.issn: 1041-4347
dc.identifier.uri: https://hdl.handle.net/11541.2/120131
dc.language.iso: en
dc.publisher: IEEE
dc.relation.funding: National 863 Program of China 2012AA011005
dc.relation.funding: National 973 Program of China 2013CB329604
dc.relation.funding: National Natural Science Foundation of China 61229301
dc.relation.funding: National Natural Science Foundation of China 61070131
dc.relation.funding: National Natural Science Foundation of China 61175051
dc.relation.funding: National Natural Science Foundation of China 61005007
dc.relation.funding: US National Science Foundation CCF-0905337
dc.relation.funding: US NASA Research Award NNX09AK86G
dc.rights: Copyright 2014 IEEE
dc.source.uri: https://doi.org/10.1109/TKDE.2012.218
dc.subject: association rules
dc.subject: Bayesian methods
dc.subject: data mining
dc.subject: itemsets
dc.subject: pattern recognition
dc.subject: Markov processes
dc.title: Bridging causal relevance and pattern discriminability: mining emerging patterns from high-dimensional data
dc.type: Journal article
pubs.publication-status: Published
ror.mmsid: 9916027051001831
