Optimizing clustering to promote data diversity when generating an ensemble classifier
Date
2018
Authors
Jan, Z.M.
Verma, B.
Fletcher, S.
Type
Conference paper
Citation
Proceedings of the 2018 Genetic and Evolutionary Computation Conference Companion - GECCO 2018 Companion, 2018, pp.1402-1409
Conference Name
Genetic and Evolutionary Computation Conference, GECCO 2018 (15 Jul 2018 - 19 Jul 2018 : Kyoto, Japan)
Abstract
In this paper, we propose a method to generate an optimized ensemble classifier. In the proposed method, a diverse input space is created by clustering training data incrementally within a cycle. A cycle is one complete round that includes clustering, training, and error calculation. In each cycle, a random upper bound on the number of clusters is chosen and data clusters are generated. A set of heterogeneous classifiers is trained on all generated clusters to promote structural diversity. An ensemble classifier is formed in each cycle and the generalization error of that ensemble is calculated. This process is optimized to find the set of classifiers with the lowest generalization error, and terminates when the generalization error can no longer be minimized. The cycle with the lowest error is then selected and all trained classifiers of that cycle are passed to the next stage. Any classifier with lower accuracy than the average accuracy of the pool is discarded, and the remaining classifiers form the proposed ensemble classifier. The proposed ensemble classifier is tested on classification benchmark datasets from the UCI repository. The results are compared with existing state-of-the-art ensemble classifier methods, including Bagging and Boosting. It is demonstrated that the proposed ensemble classifier performs better than the existing ensemble methods.
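The cycle-and-pruning procedure described above can be sketched in outline. This is a minimal illustrative sketch, not the authors' implementation: the `train_pool` and `estimate_error` callables are hypothetical placeholders for the paper's per-cluster training of heterogeneous classifiers and its ensemble generalization-error calculation, and the patience-based stopping rule is an assumed reading of "terminates when the generalization error can no longer be minimized".

```python
import random

def run_cycles(X, y, train_pool, estimate_error, max_k=10, patience=3, seed=0):
    """Repeat clustering/training cycles until the generalization error
    stops improving, then return the classifier pool of the best cycle.

    train_pool(X, y, k)        -> pool of classifiers trained per cluster
    estimate_error(pool, X, y) -> generalization error of that ensemble
    (Both callables are placeholders for the paper's actual steps.)
    """
    rng = random.Random(seed)
    best_err, best_pool, stall = float("inf"), None, 0
    while stall < patience:
        k = rng.randint(2, max_k)          # random upper bound of clustering
        pool = train_pool(X, y, k)         # train classifiers on the clusters
        err = estimate_error(pool, X, y)   # ensemble generalization error
        if err < best_err:
            best_err, best_pool, stall = err, pool, 0
        else:
            stall += 1                     # error no longer being minimized
    return best_pool, best_err

def prune_pool(pool, accuracies):
    """Final stage: discard any classifier whose accuracy falls below
    the average accuracy of the pool."""
    avg = sum(accuracies) / len(accuracies)
    return [clf for clf, acc in zip(pool, accuracies) if acc >= avg]
```

For example, `prune_pool(["a", "b", "c"], [0.9, 0.5, 0.7])` keeps only the classifiers at or above the 0.7 average, returning `["a", "c"]`.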
Rights
Copyright 2018 Association for Computing Machinery