Approximate Fisher information matrix to characterise the training of deep neural networks

Date

2020

Authors

Liao, Z.
Drummond, T.
Reid, I.
Carneiro, G.

Type

Journal article

Citation

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020; 42(1):15-26

Statement of Responsibility

Zhibin Liao, Tom Drummond, Ian Reid, Gustavo Carneiro

Abstract

In this paper, we introduce a novel methodology for characterising the performance of deep learning networks (ResNets and DenseNets) with respect to training convergence and generalisation as a function of mini-batch size and learning rate for image classification. This methodology is based on novel measurements derived from the eigenvalues of the approximate Fisher information matrix, which can be efficiently computed even for high-capacity deep models. Our proposed measurements can help practitioners to monitor and control the training process (by actively tuning the mini-batch size and learning rate) to achieve good training convergence and generalisation. Furthermore, the proposed measurements allow us to show that the training process can be optimised with a new dynamic sampling training approach that continuously and automatically changes the mini-batch size and learning rate during training. Finally, we show that the proposed dynamic sampling training approach achieves a faster training time and competitive classification accuracy compared with the current state of the art.
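To illustrate why the eigenvalues mentioned in the abstract can be computed efficiently even for high-capacity models, the sketch below uses a common trick: for an empirical Fisher approximation F = (1/N) GᵀG built from N per-sample gradients of a P-parameter model, the nonzero eigenvalues of the P×P matrix F coincide with those of the much smaller N×N Gram matrix (1/N) GGᵀ. This is a minimal, hypothetical sketch of that idea, not the paper's exact algorithm; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def empirical_fisher_eigenvalues(per_sample_grads):
    """Nonzero eigenvalues of the empirical Fisher F = (1/N) G^T G.

    per_sample_grads: array of shape (N, P), one flattened gradient
    per sample. For P >> N, F is P x P and too large to diagonalise,
    but its nonzero eigenvalues equal those of the N x N Gram matrix
    (1/N) G G^T, which is cheap to form and diagonalise.
    """
    G = np.asarray(per_sample_grads)
    n = G.shape[0]
    gram = (G @ G.T) / n              # N x N, not P x P
    eigvals = np.linalg.eigvalsh(gram)
    return np.sort(eigvals)[::-1]     # descending order

# Toy example: 8 samples of a 1000-parameter model
rng = np.random.default_rng(0)
grads = rng.normal(size=(8, 1000))
eigs = empirical_fisher_eigenvalues(grads)
print(eigs)
```

The Gram-matrix route reduces the eigendecomposition cost from O(P³) to O(N³) plus an O(N²P) matrix product, which is what makes per-iteration spectral monitoring of a large network feasible in the first place.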

Description

Date of Publication: 16 October 2018

Rights

© IEEE
