Please use this identifier to cite or link to this item: http://hdl.handle.net/2440/107956
Type: Conference paper
Title: Encoding high dimensional local features by sparse coding based fisher vectors
Author: Liu, L.
Shen, C.
Wang, L.
Van Den Hengel, A.
Wang, C.
Citation: Proceedings of the 27th International Conference on Neural Information Processing Systems, 2014 / vol.2, iss.January, pp.1143-1151
Publisher: MIT Press
Publisher Place: Online
Issue Date: 2014
Series/Report no.: Advances in Neural Information Processing Systems
ISSN: 1049-5258
Conference Name: 27th International Conference on Neural Information Processing Systems (NIPS'14) (08 Dec 2014 - 13 Dec 2014 : Montreal, Canada)
Statement of Responsibility: Lingqiao Liu, Chunhua Shen, Lei Wang, Anton van den Hengel, Chao Wang
Abstract: Derived from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to characterize the generation process of local features. This choice has been shown to be sufficient for traditional low dimensional local features, e.g., SIFT, and good performance can typically be achieved with only a few hundred Gaussian distributions. However, the same number of Gaussians is insufficient to model the feature space spanned by higher dimensional local features, which have become popular recently. Simply increasing the number of Gaussians to improve the modeling capacity for high dimensional features turns out to be inefficient and computationally impractical. In this paper, we propose a model in which each local feature is drawn from a Gaussian distribution whose mean vector is sampled from a subspace. With a certain approximation, this model can be converted to a sparse coding procedure, and the learning/inference problems can be readily solved by standard sparse coding methods. By calculating the gradient vector of the proposed model, we derive a new Fisher vector encoding strategy, termed Sparse Coding based Fisher Vector Coding (SCFVC). Moreover, we adopt the recently developed Deep Convolutional Neural Network (CNN) descriptor as a high dimensional local feature and implement image classification with the proposed SCFVC. Our experimental evaluations demonstrate that our method not only significantly outperforms the traditional GMM-based Fisher vector encoding but also achieves state-of-the-art performance in generic object recognition, indoor scene, and fine-grained image classification problems.
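The pipeline the abstract outlines (sparse-code each local feature against a learned basis, then take the gradient of the model with respect to that basis and pool it over the image) can be illustrated with a minimal sketch. This is one plausible reading under stated assumptions, not the authors' implementation: it assumes a unit-variance Gaussian likelihood around `B @ u`, an L1 (Laplacian-prior) penalty on the code `u`, and a plain ISTA solver. The names `B`, `lam`, `sparse_code`, and `encode_image` are hypothetical, the basis here is random rather than learned, and no normalization of the final vector is applied.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, B, lam=0.1, n_iter=200):
    # ISTA for: min_u 0.5 * ||x - B u||^2 + lam * ||u||_1
    # (assumed stand-in for the "standard sparse coding methods").
    L = np.linalg.norm(B, 2) ** 2  # Lipschitz constant of the smooth term
    u = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ u - x)
        u = soft_threshold(u - grad / L, lam / L)
    return u

def encode_image(X, B, lam=0.1):
    # For a fixed code u, the gradient of -0.5 * ||x - B u||^2 with
    # respect to B is the outer product (x - B u) u^T. Summing it over
    # all local features gives one d*m-dimensional image-level vector.
    G = np.zeros(B.shape)
    for x in X:
        u = sparse_code(x, B, lam)
        G += np.outer(x - B @ u, u)
    return G.ravel()

# Toy data: 5 local features of dimension 16, basis with 32 unit-norm columns.
rng = np.random.default_rng(0)
B = rng.standard_normal((16, 32))
B /= np.linalg.norm(B, axis=0)
X = rng.standard_normal((5, 16))
code = encode_image(X, B)
```

Because each code `u` is sparse, only the basis columns with non-zero coefficients contribute to the pooled gradient, which is what would keep such an encoding tractable for high dimensional descriptors.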
Rights: MIT Press Cambridge, MA, USA © 2014
RMID: 0030033174
Grant ID: http://purl.org/au-research/grants/arc/FT120100969
http://purl.org/au-research/grants/arc/LP120200485
Published version: http://dl.acm.org/citation.cfm?id=2968954
Appears in Collections: Computer Science publications

Files in This Item:
File: RA_hdl_107956.pdf
Description: Restricted Access
Size: 245.9 kB
Format: Adobe PDF

