Authors: Yuan, Chunfeng; Li, Xi; Hu, Weiming; Ling, Haibin; Maybank, Steve
Dates: 2014-05-06; 2014-05-06; 2013
Citation: Proceedings, 2013 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2013, 23-28 June 2013, Portland, Oregon, USA: pp. 724-730
ISBN: 9780769549897
ISSN: 1063-6919
URI: http://hdl.handle.net/2440/82695
Abstract: Spatio-temporal interest points serve as an elementary building block in many modern action recognition algorithms, and most of them exploit the local spatio-temporal volume features using a Bag of Visual Words (BOVW) representation. Such a representation, however, ignores potentially valuable information about the global spatio-temporal distribution of interest points. In this paper, we propose a new global feature to capture the detailed geometrical distribution of interest points. It is calculated using the R transform, which is defined as an extended 3D discrete Radon transform, followed by a two-directional two-dimensional principal component analysis. This R feature captures the geometrical information of the interest points, is invariant to geometric transformations, and is robust to noise. In addition, we propose a new fusion strategy that combines the R feature with the BOVW representation to further improve recognition accuracy. We utilize a context-aware fusion method to capture both the pairwise similarities and higher-order contextual interactions of the videos. Experimental results on several publicly available datasets demonstrate the effectiveness of the proposed approach for action recognition.
Language: en
Rights: © 2013 IEEE
Title: 3D R transform on spatio-temporal interest points for action recognition
Type: Conference paper
Identifier: 0020133252
DOI: 10.1109/CVPR.2013.99
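
The two-stage feature described in the abstract (a 3D R transform of the interest points, then a two-directional 2D PCA) can be illustrated with a minimal sketch. The abstract gives no formulas, so this assumes the R transform is the 3D analogue of the common 2D definition, R(θ, φ) = ∫ T²(ρ, θ, φ) dρ, where T counts the points lying on the plane with normal direction (θ, φ) and offset ρ. The function names, bin counts, centroid centering, and max-normalisation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def r_transform_3d(points, n_theta=8, n_phi=16, n_rho=32):
    """Return an (n_theta, n_phi) R-transform surface for an (N, 3) point set."""
    pts = np.asarray(points, dtype=float)
    pts = pts - pts.mean(axis=0)              # assumption: centre on the centroid
    radius = np.linalg.norm(pts, axis=1).max()
    if radius == 0.0:
        radius = 1.0                          # degenerate case: all points coincide
    # Plane normals on a half-sphere; n and -n describe the same family of planes.
    thetas = (np.arange(n_theta) + 0.5) * np.pi / n_theta
    phis = np.arange(n_phi) * np.pi / n_phi
    surface = np.empty((n_theta, n_phi))
    for i, theta in enumerate(thetas):
        for j, phi in enumerate(phis):
            normal = np.array([np.sin(theta) * np.cos(phi),
                               np.sin(theta) * np.sin(phi),
                               np.cos(theta)])
            rho = pts @ normal                # signed plane offset of every point
            # Discrete Radon transform: number of points per plane offset.
            radon, _ = np.histogram(rho, bins=n_rho, range=(-radius, radius))
            surface[i, j] = np.sum(radon.astype(float) ** 2)  # sum of T^2 over rho
    return surface / surface.max()            # assumption: normalise by the maximum


def two_directional_2dpca(surfaces, n_rows=4, n_cols=4):
    """Project a (K, m, n) stack of surfaces to (K, n_rows, n_cols) features."""
    stack = np.asarray(surfaces, dtype=float)
    centered = stack - stack.mean(axis=0)
    # Row-direction (n x n) and column-direction (m x m) scatter matrices.
    g_row = np.einsum("kij,kil->jl", centered, centered) / len(stack)
    g_col = np.einsum("kij,klj->il", centered, centered) / len(stack)
    # eigh returns eigenvalues in ascending order; keep the leading eigenvectors.
    x = np.linalg.eigh(g_row)[1][:, ::-1][:, :n_cols]   # shape (n, n_cols)
    z = np.linalg.eigh(g_col)[1][:, ::-1][:, :n_rows]   # shape (m, n_rows)
    return np.einsum("ir,kij,jc->krc", z, stack, x)     # Z^T A_k X for every k
```

In this sketch each video would contribute one (x, y, t) point set: `r_transform_3d` turns it into a surface over plane orientations, and `two_directional_2dpca` compresses a stack of such surfaces along both rows and columns. Centering on the centroid and scaling the offset bins by the point-set radius are what make the surface insensitive to translation and uniform scaling here; how the paper achieves its stated invariances may differ.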