
Effective Codebooks for Human Action Categorization

Abstract: In this paper we propose a new method for human action categorization that combines novel gradient and optic flow descriptors with a more effective codebook, one that models the ambiguity of feature assignment in the traditional bag-of-words model. Recent approaches have represented video sequences using a bag of spatio-temporal visual words, following the successful results achieved in object and scene classification. Codebooks are usually obtained by k-means clustering and hard assignment of visual features to the best representing codeword. Our main contribution is two-fold. First, we define a new 3D gradient descriptor that, combined with optic flow, outperforms the state of the art without requiring fine parameter tuning. Second, we show that for spatio-temporal features the popular k-means algorithm is insufficient, because cluster centers are attracted by the denser regions of the sample distribution; the result is a non-uniform description of the feature space that fails to code other informative regions. Therefore, we apply a radius-based clustering method and a soft assignment that considers the information of two or more relevant candidates. This approach generates a more effective codebook, resulting in a further improvement in classification performance. We extensively test our approach on the standard KTH and Weizmann action datasets, demonstrating its validity and outperforming other recent approaches.
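
As a rough illustration of the codebook idea described in the abstract, the sketch below builds a codebook with a simple radius-based rule and then forms a bag-of-words histogram with soft assignment. It is only a minimal sketch under stated assumptions, not the authors' implementation: the greedy "leader" clustering is a stand-in for the radius-based method, the Gaussian weighting over the `top_k` nearest codewords is one possible soft assignment, and the names `radius_codebook`, `soft_histogram`, `radius`, `sigma` and `top_k` are hypothetical.

```python
import numpy as np


def radius_codebook(features, radius):
    """Greedy radius-based clustering (illustrative stand-in).

    A feature becomes a new codeword when it lies farther than `radius`
    from every codeword chosen so far, so codewords cover the feature
    space roughly uniformly instead of concentrating in dense regions.
    """
    features = np.asarray(features, dtype=float)
    centers = [features[0]]
    for x in features[1:]:
        dists = np.linalg.norm(np.asarray(centers) - x, axis=1)
        if dists.min() > radius:
            centers.append(x)
    return np.asarray(centers)


def soft_histogram(features, codebook, sigma, top_k=2):
    """Bag-of-words histogram with soft assignment (assumed Gaussian kernel).

    Each feature spreads one unit of mass over its `top_k` nearest
    codewords, weighted by a Gaussian of the distance, instead of
    voting only for the single nearest codeword.
    """
    features = np.asarray(features, dtype=float)
    hist = np.zeros(len(codebook))
    for x in features:
        dists = np.linalg.norm(codebook - x, axis=1)
        nearest = np.argsort(dists)[:top_k]
        sq = dists[nearest] ** 2
        # Subtract the smallest squared distance for numerical stability;
        # the nearest codeword then always gets weight 1 before normalizing.
        weights = np.exp(-(sq - sq.min()) / (2.0 * sigma ** 2))
        hist[nearest] += weights / weights.sum()
    return hist / hist.sum()


# Toy data: two well-separated groups of 2D "descriptors".
toy_features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
toy_codebook = radius_codebook(toy_features, radius=1.0)
toy_hist = soft_histogram(toy_features, toy_codebook, sigma=1.0)
```

In this toy setting the two groups are far apart relative to `radius`, so each group yields one codeword. Real spatio-temporal descriptors would be high-dimensional, and both `radius` and `sigma` would need to be chosen for the data.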


Citation:

Ballan, Lamberto; Bertini, Marco; Del Bimbo, Alberto; Seidenari, Lorenzo; Serra, Giuseppe. "Effective Codebooks for Human Action Categorization". Proc. of ICCV International Workshop on Video-oriented Object and Event Classification (VOEC), Kyoto, Japan, pp. 506-513, Sept. 27-Oct. 4, 2009. DOI: 10.1109/ICCVW.2009.5457658
