Abstract
H-008
Disentanglement Approach for Video Action Recognition
Charles Lima Sanches, Yaman Dang, Takashi Kanemaru, Yuichi Nonaka, Yuto Komatsu (Hitachi)
Since the advent of deep learning, video action recognition models have been widely used to ensure safety in public areas. However, a model trained in a specific setting (with a given background, camera angle, illumination, etc.) struggles to maintain its performance in different settings. Disentanglement is a recent technique that encodes the different factors of the data separately, and it appears to be a promising approach for making models more robust to setting changes. In this paper, we investigate the effect of background disentanglement on the action recognition task. We show that while classification accuracy reaches 91% on simple datasets, it drops to 78% on more complex ones. We propose a new approach that raises accuracy to 94% on complex datasets.
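The abstract does not describe the authors' architecture, but the core idea of background disentanglement can be sketched as follows: the encoder's latent code is split into background-specific and action-specific parts, and the action classifier reads only the action part, so background changes cannot directly influence its prediction. The dimensions, the linear encoder, and the fixed split point below are all illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not specify an architecture.
FRAME_DIM = 64    # flattened per-frame features
LATENT_DIM = 16   # total latent code size
BG_DIM = 8        # latent slots reserved for background factors
N_ACTIONS = 4

# A linear encoder standing in for a trained network (random weights here).
W_enc = rng.standard_normal((FRAME_DIM, LATENT_DIM)) * 0.1
# The classifier sees ONLY the non-background latent slots.
W_cls = rng.standard_normal((LATENT_DIM - BG_DIM, N_ACTIONS)) * 0.1

def encode(frame):
    """Map a frame to a latent code split into background / action parts."""
    z = frame @ W_enc
    z_bg, z_action = z[:BG_DIM], z[BG_DIM:]
    return z_bg, z_action

def classify(frame):
    """Predict an action label from the action-specific latent only."""
    _, z_action = encode(frame)
    logits = z_action @ W_cls
    return int(np.argmax(logits))

frame = rng.standard_normal(FRAME_DIM)
label = classify(frame)
print(label)  # an action index in [0, N_ACTIONS)
```

In a trained model, an auxiliary objective (e.g. reconstructing the background from `z_bg` alone, or an adversarial loss discouraging background information in `z_action`) would enforce the split; here the split is purely structural, which is enough to show why the classifier becomes insensitive to background factors.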