Address

Institute of Cyber-Systems and Control, Yuquan Campus, Zhejiang University, Hangzhou, Zhejiang, China

Contact Information

Email: muboyu@zju.edu.cn

Boyu Mu

M.S. Student

Institute of Cyber-Systems and Control, Zhejiang University, China

Biography

I am pursuing my M.S. degree at the College of Control Science and Engineering, Zhejiang University, Hangzhou, China. My main research interests include deep learning, neural network quantization, and object detection.

Research and Interests

  • Neural network quantization
  • Computer Vision

Publications

  • Jiazheng Xing, Mengmeng Wang, Boyu Mu, and Yong Liu. Revisiting the Spatial and Temporal Modeling for Few-Shot Action Recognition. In 37th AAAI Conference on Artificial Intelligence (AAAI), 2023.
    Spatial and temporal modeling is one of the most core aspects of few-shot action recognition. Most previous works mainly focus on long-term temporal relation modeling based on high-level spatial representations, without considering the crucial low-level spatial features and short-term temporal relations. Actually, the former feature could bring rich local semantic information, and the latter feature could represent motion characteristics of adjacent frames, respectively. In this paper, we propose SloshNet, a new framework that revisits the spatial and temporal modeling for few-shot action recognition in a finer manner. First, to exploit the low-level spatial features, we design a feature fusion architecture search module to automatically search for the best combination of the low-level and high-level spatial features. Next, inspired by the recent transformer, we introduce a long-term temporal modeling module to model the global temporal relations based on the extracted spatial appearance features. Meanwhile, we design another short-term temporal modeling module to encode the motion characteristics between adjacent frame representations. After that, the final predictions can be obtained by feeding the embedded rich spatial-temporal features to a common frame-level class prototype matcher. We extensively validate the proposed SloshNet on four few-shot action recognition datasets, including Something-Something V2, Kinetics, UCF101, and HMDB51. It achieves favorable results against state-of-the-art methods in all datasets.
    @inproceedings{xing2023rst,
      title     = {Revisiting the Spatial and Temporal Modeling for Few-Shot Action Recognition},
      author    = {Jiazheng Xing and Mengmeng Wang and Boyu Mu and Yong Liu},
      year      = {2023},
      booktitle = {37th AAAI Conference on Artificial Intelligence (AAAI)},
      doi       = {10.48550/arXiv.2301.07944},
      abstract  = {Spatial and temporal modeling is one of the most core aspects of few-shot action recognition. Most previous works mainly focus on long-term temporal relation modeling based on high-level spatial representations, without considering the crucial low-level spatial features and short-term temporal relations. Actually, the former feature could bring rich local semantic information, and the latter feature could represent motion characteristics of adjacent frames, respectively. In this paper, we propose SloshNet, a new framework that revisits the spatial and temporal modeling for few-shot action recognition in a finer manner. First, to exploit the low-level spatial features, we design a feature fusion architecture search module to automatically search for the best combination of the low-level and high-level spatial features. Next, inspired by the recent transformer, we introduce a long-term temporal modeling module to model the global temporal relations based on the extracted spatial appearance features. Meanwhile, we design another short-term temporal modeling module to encode the motion characteristics between adjacent frame representations. After that, the final predictions can be obtained by feeding the embedded rich spatial-temporal features to a common frame-level class prototype matcher. We extensively validate the proposed SloshNet on four few-shot action recognition datasets, including Something-Something V2, Kinetics, UCF101, and HMDB51. It achieves favorable results against state-of-the-art methods in all datasets.}
    }
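
    As a rough illustration of the frame-level class prototype matching step mentioned in the abstract above, the following PyTorch-style sketch (not the authors' released code; the tensor shapes, cosine-similarity metric, and function names are illustrative assumptions) averages support-set frame features into per-class prototypes and scores a query video by frame-wise similarity to each prototype.

    import torch
    import torch.nn.functional as F

    def frame_level_prototype_match(query, support, support_labels, n_classes):
        """query: (T, D) frame features of one query video.
        support: (N, T, D) frame features of N support videos.
        support_labels: (N,) integer class labels in [0, n_classes)."""
        # Build one prototype per class by averaging that class's support features.
        prototypes = torch.stack([
            support[support_labels == c].mean(dim=0)   # (T, D)
            for c in range(n_classes)
        ])                                              # (C, T, D)
        # Cosine similarity between corresponding frames of query and prototypes.
        q = F.normalize(query, dim=-1)                  # (T, D)
        p = F.normalize(prototypes, dim=-1)             # (C, T, D)
        sims = (p * q.unsqueeze(0)).sum(dim=-1)         # (C, T)
        # Average per-frame similarities into one score per class.
        return sims.mean(dim=-1)                        # (C,)

    # Example: a 5-way, 2-shot task with 8 frames and 512-d features.
    scores = frame_level_prototype_match(
        torch.randn(8, 512),                 # query video
        torch.randn(10, 8, 512),             # 10 support videos
        torch.arange(5).repeat(2),           # two support videos per class
        n_classes=5)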