Bofeng Jiang
MS Student
Institute of Cyber-Systems and Control, Zhejiang University, China
Biography
I am pursuing my M.S. degree in the College of Control Science and Engineering, Zhejiang University, Hangzhou, China. My major research interests are network quantization and FPGAs.
Research Interests
- Deep Learning
- Network compression
- Network quantization
Publications
- Bofeng Jiang, Jun Chen, and Yong Liu. Single-Shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration. Engineering Applications of Artificial Intelligence, 126:106816, 2023.
Abstract: Applying CNNs on embedded systems is challenging due to model size limitations. Pruning and quantization can help, but are time-consuming to apply separately. Our Single-Shot Pruning and Quantization strategy addresses these issues by quantizing and pruning in a single process. We evaluated our method on the CIFAR-10 and CIFAR-100 datasets for image classification. Our model is 69.4% smaller with little accuracy loss and runs 6-8 times faster on NVIDIA Xavier NX hardware.
@article{jiang2023ssp,
  title   = {Single-Shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration},
  author  = {Bofeng Jiang and Jun Chen and Yong Liu},
  year    = {2023},
  journal = {Engineering Applications of Artificial Intelligence},
  volume  = {126},
  pages   = {106816},
  doi     = {10.1016/j.engappai.2023.106816}
}
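For readers unfamiliar with joint pruning and quantization, the sketch below illustrates the general idea in PyTorch: magnitude pruning and symmetric uniform fake quantization applied to a layer's weights in a single pass. It is a minimal, hypothetical example, not the algorithm from the paper; the thresholding rule, bit-width, and example layer are illustrative assumptions only.

```python
# Hypothetical sketch (not the paper's algorithm): prune and quantize a
# layer's weights in one pass -- magnitude pruning followed by symmetric
# uniform fake quantization of the surviving weights.
import torch
import torch.nn as nn


def prune_and_quantize(weight: torch.Tensor, sparsity: float = 0.5, bits: int = 8) -> torch.Tensor:
    """Zero out the smallest-magnitude weights, then uniformly quantize the rest."""
    k = int(weight.numel() * sparsity)  # number of weights to remove
    if k > 0:
        threshold = weight.abs().flatten().kthvalue(k).values
        mask = (weight.abs() > threshold).to(weight.dtype)
    else:
        mask = torch.ones_like(weight)
    pruned = weight * mask

    qmax = 2 ** (bits - 1) - 1                        # e.g. 127 for signed 8-bit
    scale = pruned.abs().max().clamp(min=1e-8) / qmax
    quantized = torch.round(pruned / scale).clamp(-qmax, qmax) * scale
    return quantized * mask  # re-apply mask so rounding cannot revive pruned weights


if __name__ == "__main__":
    conv = nn.Conv2d(16, 32, kernel_size=3)
    with torch.no_grad():
        conv.weight.copy_(prune_and_quantize(conv.weight, sparsity=0.7, bits=8))
    print("non-zero fraction:", (conv.weight != 0).float().mean().item())
```

In practice such a transform would be applied or learned during training (e.g. with a straight-through estimator) rather than as a one-off post-processing step; the sketch only shows the weight transformation itself.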
- Junyu Zhu, Lina Liu, Bofeng Jiang, Feng Wen, Hongbo Zhang, Wanlong Li, and Yong Liu. Self-Supervised Event-Based Monocular Depth Estimation Using Cross-Modal Consistency. In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 7704-7710, 2023.
Abstract: An event camera is a novel vision sensor that captures per-pixel brightness changes and outputs a stream of asynchronous “events”. It has advantages over conventional cameras in scenes with high-speed motion and challenging lighting conditions because of its high temporal resolution, high dynamic range, low bandwidth, low power consumption, and lack of motion blur. Several supervised methods for monocular depth estimation from events have therefore been proposed to address scenes that are difficult for conventional cameras. However, depth annotation is costly and time-consuming. In this paper, to lower the annotation cost, we propose a self-supervised event-based monocular depth estimation framework named EMoDepth. EMoDepth constrains the training process using cross-modal consistency with intensity frames that are aligned with events in pixel coordinates. Moreover, at inference time, only events are used for monocular depth prediction. Additionally, we design a multi-scale skip-connection architecture that effectively fuses features for depth estimation while maintaining high inference speed. Experiments on the MVSEC and DSEC datasets demonstrate that our contributions are effective and that the accuracy outperforms existing supervised event-based and unsupervised frame-based methods.
@inproceedings{zhu2023sse,
  title     = {Self-Supervised Event-Based Monocular Depth Estimation Using Cross-Modal Consistency},
  author    = {Junyu Zhu and Lina Liu and Bofeng Jiang and Feng Wen and Hongbo Zhang and Wanlong Li and Yong Liu},
  year      = {2023},
  booktitle = {2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  pages     = {7704-7710},
  doi       = {10.1109/IROS55552.2023.10342434}
}
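As background, the sketch below shows one common way to build a cross-modal photometric consistency loss in PyTorch: depth predicted for the target view is used to inverse-warp an aligned intensity frame from a source view, and the reconstruction error supervises the depth network. This is a generic self-supervised depth recipe under assumed tensor shapes, intrinsics, and pose inputs; it is not EMoDepth's actual loss, warping code, or architecture.

```python
# Hypothetical sketch of a cross-modal photometric consistency loss for
# self-supervised depth. All shapes, names, and inputs are assumptions.
import torch
import torch.nn.functional as F


def warp_source_to_target(src_img, depth, K, K_inv, T_src_from_tgt):
    """Inverse-warp a source intensity frame (B,1,H,W) into the target view.

    depth: (B,1,H,W) target-view depth, K / K_inv: (B,3,3) camera intrinsics,
    T_src_from_tgt: (B,4,4) relative pose from the target to the source camera.
    """
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=depth.dtype, device=depth.device),
        torch.arange(w, dtype=depth.dtype, device=depth.device),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(1, 3, -1).expand(b, -1, -1)

    # Back-project target pixels to 3D, then move them into the source camera frame.
    cam_pts = depth.view(b, 1, -1) * (K_inv @ pix)
    ones = torch.ones(b, 1, h * w, dtype=depth.dtype, device=depth.device)
    cam_pts = torch.cat([cam_pts, ones], dim=1)
    src_pts = K @ (T_src_from_tgt @ cam_pts)[:, :3]

    # Project into the source image and normalize to [-1, 1] for grid_sample.
    uv = src_pts[:, :2] / src_pts[:, 2:3].clamp(min=1e-6)
    u = 2.0 * uv[:, 0].view(b, h, w) / (w - 1) - 1.0
    v = 2.0 * uv[:, 1].view(b, h, w) / (h - 1) - 1.0
    grid = torch.stack([u, v], dim=-1)
    return F.grid_sample(src_img, grid, align_corners=True, padding_mode="border")


def photometric_loss(target_img, warped_src):
    """L1 cross-modal consistency between the target frame and the warped source."""
    return (target_img - warped_src).abs().mean()
```

Because only the warped intensity frames enter the loss, the depth network itself can take events alone as input, which matches the general idea of training with frames but predicting from events at inference time.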