Address

Room 101, Institute of Cyber-Systems and Control, Yuquan Campus, Zhejiang University, Hangzhou, Zhejiang, China

Contact Information

Email: jiatengwei@zju.edu.cn

Jiateng Wei

MS Student

Institute of Cyber-Systems and Control, Zhejiang University, China

Biography

I am pursuing my master’s degree in Control Engineering at Zhejiang University, Hangzhou, China. My main research interests lie in network pruning, quantization, and model deployment.

Research and Interests

  • Network Pruning
  • Neural Network Deployment

Publications

  • Jiateng Wei, Quan Lu, Ning Jiang, Siqi Li, Jingyang Xiang, Jun Chen, and Yong Liu. Structured Optimal Brain Pruning for Large Language Models. In The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 13991-14007, 2024.
    Abstract: The massive parameters and computational demands hinder the widespread application of Large Language Models (LLMs). Network pruning provides a practical solution to this problem. However, existing pruning works for LLMs mainly focus on unstructured pruning or necessitate post-pruning fine-tuning. The former relies on special hardware to accelerate computation, while the latter may need substantial computational resources. In this paper, we introduce a retraining-free structured pruning method called SoBP (Structured Optimal Brain Pruning). It leverages global first-order information to select pruning structures, then refines them with a local greedy approach, and finally adopts module-wise reconstruction to mitigate information loss. We assess the effectiveness of SoBP across 14 models from 3 LLM families on 8 distinct datasets. Experimental results demonstrate that SoBP outperforms current state-of-the-art methods.
    @inproceedings{wei2024sob,
    title = {Structured Optimal Brain Pruning for Large Language Models},
    author = {Jiateng Wei and Quan Lu and Ning Jiang and Siqi Li and Jingyang Xiang and Jun Chen and Yong Liu},
    year = 2024,
    booktitle = {The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
    pages = {13991--14007},
    doi = {10.18653/v1/2024.emnlp-main.775},
    abstract = {The massive parameters and computational demands hinder the widespread application of Large Language Models (LLMs). Network pruning provides a practical solution to this problem. However, existing pruning works for LLMs mainly focus on unstructured pruning or necessitate post-pruning fine-tuning. The former relies on special hardware to accelerate computation, while the latter may need substantial computational resources. In this paper, we introduce a retraining-free structured pruning method called SoBP (Structured Optimal Brain Pruning). It leverages global first-order information to select pruning structures, then refines them with a local greedy approach, and finally adopts module-wise reconstruction to mitigate information loss. We assess the effectiveness of SoBP across 14 models from 3 LLM families on 8 distinct datasets. Experimental results demonstrate that SoBP outperforms current state-of-the-art methods.}
    }