Yuzhang Shang

Office: UCF Global 216

I am an Assistant Professor in the Department of Computer Science and the Institute of Artificial Intelligence at the University of Central Florida (UCF), where I direct the Efficient, Scalable and Accelerated Learning Laboratory (EXceL-Lab).

I obtained my Ph.D. in Computer Science from the Illinois Institute of Technology (2021-2025), advised by Prof. Yan Yan (now at the University of Illinois Chicago, UIC) and co-advised by Prof. Peng-Jun Wan.

During my Ph.D., I was named an MLCommons Rising Star in 2025. I was a visiting student at the University of Wisconsin-Madison hosted by Prof. Yong Jae Lee (the lab behind LLaVA). I interned as a researcher at Google DeepMind, working with Daniele Moro, Weijun Wang, and Andrew Howard (the team behind MobileNet), and as a research scientist at Cisco Research, working with Gaowen Liu and Ramana Kompella. Before my Ph.D., I was a research assistant at Shandong University and at the Hong Kong University of Science and Technology (HKUST), supervised by Prof. Liqiang Nie and Prof. Dan Xu, respectively. I received dual bachelor's degrees in Applied Mathematics and Economics from Wuhan University.

[Hiring!] I am actively looking for postdocs, Ph.D. students, visiting scholars/students, and master's/undergraduate researchers. If you are interested in working with me, feel free to contact me at yuzhang.hire@gmail.com. UCF@CSRankings: Computer Vision (8th), AI (44th), Overall (54th).

Besides research, I am a contract photographer for Shutterstock Images and Getty Images (my portfolio).

Selected Publications

For a full list of publications, please refer to this page.

  1. LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
    Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, and Yan Yan
    International Conference on Computer Vision (ICCV) 2025
    The first token reduction method for accelerating multimodal LLMs.
  2. PB-LLM: Partially Binarized Large Language Models
    Yuzhang Shang, Zhihang Yuan, and Zhen Dong
    International Conference on Learning Representations (ICLR) 2024
    The first binarization exploration for LLMs.
  3. PTQ4DM: Post-training Quantization on Diffusion Models
    Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, and Yan Yan
    Computer Vision and Pattern Recognition (CVPR) 2023
    The first network compression method for diffusion models.
  4. ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
    Zhihang Yuan*, Yuzhang Shang*, Yue Song, Qiang Wu, Yan Yan, and Guangyu Sun
    arXiv Dec. 2023
    The first low-rank decomposition method for LLMs. The concept of low-rank attention was later adopted in DeepSeek-v2 (six months afterward).