About me

I am a third-year Ph.D. student supervised by Prof. Tat-Seng Chua at NExT++, School of Computing, National University of Singapore. Before that, I spent one year at THUNLP, Tsinghua University, as a research assistant supervised by Prof. Zhiyuan Liu. During my undergraduate studies at Nanjing University, I was a member of MAGUS, supervised by Prof. Tongwei Ren. My research interests include scene graph generation and pre-trained models for vision and language understanding.

Chat with [VL-Vicuna], built with our VPGTrans!

Preprint

* indicates equal contribution.

  • CPT: Colorful prompt tuning for pre-trained vision-language models. [arxiv] [code]
    Yuan Yao*, Ao Zhang*, Zhengyan Zhang, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun.

  • Pre-Trained Models: Past, Present and Future. [arxiv]
    Xu Han*, Zhengyan Zhang*, Ning Ding*, Yuxian Gu*, Xiao Liu*, Yuqi Huo*, Jiezhong Qiu, Yuan Yao, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu.

Publications

2023

  • NExT-Chat: An LMM for Chat, Detection and Segmentation. [website] [arxiv] [code]
    Ao Zhang, Yuan Yao#, Wei Ji, Zhiyuan Liu, and Tat-Seng Chua. (#Correspondence)
    arXiv

  • Transfer Visual Prompt Generator across LLMs. [demo] [arxiv] [code]
    Ao Zhang, Hao Fei#, Yuan Yao#, Wei Ji, Li Li, Zhiyuan Liu, and Tat-Seng Chua. (#Correspondence)
    Conference on Neural Information Processing Systems (NeurIPS 2023)

2022

  • Fine-Grained Scene Graph Generation with Data Transfer. [arxiv] [code]
    Ao Zhang*, Yuan Yao*, Qianyu Chen, Wei Ji, Zhiyuan Liu, Maosong Sun, Tat-Seng Chua.
    European Conference on Computer Vision (ECCV 2022)
    (Oral Presentation, 2.7%)

  • PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models. [arxiv] [code]
    Yuan Yao, Qianyu Chen, Ao Zhang, Wei Ji, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun.
    Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)

2021

  • Visual Distant Supervision for Scene Graph Generation. [arxiv] [code]
    Yuan Yao*, Ao Zhang*, Xu Han, Mengdi Li, Cornelius Weber, Zhiyuan Liu, Stefan Wermter, Maosong Sun.
    International Conference on Computer Vision (ICCV 2021)

Previous

  • Monocular image based 3D model retrieval. [paper]
    Wenhui Li, Anan Liu, Weizhi Nie, Dan Song, Yuqian Li, Weijie Wang, Shu Xiang, Heyu Zhou, Ngoc-Minh Bui, Yunchi Cen, Zenian Chen, Huy-Hoang Chung-Nguyen, Gia-Han Diep, Trong-Le Do, Eugeni L. Doubrovski, Anh-Duc Duong, Jo M. P. Geraedts, Haobin Guo, Trung-Hieu Hoang, Yichen Li, Xing Liu, Zishun Liu, Duc-Tuan Luu, Yunsheng Ma, Vinh-Tiep Nguyen, Jie Nie, Tongwei Ren, Mai-Khiem Tran, Son-Thanh Tran-Nguyen, Minh-Triet Tran, The-Anh Vu-Le, Charlie C. L. Wang, Shijie Wang, Gangshan Wu, Caifei Yang, Meng Yuan, Hao Zhai, Ao Zhang, Fan Zhang, and Sicheng Zhao.
    Eurographics Workshop on 3D Object Retrieval (EGW’19-3DOR), Genoa, Italy, 2019.

Book Chapter

  • RGB-D salient object detection: a review. Chapter of book “RGB-D image analysis and processing”, edited by Paul Rosin, Yu-Kun Lai, Ling Shao, and Yonghuai Liu, 2019. [link]
    Tongwei Ren and Ao Zhang.