Selected Publications
Check out the full publication list at my Google Scholar profile.
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation
ICML 2026
Dongxing Mao†, Alex Jinpeng Wang†, Jiawei Zhang, Weiming Han, Zhuobai Dong, Linjie Li, Yiqi Lin, Zhengyuan Yang, Libo Qin, Fuwei Zhang, Lijuan Wang, Min Li. († equal contribution)
[Project Page][Datasets][arXiv]
[Github]
100K+ downloads on Hugging Face
Residual Decoder Adapter: ID-Preserving Tokenizer Adaption for Autoregressive Text Rendering
CVPR 2026
Dongxing Mao†, Alex Jinpeng Wang†, Jiahao Tang, Kevin Qinghong Lin, Linjie Li, Zhengyuan Yang, Lijuan Wang, Min Li, Jingru Tan.
[Project Page][arXiv]
[Github]
TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering
AAAI 2026
Dongxing Mao, Yilin Wang, Linjie Li, Zhengyuan Yang, Alex Jinpeng Wang.
[Project Page][Datasets][arXiv]
[Github]
VCode: A Multimodal Coding Benchmark with SVG as Symbolic Visual Representation
CVPR 2026, Visual Concepts Workshop Oral
Kevin Qinghong Lin†, Yuhao Zheng†, Hangyu Ran†, Dongxing Mao, Linjie Li, Philip Torr, Alex Jinpeng Wang. († equal contribution)
[Project Page][arXiv][Github]
- AssistGUI: Task-oriented Desktop Graphical User Interface Automation, Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou, CVPR 2024.
- VideoLLM-online: Towards Large Video Language Model for Streaming Video, Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, JiaWei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou, CVPR 2024.
- AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant, Stan Weixian Lei, Yuxuan Wang, Dongxing Mao, Difei Gao, Mike Zheng Shou, EMNLP 2022.
- AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant, Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou, ECCV 2022.



