Sixun Dong

PhD Student in Computer Science

Arizona State University

I am a PhD student at Arizona State University, supervised by Professor Yanjie Fu. I completed my Master's at ShanghaiTech University under Professor Shenghua Gao.

My research focuses on multimodal AI systems that bridge computer vision, natural language processing, and machine learning across a range of applications.

Recent News

May 2025 Started a GenAI Research Internship at Zoom Inc., focusing on efficient vision-language modeling
Aug 2024 Started the PhD program at Arizona State University
Feb 2024 Paper on MLLM-Tool accepted to WACV 2024
Mar 2023 Paper on WeakSVR accepted to CVPR 2023
Mar 2022 Paper on TransRAC accepted as 🏆 oral presentation to CVPR 2022

Selected Publications

Under Review Teaching Time Series to See and Speak: Forecasting with Aligned Visual and Textual Perspectives

Sixun Dong, Wei Fan, Teresa Wu, Yanjie Fu

Under Review TimesFrame: Multi-Variable Time Series is a Video of Numerical Data

Sixun Dong, Nanxu Gong, Haoyue Bai, Xinyuan Wang, Wangyang Ying, Wei Fan, Yanjie Fu

WACV MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning

Chenyu Wang, Weixin Luo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, Shenghua Gao

CVPR 2023 Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos

Sixun Dong*, Huazhang Hu*, Dongze Lian, Weixin Luo, Yicheng Qian, Shenghua Gao

CVPR 2022 🏆 Oral TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting

Huazhang Hu*, Sixun Dong*, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao

Experience

GenAI Research Intern

Zoom Inc., GenAI Research Group

May 2025 - Present

Working on vision-language models (VLMs) and LLM agents.

Research Intern (Team Leader)

DGene, Digital Human Algorithm Department

Aug 2023 - Jan 2024

Led digital human projects: co-speech gesture generation and 3D human body reconstruction with <7% measurement error.

Research Intern (Team Leader)

Transsion Holdings, Audio-Video Generation Department

Apr 2023 - Aug 2023

Led audio-driven talking head video generation research, achieving state-of-the-art (SoTA) performance on commercial and academic benchmarks.

Academic Service

Reviewer

Conferences: CVPR 2023-2025, ICCV 2023-2025, NeurIPS 2025, ECCV 2024, ACCV 2024, ACM MM 2023-2025, KDD 2024

Journals: TMM, Neural Networks, TKDD

Education

2024 - Present

PhD in Computer Science

Arizona State University, USA

Focus: Multimodal Learning, Computer Vision, LLM Agents

2021 - 2024

M.S. in Computer Science

ShanghaiTech University, China

SVIP-Lab, Advisor: Prof. Shenghua Gao

2016 - 2020

B.E. in Computer Science (Dual Degree)

Dalian University of Technology, China

2016 - 2020

B.E. in Process Equipment and Control Engineering

Dalian University of Technology, China