
Sixun Dong (Ironieser)
PhD Student in Computer Science
Arizona State University
I am a PhD student in Computer Science at Arizona State University. I completed my Master's at ShanghaiTech University under Professor Shenghua Gao.
My research focuses on multimodal AI systems that bridge computer vision, natural language processing, and machine learning across a range of applications.
Selected Publications

arXiv'2508 MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs

arXiv'2508 LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
Paper / Dataset (Coming Soon)


Experience

GenAI Research Intern
Zoom Inc., GenAI Research Group
May 2025 - Present
Working on vision-language models (VLMs) and LLM agents.

Research Intern (Team Leader)
DGene, Digital Human Algorithm Department
Aug 2023 - Jan 2024
Led digital human projects: co-speech gesture generation and 3D human body reconstruction with <7% measurement error.

Research Intern (Team Leader)
Transsion Holdings, Audio-Video Generation Department
Apr 2023 - Aug 2023
Led research on audio-driven talking head video generation, achieving state-of-the-art performance on commercial and academic benchmarks.
Academic Service
Reviewer
Conferences: CVPR 2023-2025, ICCV 2023-2025, NeurIPS 2025, ECCV 2024, ACCV 2024, ACM MM 2023-2025, KDD 2024
Journals: TMM, Neural Networks, TKDD
Education
Ph.D. in Computer Science
Arizona State University, USA
Focus: Multimodal Learning, Computer Vision, LLM Agents
M.S. in Computer Science
ShanghaiTech University, China
SVIP-Lab, Advisor: Prof. Shenghua Gao
B.E. in Computer Science (Dual Degree)
Dalian University of Technology, China
B.E. in Process Equipment and Control Engineering
Dalian University of Technology, China