Sixun Dong (Ironieser)
Multimodal Learning, VLM, LLM Agent
Independent Researcher, AZ, USA
I am currently an independent researcher. I completed my Master's degree at ShanghaiTech University, advised by Professor Shenghua Gao.
My research focuses on multimodal AI systems that bridge computer vision, natural language processing, and machine learning across a range of applications.
Recent News
Selected Publications
Experience
GenAI Research Intern
Zoom Inc., GenAI Research Group
May 2025 - Aug 2025
Worked on VLMs and LLM agents. Published one first-author paper on efficient VLM inference and two collaborative papers on LLM evaluation.
Research Intern (Team Leader)
DGene, Digital Human Algorithm Department
Aug 2023 - Jan 2024
Led digital human projects: co-speech gesture generation and 3D human body reconstruction with <7% measurement error.
Research Intern (Team Leader)
Transsion Holdings, Audio-Video Generation Department
Apr 2023 - Aug 2023
Led research on audio-driven talking head video generation, achieving SoTA performance on commercial and academic benchmarks.
Academic Service
Reviewer
Conferences: CVPR 2023+, ICCV 2023+, ECCV 2024+, NeurIPS 2025+, ICLR 2026+, ACCV 2024, ACM MM 2023-2025, KDD 2024
Journals: TMM, Neural Networks, TKDD
Education
M.S. in Computer Science
ShanghaiTech University, China
SVIP-Lab, Advisor: Prof. Shenghua Gao
B.E. in Computer Science (Dual Degree)
Dalian University of Technology, China
B.E. in Process Equipment and Control Engineering
Dalian University of Technology, China