Publications
Currently, my research focuses on multimodal AI systems that bridge computer vision, natural language processing, and machine learning. I work on developing intelligent agents that can understand and generate content across multiple modalities, with applications in video analysis, time series forecasting, and feature transformation.
2025

arXiv'2508 MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs
Paper (Coming Soon) / Code (Coming Soon) / Homepage

arXiv'2508 LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
Paper / Dataset (Coming Soon)


Under Review TimesFrame: Multi-Variable Time Series is a Video of Numerical Data
Paper (Coming Soon)

arXiv'2505 Agentic Feature Augmentation: Unifying Selection and Generation with Teaming, Planning, and Memories


Under Review MECT: From Multimodal Knowledge Acquisition To Contrastive Embedding Construction For Generative Feature Transformation
Paper (Coming Soon)
Other Publications
Auto-updated based on Google Scholar (last synced: Aug 11, 2025)

arXiv'2506 LLM-ML Teaming: Integrated Symbolic Decoding and Gradient Search for Valid and Stable Generative Feature Transformation

arXiv'2506 Efficient Post-Training Refinement of Latent Reasoning in Large Language Models

arXiv'2505 Brownian Bridge Augmented Surrogate Simulation and Injection Planning for Geological CO2 Storage

arXiv'2505 Bridging the Domain Gap in Equation Distillation with Reinforcement Feedback