Publications - Jiaqi Tang

*: Equal Contribution, †: Corresponding Author.

Research Papers

2026

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
Jiaqi Tang*, Jianmin Chen*, Youyang Zhai*, Wei Wei†, Runtao Liu, Mengjie Zhao, Xiangyu Wu, Qingfa Xiao, Qifeng Chen†
International Conference on Machine Learning (ICML), 2026
arXiv / code / model / demo / bibtex

Media Report: QbitAI (量子位) / HuggingFace Daily Papers

A unified MLLM that self-recovers corrupted visual content and reasons over it for robust visual understanding.

LongVideoAgent: Multi-Agent Reasoning with Long Videos
Runtao Liu, Ziyi Liu, Jiaqi Tang, Yue Ma, Renjie Pi, Jipeng Zhang, Qifeng Chen
Annual Meeting of the Association for Computational Linguistics (ACL), 2026
arXiv / code / project page / bibtex

Media Report: HuggingFace Daily Papers

Multi-agent reasoning with long videos.

Response-G1: Explicit Scene Graph Modeling for Proactive Streaming Video Understanding
Ke Ma*, Jiaqi Tang*, Bin Guo, Xueting Han, Ruonan Xu, Qingfeng He, Ziheng Wang, Xu Wang, Qifeng Chen, Zhiwen Yu, Yunhao Liu
Annual Meeting of the Association for Computational Linguistics (ACL), 2026
arXiv / code / bibtex

Media Report: Synced (机器之心)

Explicit scene graph modeling for proactive streaming video understanding.

LPO: Towards Accurate GUI Agent Interaction via Location Preference Optimization
Jiaqi Tang*, Yu Xia*, Yi-Feng Wu*, Yuwei Hu*, Yuhui Chen, Qing-Guo Chen, Xiaogang Xu†, Xiangyu Wu, Hao Lu, Yanqing Ma, Shiyin Lu, Qifeng Chen†
Findings of the Association for Computational Linguistics (ACL Findings), 2026
arXiv / code / bibtex

Location preference optimization for accurate GUI agent interaction.

ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall
Jiayu Yang, Yuxuan Fan, Songning Lai, Shengen Wu, Jiaqi Tang, Chun Kang, Zhijiang Guo, Yutao Yue†
International Conference on Learning Representations (ICLR), 2026
arXiv / code / bibtex

Attribution-controlled knowledge editing for multi-hop factual recall.

Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation
Xiangyu Wu, Dongming Jiang, Yueying Tian, Feng Yu, Qing-Guo Chen, Jiaqi Tang, Yang Yang, Jianfeng Lu†
International Conference on Learning Representations (ICLR), 2026
code / bibtex

Adaptive debiasing approach using Tsallis entropy for test-time adaptation.

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding
Jiaqi Tang*, Jianmin Chen*, Wei Wei†, Xiaogang Xu, Runtao Liu, Xiangyu Wu, Qipeng Xie, Jiafei Wu, Lei Zhang, Qifeng Chen†
AAAI Conference on Artificial Intelligence (AAAI), 2026 (Oral)
arXiv / code / model / data / demo / talk / bibtex

Media Report: AI Era (新智元) / Synced (机器之心) / CVer / HuggingFace Daily Papers / VALSE 2026

Degradation-aware reasoning for robust visual understanding.

Sage Deer: A Super-Aligned Driving Generalist Is Your Copilot
Hao Lu*, Jiaqi Tang*, Jiyao Wang, Yunfan Lu, Xu Cao, Qingyong Hu, Yin Wang, Yuting Zhang, Tianxin Xie, Yunpeng Zhang, Yong Chen, Jiayu Gao, Bin Huang, Dengbo He, Shuiguang Deng, Hao Chen, Ying-Cong Chen
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2026 (Best Paper Honorable Mention)
arXiv / bibtex

A super-aligned driving generalist that serves as a copilot.

Gene-M1: Advancing Cross-Species Genomic Discovery via Taxon-Specific Mixture-of-Experts
Yuhang Li*, Jiaqi Tang*, Jianmin Chen, Yourui Han, Xuequn Shang, Bolin Chen
International Symposium on Bioinformatics Research and Applications (ISBRA), 2026
bibtex

Cross-species genomic discovery via taxon-specific mixture-of-experts.

Responsive Test-Time Model Adaptation for Mobile Applications via Runtime-efficient Sparse Updates
Cheng Fang, Bin Guo, Sicong Liu, Zimu Zhou, Jiaqi Tang, Ke Ma, Shiyan Luo, Geyang Song, Zhiwen Yu
IEEE Transactions on Mobile Computing (TMC), 2026
bibtex

Runtime-efficient sparse updates for responsive test-time model adaptation in mobile applications.

2025

RhythmGaussian: Repurposing Generalizable Gaussian Model For Remote Physiological Measurement
Hao Lu*, Yuting Zhang*, Jiaqi Tang, Bowen Fu, Wenhang Ge, Wei Wei, Kaishun Wu, Ying-Cong Chen
IEEE/CVF International Conference on Computer Vision (ICCV), 2025 (Highlight)
bibtex

Repurposing a generalizable Gaussian model for remote physiological measurement.

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection
Bowen Fu, Wei Wei, Jiaqi Tang, Jiangtao Nie, Xiaogang Xu, Ying-Cong Chen, Lei Zhang
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
bibtex

Fine-grained controllable image stylization via implicit decoupling and adaptive injection.

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Sparsity Activation
Ke Ma, Jiaqi Tang, Fan Dang, Bin Guo†, Sicong Liu, Cheng Fang, Zhui Zhu, Lei Wu, Ying-Cong Chen, Zhiwen Yu, Yunhao Liu†
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025 (Highlight)
code / bibtex

Memory-adaptive test-time adaptation method that dynamically activates sparsity for efficient adaptation.

Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
Qingfa Xiao, Jiachuan Wang, Haoyang Li, Cheng Deng, Jiaqi Tang, Shuangyin Li, Yongqi Zhang, Jun Wang, Lei Chen
arXiv preprint, 2025
arXiv / bibtex

Effective key-value retrieval for long-context LLM inference.

2024

AdaShadow: Responsive Test-time Adaptation for Non-stationary Mobile Environments
Cheng Fang, Sicong Liu, Zimu Zhou, Bin Guo†, Jiaqi Tang, Ke Ma, Zhiwen Yu
ACM Conference on Embedded Networked Sensor Systems (SenSys), 2024 (Best Paper Honorable Mention, Top 7/313)
bibtex

Responsive test-time adaptation framework for non-stationary mobile environments.

Hawk: Learning to Understand Open-World Video Anomalies
Jiaqi Tang*, Hao Lu*, Ruizheng Wu, Xiaogang Xu, Ke Ma, Cheng Fang, Bin Guo, Jiangbo Lu, Qifeng Chen, Ying-Cong Chen†
Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
code / website / bibtex

Learning to understand open-world video anomalies through multi-modal large language models.

Learning to Remove Wrinkled Transparent Film with Polarized Prior
Jiaqi Tang, Ruizheng Wu, Xiaogang Xu, Sixing Hu, Ying-Cong Chen†
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
code / project page / bibtex

Learning-based method to remove wrinkled transparent film using polarized prior information.

Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting
Hao Lu, Jiaqi Tang, Xinli Xu, Xu Cao, Yunpeng Zhang, Guoqing Wang, Dalong Du, Hao Chen, Ying-Cong Chen†
arXiv preprint, 2024
arXiv / code

Scaling multi-camera 3D object detection through a weak-to-strong eliciting approach.

An Incremental Unified Framework for Small Defect Inspection
Jiaqi Tang, Hao Lu, Xiaogang Xu, Ruizheng Wu, Sixing Hu, Tong Zhang, Tsz Wa Cheng, Ming Ge, Ying-Cong Chen†, Fugee Tsung
18th European Conference on Computer Vision (ECCV), 2024
code / project page / bibtex

An incremental unified framework for small defect inspection in industrial applications.

2023

High Dynamic Range Image Reconstruction via Deep Explicit Polynomial Curve Estimation
Jiaqi Tang, Xiaogang Xu, Sixing Hu, Ying-Cong Chen†
26th European Conference on Artificial Intelligence (ECAI), 2023 (Long Oral)
code / arXiv / talk / bibtex

High dynamic range image reconstruction through deep explicit polynomial curve estimation.

2021

NTIRE 2021 Multi-Modal Aerial View Object Classification Challenge
Jerrick Liu, Nathan Inkawhich, Oliver Nina, Radu Timofte, Sahil Jain, Bob Lee, Yuru Duan, Wei Wei, Lei Zhang, Songzheng Xu, Jiaqi Tang, and others
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2021 (Winner Award, 1st Rank)
arXiv / bibtex

Winning entry in the multi-modal aerial view object classification challenge.

Surveys

2026

Intelligent Remote Sensing Agents: A Survey
Jiaqi Tang*, Yingying Yan*, Qianzhou Wang*, Yuyang Xia*, Botong Geng*, Jianmin Chen*, Ke Ma, Youyang Zhai, Qingfeng He, Weigeng Shao, Yunjin Sun, Junwei Dai, Chuxi Chen, Xiaogang Xu, Kelu Yao, Lei Zhang, Wei Wei†, Qifeng Chen†, Antonio Plaza, Yanning Zhang
Survey, 2026
paper / repository

Media Report: Synced (机器之心) / X / Reddit

Curated survey repository with 100+ papers on intelligent remote sensing agents.

2024

GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing
Hao Lu*, Xuesong Niu*, Jiyao Wang*, Yin Wang*, Qingyong Hu*, Jiaqi Tang*, Yuting Zhang, Kaishen Yuan, Bin Huang, Zitong Yu, Dengbo He, Shuiguang Deng, Hao Chen, Ying-Cong Chen†, Shiguang Shan
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024 (Oral)
code / bibtex

Preliminary evaluations for GPT-4V on visual affective computing tasks.
