Kunchang Li
Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling
R Zhang, R Fang, P Gao, W Zhang, K Li, J Dai, Y Qiao, H Li
ECCV 2022
PointCLIP: Point Cloud Understanding by CLIP
R Zhang, Z Guo, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li
CVPR 2022
UniFormer: Unifying convolution and self-attention for visual recognition
K Li, Y Wang, J Zhang, P Gao, G Song, Y Liu, H Li, Y Qiao
TPAMI, 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
K Li, Y Wang, P Gao, G Song, Y Liu, H Li, Y Qiao
ICLR 2022
VideoChat: Chat-Centric Video Understanding
K Li, Y He, Y Wang, Y Li, W Wang, P Luo, Y Wang, L Wang, Y Qiao
arXiv preprint arXiv:2305.06355, 2023
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Y Wang*, K Li*, Y Li*, Y He*, B Huang*, Z Zhao*, H Zhang, J Xu, Y Liu, ...
arXiv preprint arXiv:2212.03191, 2022
Illumination Adaptive Transformer
Z Cui, K Li, L Gu, S Su, P Gao, Z Jiang, Y Qiao, T Harada
BMVC 2022
UniFormerV2: Spatiotemporal learning by arming image ViTs with video UniFormer
K Li, Y Wang, Y He, Y Li, Y Wang, L Wang, Y Qiao
ICCV 2023
Unmasked teacher: Towards training-efficient video foundation models
K Li, Y Wang, Y Li, Y Wang, Y He, L Wang, Y Qiao
ICCV 2023 (Oral)
CT-Net: Channel tensorization network for video classification
K Li, X Li, Y Wang, J Wang, Y Qiao
ICLR 2021
InternVid: A large-scale video-text dataset for multimodal understanding and generation
Y Wang, Y He, Y Li, K Li, J Yu, X Ma, X Li, G Chen, X Chen, Y Wang, C He, ...
ICLR 2024
InternChat: Solving vision-centric tasks by interacting with chatbots beyond language
Z Liu, Y He, W Wang, W Wang, Y Wang, S Chen, Q Zhang, Y Yang, Q Li, ...
arXiv preprint arXiv:2305.05662, 2023
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video
DJ Zhang*, K Li*, Y Chen, Y Wang, S Chandra, Y Qiao, L Liu, MZ Shou
ECCV 2022
InternVideo-Ego4D: A Pack of Champion Solutions to Ego4D Challenges
G Chen, S Xing, Z Chen, Y Wang, K Li, Y Li, Y Liu, J Wang, YD Zheng, ...
ECCV Workshops 2022
Grounded SAM: Assembling open-world models for diverse visual tasks
T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen, X Huang, Y Chen, F Yan, ...
arXiv preprint arXiv:2401.14159, 2024
VideoMamba: State space model for efficient video understanding
K Li, X Li, Y Wang, Y He, Y Wang, L Wang, Y Qiao
arXiv preprint arXiv:2403.06977, 2024
MVBench: A comprehensive multi-modal video understanding benchmark
K Li, Y Wang, Y He, Y Li, Y Wang, Y Liu, Z Wang, J Xu, G Chen, P Luo, ...
CVPR 2024
Self-slimmed vision transformer
Z Zong*, K Li*, G Song, Y Wang, Y Qiao, B Leng, Y Liu
ECCV 2022
A Progressive Difference Method for Capturing Visual Tempos on Action Recognition
X Sheng, K Li, Z Shen, G Xiao
TCSVT, 2022
Video mamba suite: State space model as a versatile alternative for video understanding
G Chen, Y Huang, J Xu, B Pei, Z Chen, Z Li, J Wang, K Li, T Lu, L Wang
arXiv preprint arXiv:2403.09626, 2024