Sparse MLP for image recognition: Is self-attention really necessary? C Tang, Y Zhao, G Wang, C Luo, W Xie, W Zeng Proceedings of the AAAI conference on artificial intelligence 36 (2), 2344-2351, 2022 | 100 | 2022 |
Joint time-frequency and time domain learning for speech enhancement C Tang, C Luo, Z Zhao, W Xie, W Zeng Proceedings of the twenty-ninth international conference on international …, 2021 | 77 | 2021 |
Detect or track: Towards cost-effective video object detection/tracking H Luo, W Xie, X Wang, W Zeng Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 8803-8810, 2019 | 71 | 2019 |
Unifying layout generation with a decoupled diffusion model M Hui, Z Zhang, X Zhang, W Xie, Y Wang, Y Lu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 35 | 2023 |
Learning to Update for Object Tracking with Recurrent Meta-learner B Li, W Xie, W Zeng, W Liu IEEE Transactions on Image Processing, 2019 | 34 | 2019 |
Unsupervised visual representation learning by tracking patches in video G Wang, Y Zhou, C Luo, W Xie, W Zeng, Z Xiong Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 27 | 2021 |
Cross-view feature learning for scalable social image analysis W Xie, Y Peng, J Xiao Proceedings of the AAAI Conference on Artificial Intelligence 28 (1), 2014 | 17 | 2014 |
Weakly-supervised image parsing via constructing semantic graphs and hypergraphs W Xie, Y Peng, J Xiao Proceedings of the 22nd ACM international conference on Multimedia, 277-286, 2014 | 16 | 2014 |
Responsible task automation: Empowering large language models as responsible task automators Z Zhang, X Zhang, W Xie, Y Lu arXiv preprint arXiv:2306.01242, 2023 | 13 | 2023 |
Graph-based multimodal semi-supervised image classification W Xie, Z Lu, Y Peng, J Xiao Neurocomputing 138, 167-179, 2014 | 13 | 2014 |
Semantic graph construction for weakly-supervised image parsing W Xie, Y Peng, J Xiao Proceedings of the AAAI Conference on Artificial Intelligence 28 (1), 2014 | 13 | 2014 |
Motion detection of object W Xie, C Lan, W Zeng US Patent 10,460,456, 2019 | 12 | 2019 |
Reinforced UI instruction grounding: Towards a generic UI task automation API Z Zhang, W Xie, X Zhang, Y Lu arXiv preprint arXiv:2310.04716, 2023 | 8 | 2023 |
Multimodal semi-supervised image classification by combining tag refinement, graph-based learning and support vector regression W Xie, Z Lu, Y Peng, J Xiao 2013 IEEE International Conference on Image Processing, 4307-4311, 2013 | 2 | 2013 |
Slot-VLM: SlowFast Slots for Video-Language Modeling J Xu, C Lan, W Xie, X Chen, Y Lu arXiv preprint arXiv:2402.13088, 2024 | 1 | 2024 |
Retrieval-based Video Language Model for Efficient Long Video Question Answering J Xu, C Lan, W Xie, X Chen, Y Lu arXiv preprint arXiv:2312.04931, 2023 | 1 | 2023 |
Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis T Bi, X Zhang, Z Zhang, W Xie, C Lan, Y Lu, N Zheng Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | | 2024 |
Slot-VLM: Object-Event Slots for Video-Language Modeling J Xu, C Lan, W Xie, X Chen, Y Lu The Thirty-eighth Annual Conference on Neural Information Processing Systems | | |
Supplementary Material: Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis T Bi, X Zhang, Z Zhang, W Xie, C Lan, Y Lu, N Zheng | | |
Supplementary Material: Unifying Layout Generation with a Decoupled Diffusion Model M Hui, Z Zhang, X Zhang, W Xie, Y Wang, Y Lu | | |