Yiheng Xu
Verified email at cs.hku.hk
Title
Cited by
Year
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Y Xu, M Li, L Cui, S Huang, F Wei, M Zhou
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Cited by 805, 2020
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Y Xu, Y Xu, T Lv, L Cui, F Wei, G Wang, Y Lu, D Florencio, C Zhang, ...
Proceedings of the 59th Annual Meeting of the Association for Computational …, 2020
Cited by 524, 2020
DocBank: A Benchmark Dataset for Document Layout Analysis
M Li, Y Xu, L Cui, S Huang, F Wei, Z Li, M Zhou
Proceedings of the 28th International Conference on Computational …, 2020
Cited by 207, 2020
Graph Convolutional Networks with Markov Random Field Reasoning for Social Spammer Detection
Y Wu, D Lian, Y Xu, L Wu, E Chen
Proceedings of the AAAI Conference on Artificial Intelligence 34 (01), 1054-1061, 2020
Cited by 205, 2020
DiT: Self-Supervised Pre-training for Document Image Transformer
J Li, Y Xu, T Lv, L Cui, C Zhang, F Wei
Proceedings of the 30th ACM International Conference on Multimedia, 2022
Cited by 156, 2022
LayoutXLM: Multimodal Pre-training for Multilingual Visually-Rich Document Understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
arXiv preprint arXiv:2104.08836, 2021
Cited by 127, 2021
Document AI: Benchmarks, Models and Applications
L Cui, Y Xu, T Lv, F Wei
arXiv preprint arXiv:2111.08609, 2021
Cited by 82, 2021
XFUND: a benchmark dataset for multilingual visually rich form understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
Findings of the Association for Computational Linguistics: ACL 2022, 3214-3224, 2022
Cited by 64, 2022
LayoutReader: Pre-training of Text and Layout for Reading Order Detection
Z Wang, Y Xu, L Cui, J Shang, F Wei
Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021
Cited by 64, 2021
MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding
J Li, Y Xu, L Cui, F Wei
Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022
Cited by 56, 2022
OpenAgents: An open platform for language agents in the wild
T Xie, F Zhou, Z Cheng, P Shi, L Weng, Y Liu, TJ Hua, J Zhao, Q Liu, C Liu, ...
arXiv preprint arXiv:2310.10634, 2023
Cited by 54, 2023
Lemur: Harmonizing natural language and code for language agents
Y Xu, H Su, C Xing, B Mi, Q Liu, W Shi, B Hui, F Zhou, Y Liu, T Xie, ...
The Twelfth International Conference on Learning Representations (ICLR 2024), 2024
Cited by 53*, 2024
OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments
T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, TJ Hua, Z Cheng, D Shin, ...
arXiv preprint arXiv:2404.07972, 2024
Cited by 45, 2024
In-context learning with many demonstration examples
M Li, S Gong, J Feng, Y Xu, J Zhang, Z Wu, L Kong
arXiv preprint arXiv:2302.04931, 2023
Cited by 17, 2023
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Y Xu, Z Wang, J Wang, D Lu, T Xie, A Saha, D Sahoo, T Yu, C Xiong
arXiv preprint arXiv:2412.04454, 2024
2024
Reading order detection in a document
L Cui, XU Yiheng, Y Xu, F Wei, Z Wang
US Patent App. 18/563,002, 2024
2024