Vision-language pre-training with triple contrastive learning. J Yang, J Duan, S Tran, Y Xu, S Chanda, L Chen, B Zeng, T Chilimbi, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Cited by 207.
Multi-modal alignment using representation codebook. J Duan, L Chen, S Tran, J Yang, Y Xu, B Zeng, T Chilimbi. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022. Cited by 46.
CTR-BERT: Cost-effective knowledge distillation for billion-parameter teacher models. A Muhamed, I Keivanloo, S Perera, J Mracek, Y Xu, Q Cui, S Rajagopalan, ... NeurIPS Efficient Natural Language and Speech Processing Workshop, 2021. Cited by 32.
Simpler, faster, stronger: Breaking the log-K curse on contrastive learners with FlatNCE. J Chen, Z Gan, X Li, Q Guo, L Chen, S Gao, T Chung, Y Xu, B Zeng, W Lu, ... arXiv preprint arXiv:2107.01152, 2021. Cited by 21.
Why do we need large batch sizes in contrastive learning? A gradient-bias perspective. C Chen, J Zhang, Y Xu, L Chen, J Duan, Y Chen, S Tran, B Zeng, ... Advances in Neural Information Processing Systems 35, 33860-33875, 2022. Cited by 15.
Top-down attention in end-to-end spoken language understanding. Y Chen, W Lu, A Mottini, LE Li, J Droppo, Z Du, B Zeng. ICASSP 2021 IEEE International Conference on Acoustics, Speech and Signal Processing, 2021. Cited by 11.
Understanding and constructing latent modality structures in multi-modal representation learning. Q Jiang, C Chen, H Zhao, L Chen, Q Ping, SD Tran, Y Xu, B Zeng, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. Cited by 10.
Efficient and effective training of language and graph neural network models. VN Ioannidis, X Song, D Zheng, H Zhang, J Ma, Y Xu, B Zeng, T Chilimbi, ... arXiv preprint arXiv:2206.10781, 2022. Cited by 10.
Magic pyramid: Accelerating inference with early exiting and token pruning. X He, I Keivanloo, Y Xu, X He, B Zeng, S Rajagopalan, T Chilimbi. arXiv preprint arXiv:2111.00230, 2021. Cited by 9.
Graph-aware language model pre-training on a large graph corpus can help multiple graph applications. H Xie, D Zheng, J Ma, H Zhang, VN Ioannidis, X Song, Q Ping, S Wang, ... Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023. Cited by 7.
MLIM: Vision-and-language model pre-training with masked language and image modeling. T Arici, MS Seyfioglu, T Neiman, Y Xu, S Tran, T Chilimbi, B Zeng, I Tutar. arXiv preprint arXiv:2109.12178, 2021. Cited by 6.
Web-scale semantic product search with large language models. A Muhamed, S Srinivasan, CH Teo, Q Cui, B Zeng, T Chilimbi, ... Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 73-85, 2023. Cited by 4.
Semantic aligned multi-modal transformer for vision-language understanding: A preliminary study on visual QA. H Ding, E Li, Z Hu, Y Xu, D Hakkani-Tür, Z Du, B Zeng. 2021. Cited by 3.
ReAugKD: Retrieval-augmented knowledge distillation for pre-trained language models. J Zhang, A Muhamed, A Anantharaman, G Wang, C Chen, K Zhong, ... Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023. Cited by 2.
OssCSE: Overcoming Surface Structure Bias in Contrastive Learning for Unsupervised Sentence Embedding. Z Shi, G Wang, K Bai, J Li, X Li, Q Cui, B Zeng, T Chilimbi, X Zhu. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023. Cited by 1.
DCAF-BERT: A Distilled Cachable Adaptable Factorized Model For Improved Ads CTR Prediction. A Muhamed, J Singh, S Zheng, I Keivanloo, S Perera, J Mracek, Y Xu, ... Companion Proceedings of the Web Conference 2022, 110-115, 2022. Cited by 1.
VidLA: Video-Language Alignment at Scale. MN Rizve, F Fei, J Unnikrishnan, S Tran, BZ Yao, B Zeng, M Shah, ... arXiv preprint arXiv:2403.14870, 2024.
Robust Multi-Task Learning with Excess Risks. Y He, S Zhou, G Zhang, H Yun, Y Xu, B Zeng, T Chilimbi, H Zhao. arXiv preprint arXiv:2402.02009, 2024.
Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective. Y Xing, X Lin, Q Song, Y Xu, B Zeng, G Cheng. arXiv preprint arXiv:2401.15248, 2024.