Deyao Zhu
Research Scientist, ByteDance
Verified email at bytedance.com
Title
Cited by
Year
MiniGPT-4: Enhancing vision-language understanding with advanced large language models
D Zhu, J Chen, X Shen, X Li, M Elhoseiny
International Conference on Learning Representations 2024, 2023
Cited by: 1409, Year: 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
J Chen, D Zhu, X Shen, X Li, Z Liu, P Zhang, R Krishnamoorthi, ...
2nd MMFM Workshop in CVPR2024, 2023
Cited by: 270, Year: 2023
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions
D Zhu, J Chen, K Haydarov, X Shen, W Zhang, M Elhoseiny
Transactions on Machine Learning Research (TMLR), 2023
Cited by: 73, Year: 2023
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation
A Mohamed, D Zhu, W Vu, M Elhoseiny, C Claudel
European Conference on Computer Vision (ECCV) 2022, 2022
Cited by: 51, Year: 2022
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions
J Chen, D Zhu, K Haydarov, X Li, M Elhoseiny
arXiv preprint arXiv:2304.04227, 2023
Cited by: 26, Year: 2023
Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only
J Chen, D Zhu, G Qian, B Ghanem, Z Yan, C Zhu, F Xiao, SC Culatana, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision, 699-710, 2023
Cited by: 23*, Year: 2023
Motion forecasting with unlikelihood training in continuous space
D Zhu, M Zahran, LE Li, M Elhoseiny
Conference on Robot Learning, 1003-1012, 2022
Cited by: 15, Year: 2022
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition
J Chen, A Agarwal, S Abdelkarim, D Zhu, M Elhoseiny
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 15*, Year: 2022
MiniGPT4-Video: Advancing multimodal LLMs for video understanding with interleaved visual-textual tokens
K Ataallah, X Shen, E Abdelrahman, E Sleiman, D Zhu, J Ding, ...
2nd MMFM Workshop in CVPR2024, 2024
Cited by: 9, Year: 2024
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents
D Zhu, M Zahran, LE Li, M Elhoseiny
International Conference on Learning Representations, 2021, 2021
Cited by: 7, Year: 2021
Guiding Online Reinforcement Learning with Action-Free Offline Pretraining
D Zhu, Y Wang, J Schmidhuber, M Elhoseiny
arXiv preprint arXiv:2301.12876, 2023
Cited by: 6, Year: 2023
Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning
D Zhu, LE Li, M Elhoseiny
International Conference on Learning Representations 2023, 2022
Cited by: 5, Year: 2022
Learning to disentangle latent physical factors for video prediction
D Zhu, M Munderloh, B Rosenhahn, J Stückler
Pattern Recognition: 41st DAGM German Conference, DAGM GCPR 2019, Dortmund …, 2019
Cited by: 4, Year: 2019
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos
K Ataallah, X Shen, E Abdelrahman, E Sleiman, M Zhuge, J Ding, D Zhu, ...
European Conference on Computer Vision (ECCV) 2024, 2024
Year: 2024
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis
A Alkhaldi, R Alnajim, L Alabdullatef, R Alyahya, J Chen, D Zhu, A Alsinan, ...
arXiv preprint arXiv:2407.04106, 2024
Year: 2024