Shixiang Shane Gu
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Categorical reparameterization with Gumbel-Softmax
E Jang, S Gu, B Poole
arXiv preprint arXiv:1611.01144, 2016
Cited by: 5603; Year: 2016
Large language models are zero-shot reasoners
T Kojima, SS Gu, M Reid, Y Matsuo, Y Iwasawa
Advances in neural information processing systems 35, 22199-22213, 2022
Cited by: 1817; Year: 2022
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
S Gu, E Holly, T Lillicrap, S Levine
2017 IEEE international conference on robotics and automation (ICRA), 3389-3396, 2017
Cited by: 1817; Year: 2017
Scaling instruction-finetuned language models
HW Chung, L Hou, S Longpre, B Zoph, Y Tay, W Fedus, Y Li, X Wang, ...
Journal of Machine Learning Research 25 (70), 1-53, 2024
Cited by: 1514; Year: 2024
Continuous deep Q-learning with model-based acceleration
S Gu, T Lillicrap, I Sutskever, S Levine
International conference on machine learning, 2829-2838, 2016
Cited by: 1212; Year: 2016
Towards deep neural network architectures robust to adversarial examples
S Gu, L Rigazio
arXiv preprint arXiv:1412.5068, 2014
Cited by: 986; Year: 2014
Data-efficient hierarchical reinforcement learning
O Nachum, SS Gu, H Lee, S Levine
Advances in neural information processing systems 31, 2018
Cited by: 898; Year: 2018
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...
arXiv preprint arXiv:2206.04615, 2022
Cited by: 720; Year: 2022
GPT-4 technical report
J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ...
arXiv preprint arXiv:2303.08774, 2023
Cited by: 683; Year: 2023
A minimalist approach to offline reinforcement learning
S Fujimoto, SS Gu
Advances in neural information processing systems 34, 20132-20145, 2021
Cited by: 537; Year: 2021
Gemini: a family of highly capable multimodal models
Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 407; Year: 2023
Dynamics-aware unsupervised discovery of skills
A Sharma, S Gu, S Levine, V Kumar, K Hausman
arXiv preprint arXiv:1907.01657, 2019
Cited by: 403; Year: 2019
Q-prop: Sample-efficient policy gradient with an off-policy critic
S Gu, T Lillicrap, Z Ghahramani, RE Turner, S Levine
arXiv preprint arXiv:1611.02247, 2016
Cited by: 389; Year: 2016
Human-centric dialog training via offline reinforcement learning
N Jaques, JH Shen, A Ghandeharioun, C Ferguson, A Lapedriza, ...
arXiv preprint arXiv:2010.05848, 2020
Cited by: 361*; Year: 2020
Temporal difference models: Model-free deep RL for model-based control
V Pong, S Gu, M Dalal, S Levine
arXiv preprint arXiv:1802.09081, 2018
Cited by: 280; Year: 2018
A divergence minimization perspective on imitation learning methods
SKS Ghasemipour, R Zemel, S Gu
Conference on robot learning, 1259-1277, 2020
Cited by: 254; Year: 2020
Sequence tutor: Conservative fine-tuning of sequence generation models with KL-control
N Jaques, S Gu, D Bahdanau, JM Hernández-Lobato, RE Turner, D Eck
International Conference on Machine Learning, 1645-1654, 2017
Cited by: 250*; Year: 2017
Large language models can self-improve
J Huang, SS Gu, L Hou, Y Wu, X Wang, H Yu, J Han
arXiv preprint arXiv:2210.11610, 2022
Cited by: 240; Year: 2022
Near-optimal representation learning for hierarchical reinforcement learning
O Nachum, S Gu, H Lee, S Levine
arXiv preprint arXiv:1810.01257, 2018
Cited by: 218; Year: 2018