Zhuoran Yang
Verified email at yale.edu
Title
Cited by
Year
Multi-agent reinforcement learning: A selective overview of theories and algorithms
K Zhang, Z Yang, T Başar
Handbook of reinforcement learning and control, 321-384, 2021
Cited by 837; year 2021
Fully decentralized multi-agent reinforcement learning with networked agents
K Zhang, Z Yang, H Liu, T Zhang, T Başar
International Conference on Machine Learning, 5872-5881, 2018
Cited by 496; year 2018
Provably efficient reinforcement learning with linear function approximation
C Jin, Z Yang, Z Wang, MI Jordan
Conference on Learning Theory, 2137-2143, 2020
Cited by 495; year 2020
A theoretical analysis of deep Q-learning
J Fan, Z Wang, Y Xie, Z Yang
arXiv preprint arXiv:1901.00137, 2019
Cited by 475*; year 2019
Provably efficient exploration in policy optimization
Q Cai, Z Yang, C Jin, Z Wang
International Conference on Machine Learning, 1283-1294, 2020
Cited by 213; year 2020
Is pessimism provably efficient for offline RL?
Y Jin, Z Yang, Z Wang
International Conference on Machine Learning, 5084-5096, 2021
Cited by 198; year 2021
Neural policy gradient methods: Global optimality and rates of convergence
L Wang, Q Cai, Z Yang, Z Wang
arXiv preprint arXiv:1909.01150, 2019
Cited by 187; year 2019
Multi-agent reinforcement learning via double averaging primal-dual optimization
HT Wai, Z Yang, Z Wang, M Hong
Advances in Neural Information Processing Systems 31, 2018
Cited by 167; year 2018
A two-timescale framework for bilevel optimization: Complexity analysis and application to actor-critic
M Hong, HT Wai, Z Wang, Z Yang
arXiv preprint arXiv:2007.05170, 2020
Cited by 144; year 2020
Provably global convergence of actor-critic: A case for linear quadratic regulator with ergodic cost
Z Yang, Y Chen, M Hong, Z Wang
Advances in Neural Information Processing Systems 32, 2019
Cited by 103; year 2019
Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games
K Zhang, Z Yang, T Başar
Advances in Neural Information Processing Systems 32, 2019
Cited by 102; year 2019
Provably efficient safe exploration via primal-dual policy optimization
D Ding, X Wei, Z Yang, Z Wang, M Jovanovic
International Conference on Artificial Intelligence and Statistics, 3304-3312, 2021
Cited by 101; year 2021
Neural proximal/trust region policy optimization attains globally optimal policy
B Liu, Q Cai, Z Yang, Z Wang
arXiv preprint arXiv:1906.10306, 2019
Cited by 100; year 2019
Learning zero-sum simultaneous-move Markov games using function approximation and correlated equilibrium
Q Xie, Y Chen, Z Wang, Z Yang
Conference on Learning Theory, 3674-3682, 2020
Cited by 96; year 2020
Neural temporal-difference learning converges to global optima
Q Cai, Z Yang, JD Lee, Z Wang
Advances in Neural Information Processing Systems 32, 2019
Cited by 89; year 2019
Networked multi-agent reinforcement learning in continuous spaces
K Zhang, Z Yang, T Başar
2018 IEEE Conference on Decision and Control (CDC), 2771-2776, 2018
Cited by 87; year 2018
Convergent policy optimization for safe reinforcement learning
M Yu, Z Yang, M Kolar, Z Wang
Advances in Neural Information Processing Systems 32, 2019
Cited by 80; year 2019
Sparse nonlinear regression: Parameter estimation and asymptotic inference
Z Yang, Z Wang, H Liu, YC Eldar, T Zhang
arXiv preprint arXiv:1511.04514, 2015
Cited by 75*; year 2015
A near-optimal algorithm for stochastic bilevel optimization via double-momentum
P Khanduri, S Zeng, M Hong, HT Wai, Z Wang, Z Yang
Advances in Neural Information Processing Systems 34, 30271-30283, 2021
Cited by 70; year 2021
Neural trust region/proximal policy optimization attains globally optimal policy
B Liu, Q Cai, Z Yang, Z Wang
Advances in Neural Information Processing Systems 32, 2019
Cited by 69; year 2019
Articles 1–20