Stochastic variance-reduced policy gradient M Papini, D Binaghi, G Canonaco, M Pirotta, M Restelli International Conference on Machine Learning, 4026-4035, 2018 | 166 | 2018 |
Frequentist regret bounds for randomized least-squares value iteration A Zanette, D Brandfonbrener, E Brunskill, M Pirotta, A Lazaric International Conference on Artificial Intelligence and Statistics, 1954-1964, 2020 | 130 | 2020 |
Exploration-exploitation in constrained MDPs Y Efroni, S Mannor, M Pirotta arXiv preprint arXiv:2003.02189, 2020 | 118 | 2020 |
Safe policy iteration M Pirotta, M Restelli, A Pecorino, D Calandriello International Conference on Machine Learning, 307-315, 2013 | 111 | 2013 |
Efficient bias-span-constrained exploration-exploitation in reinforcement learning R Fruit, M Pirotta, A Lazaric, R Ortner International Conference on Machine Learning, 1578-1586, 2018 | 104 | 2018 |
Policy gradient in Lipschitz Markov decision processes M Pirotta, M Restelli, L Bascetta Machine Learning 100, 255-283, 2015 | 95 | 2015 |
Adaptive step-size for policy gradient methods M Pirotta, M Restelli, L Bascetta Advances in Neural Information Processing Systems 26, 2013 | 82 | 2013 |
Multi-objective reinforcement learning with continuous Pareto frontier approximation M Pirotta, S Parisi, M Restelli Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015 | 70 | 2015 |
Policy gradient approaches for multi-objective sequential decision making S Parisi, M Pirotta, N Smacchia, L Bascetta, M Restelli 2014 International Joint Conference on Neural Networks (IJCNN), 2323-2330, 2014 | 64 | 2014 |
Multi-objective reinforcement learning through continuous Pareto manifold approximation S Parisi, M Pirotta, M Restelli Journal of Artificial Intelligence Research 57, 187-227, 2016 | 55 | 2016 |
Importance weighted transfer of samples in reinforcement learning A Tirinzoni, A Sessa, M Pirotta, M Restelli International Conference on Machine Learning, 4936-4945, 2018 | 54 | 2018 |
Inverse reinforcement learning through policy gradient minimization M Pirotta, M Restelli Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016 | 52 | 2016 |
Adversarial attacks on linear contextual bandits E Garcelon, B Roziere, L Meunier, J Tarbouriech, O Teytaud, A Lazaric, ... Advances in Neural Information Processing Systems 33, 14362-14373, 2020 | 48 | 2020 |
Near optimal exploration-exploitation in non-communicating Markov decision processes R Fruit, M Pirotta, A Lazaric Advances in Neural Information Processing Systems 31, 2018 | 46 | 2018 |
Regret bounds for kernel-based reinforcement learning OD Domingues, P Ménard, M Pirotta, E Kaufmann, M Valko arXiv preprint arXiv:2004.05599, 2020 | 43* | 2020 |
Boosted fitted Q-iteration S Tosatto, M Pirotta, C D’Eramo, M Restelli International Conference on Machine Learning, 3434-3443, 2017 | 43 | 2017 |
Adaptive batch size for safe policy gradients M Papini, M Pirotta, M Restelli Advances in Neural Information Processing Systems 30, 2017 | 43 | 2017 |
Manifold-based multi-objective policy search with sample reuse S Parisi, M Pirotta, J Peters Neurocomputing 263, 3-14, 2017 | 42 | 2017 |
Compatible reward inverse reinforcement learning AM Metelli, M Pirotta, M Restelli Advances in Neural Information Processing Systems 30, 2017 | 42 | 2017 |
No-regret exploration in goal-oriented reinforcement learning J Tarbouriech, E Garcelon, M Valko, M Pirotta, A Lazaric International Conference on Machine Learning, 9428-9437, 2020 | 40 | 2020 |