Zachary Nado
Google Brain
Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift
Y Ovadia, E Fertig, J Ren, Z Nado, D Sculley, S Nowozin, J Dillon, ...
Advances in neural information processing systems 32, 2019
PaLM 2 technical report
R Anil, AM Dai, O Firat, M Johnson, D Lepikhin, A Passos, S Shakeri, ...
arXiv preprint arXiv:2305.10403, 2023
Gemini: a family of highly capable multimodal models
Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Underspecification presents challenges for credibility in modern machine learning
A D'Amour, K Heller, D Moldovan, B Adlam, B Alipanahi, A Beutel, ...
Journal of Machine Learning Research 23 (226), 1-61, 2022
On empirical comparisons of optimizers for deep learning
D Choi, CJ Shallue, Z Nado, J Lee, CJ Maddison, GE Dahl
arXiv preprint arXiv:1910.05446, 2019
Evaluating prediction-time batch normalization for robustness under covariate shift
Z Nado, S Padhy, D Sculley, A D'Amour, B Lakshminarayanan, J Snoek
arXiv preprint arXiv:2006.10963, 2020
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...
arXiv preprint arXiv:2403.05530, 2024
Which algorithmic choices matter at which batch sizes? Insights from a noisy quadratic model
G Zhang, L Li, Z Nado, J Martens, S Sachdeva, G Dahl, C Shallue, ...
Advances in neural information processing systems 32, 2019
Uncertainty baselines: Benchmarks for uncertainty & robustness in deep learning
Z Nado, N Band, M Collier, J Djolonga, MW Dusenberry, S Farquhar, ...
arXiv preprint arXiv:2106.04015, 2021
Plex: Towards reliability using pretrained large model extensions
D Tran, J Liu, MW Dusenberry, D Phan, M Collier, J Ren, K Han, Z Wang, ...
arXiv preprint arXiv:2207.07411, 2022
A loss curvature perspective on training instabilities of deep learning models
J Gilmer, B Ghorbani, A Garg, S Kudugunta, B Neyshabur, D Cardoze, ...
International Conference on Learning Representations, 2022
Benchmarking Bayesian deep learning on diabetic retinopathy detection tasks
N Band, TGJ Rudner, Q Feng, A Filos, Z Nado, MW Dusenberry, G Jerfel, ...
arXiv preprint arXiv:2211.12717, 2022
AutoGraph: Imperative-style Coding with Graph-based Performance
D Moldovan, J Decker, F Wang, A Johnson, B Lee, Z Nado, D Sculley, ...
Proceedings of Machine Learning and Systems 1, 389-405, 2019
Adaptive gradient methods at the edge of stability
JM Cohen, B Ghorbani, S Krishnan, N Agarwal, S Medapati, M Badura, ...
arXiv preprint arXiv:2207.14484, 2022
Revisiting one-vs-all classifiers for predictive uncertainty and out-of-distribution detection in neural networks
S Padhy, Z Nado, J Ren, J Liu, J Snoek, B Lakshminarayanan
arXiv preprint arXiv:2007.05134, 2020
A large batch optimizer reality check: Traditional, generic optimizers suffice across batch sizes
Z Nado, JM Gilmer, CJ Shallue, R Anil, GE Dahl
arXiv preprint arXiv:2102.06356, 2021
A simple approach to improve single-model deep uncertainty via distance-awareness
JZ Liu, S Padhy, J Ren, Z Lin, Y Wen, G Jerfel, Z Nado, J Snoek, D Tran, ...
Journal of Machine Learning Research 24 (42), 1-63, 2023
Pre-trained Gaussian processes for Bayesian optimization
Z Wang, GE Dahl, K Swersky, C Lee, Z Nado, J Gilmer, J Snoek, ...
arXiv preprint arXiv:2109.08215, 2021
Stochastic gradient Langevin dynamics that exploit neural network structure
Z Nado, J Snoek, R Grosse, D Duvenaud, B Xu, J Martens
Benchmarking neural network training algorithms
GE Dahl, F Schneider, Z Nado, N Agarwal, CS Sastry, P Hennig, ...
arXiv preprint arXiv:2306.07179, 2023