Stanislav Fort

Cited by

	All	Since 2019
Citations	4671	4639
h-index	21	21
i10-index	23	23

Citations per year
Year	Citations
2018	14
2019	36
2020	154
2021	336
2022	611
2023	1661
2024	1830

Public access

5 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Balaji Lakshminarayanan	Senior Staff Research Scientist at Google DeepMind	Verified email at google.com
Surya Ganguli	Associate Professor, Stanford University	Verified email at stanford.edu
Clara Huiyi Hu	Google DeepMind	Verified email at google.com
Stanisław Jastrzębski	Chief Technology Officer & Chief Scientist @ Molecule.One	Verified email at molecule.one
Jie Ren	Research Scientist at Google Brain	Verified email at google.com
Jeremiah Zhe Liu	Google Research and Harvard University	Verified email at mail.harvard.edu
Dustin Tran	Research Scientist, Google	Verified email at google.com
Daniel M. Roy	Research Director, Vector Institute; Prof., U. Toronto (Statistics, CS)	Verified email at utoronto.ca
Gintare Karolina Dziugaite	Google DeepMind	Verified email at google.com
Srini Narayanan	UC Berkeley and Google	Verified email at icsi.berkeley.edu
Hui Khoon Ng	Assoc Prof, Yale-NUS College, and Centre for Quantum Technologies, National University of Singapore	Verified email at nus.edu.sg
Yihui Quek	Massachusetts Institute of Technology	Verified email at mit.edu
Dan Wilkins	Research Scientist, Stanford University	Verified email at stanford.edu
Jared Kaplan	Johns Hopkins University & Anthropic	Verified email at pha.jhu.edu
Christopher Olah	Anthropic	Verified email at google.com

Stanislav Fort

Google DeepMind

Verified email at stanford.edu

machine learning, artificial intelligence, AI safety


Title	Cited by	Year
Training a helpful and harmless assistant with reinforcement learning from human feedback. Y Bai, A Jones, K Ndousse, A Askell, A Chen, N DasSarma, D Drain, S Fort, D Ganguli, T Henighan, et al. arXiv preprint arXiv:2204.05862 1, 2022	979*	2022
Constitutional AI: Harmlessness from AI Feedback. Y Bai, S Kadavath, S Kundu, A Askell, J Kernion, A Jones, A Chen, ... arXiv preprint arXiv:2212.08073, 2022	779	2022
Deep Ensembles: A Loss Landscape Perspective. S Fort, H Hu, B Lakshminarayanan. arXiv preprint arXiv:1912.02757, 2019	620	2019
Exploring the limits of out-of-distribution detection. S Fort, J Ren, B Lakshminarayanan. Advances in Neural Information Processing Systems 34, 7068-7081, 2021	306	2021
Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned. D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai, S Kadavath, B Mann, ... arXiv preprint arXiv:2209.07858, 2022	300	2022
Predictability and surprise in large generative models. D Ganguli, D Hernandez, L Lovitt, A Askell, Y Bai, A Chen, T Conerly, ... Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022	221	2022
Training independent subnetworks for robust prediction. M Havasi, R Jenatton, S Fort, JZ Liu, J Snoek, B Lakshminarayanan, ... arXiv preprint arXiv:2010.06610, 2020	200	2020
Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the neural tangent kernel. S Fort, GK Dziugaite, M Paul, S Kharaghani, DM Roy, S Ganguli. Advances in Neural Information Processing Systems 33, 5850-5861, 2020	163	2020
A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection. J Ren, S Fort, J Liu, AG Roy, S Padhy, B Lakshminarayanan. arXiv preprint arXiv:2106.09022, 2021	160	2021
The Break-Even Point on Optimization Trajectories of Deep Neural Networks. S Jastrzebski, M Szymczak, S Fort, D Arpit, J Tabor, K Cho, K Geras. arXiv preprint arXiv:2002.09572, 2020	158	2020
Language models (mostly) know what they know. S Kadavath, T Conerly, A Askell, T Henighan, D Drain, E Perez, ... arXiv preprint arXiv:2207.05221, 2022	108	2022
Gaussian Prototypical Networks for Few-Shot Learning on Omniglot. S Fort. arXiv preprint arXiv:1708.02735, 2017	98	2017
Large Scale Structure of Neural Network Loss Landscapes. S Fort, S Jastrzebski. arXiv preprint arXiv:1906.04724, 2019	84	2019
Stiffness: A new perspective on generalization in neural networks. S Fort, PK Nowak, S Jastrzebski, S Narayanan. arXiv preprint arXiv:1901.09491, 2019	82	2019
Adaptive quantum state tomography with neural networks. Y Quek, S Fort, HK Ng. arXiv preprint arXiv:1812.06693, 2018	64	2018
Measuring progress on scalable oversight for large language models. SR Bowman, J Hyun, E Perez, E Chen, C Pettit, S Heiner, K Lukošiūtė, ... arXiv preprint arXiv:2211.03540, 2022	58	2022
Discovery of gamma-ray pulsations from the transitional redback PSR J1227-4853. TJ Johnson, PS Ray, J Roy, CC Cheung, AK Harding, HJ Pletsch, S Fort, ... The Astrophysical Journal 806 (1), 91, 2015	58	2015
The goldilocks zone: Towards better understanding of neural network loss landscapes. S Fort, A Scherlis. Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 3574-3581, 2019	44	2019
Emergent properties of the local geometry of neural loss landscapes. S Fort, S Ganguli. arXiv preprint arXiv:1910.05929, 2019	42	2019
Analyzing monotonic linear interpolation in neural network loss landscapes. J Lucas, J Bae, MR Zhang, S Fort, R Zemel, R Grosse. arXiv preprint arXiv:2104.11044, 2021	34*	2021
