Steven Basart
PhD, University of Chicago
The many faces of robustness: A critical analysis of out-of-distribution generalization
D Hendrycks, S Basart, N Mu, S Kadavath, F Wang, E Dorundo, R Desai, ...
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
Natural adversarial examples
D Hendrycks, K Zhao, S Basart, J Steinhardt, D Song
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Measuring massive multitask language understanding
D Hendrycks, C Burns, S Basart, A Zou, M Mazeika, D Song, J Steinhardt
arXiv preprint arXiv:2009.03300, 2020
Measuring mathematical problem solving with the MATH dataset
D Hendrycks, C Burns, S Kadavath, A Arora, S Basart, E Tang, D Song, ...
arXiv preprint arXiv:2103.03874, 2021
Improving and Assessing Anomaly Detectors for Large-Scale Settings
D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...
Measuring coding challenge competence with APPS
D Hendrycks, S Basart, S Kadavath, M Mazeika, A Arora, E Guo, C Burns, ...
arXiv preprint arXiv:2105.09938, 2021
Aligning AI with shared human values
D Hendrycks, C Burns, S Basart, A Critch, J Li, D Song, J Steinhardt
arXiv preprint arXiv:2008.02275, 2020
DIODE: A dense indoor and outdoor depth dataset
I Vasiljevic, N Kolkin, S Zhang, R Luo, H Wang, FZ Dai, AF Daniele, ...
arXiv preprint arXiv:1908.00463, 2019
Testing robustness against unforeseen adversaries
D Kang, Y Sun, D Hendrycks, T Brown, J Steinhardt
Representation engineering: A top-down approach to AI transparency
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
arXiv preprint arXiv:2310.01405, 2023
Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark
A Pan, JS Chan, A Zou, N Li, S Basart, T Woodside, H Zhang, S Emmons, ...
International Conference on Machine Learning, 26837-26867, 2023
How would the viewer feel? Estimating wellbeing from video scenarios
M Mazeika, E Tang, A Zou, S Basart, JS Chan, D Song, D Forsyth, ...
Advances in Neural Information Processing Systems 35, 18571-18585, 2022
A quantitative measure of generative adversarial network distributions
D Hendrycks, S Basart
HarmBench: A standardized evaluation framework for automated red teaming and robust refusal
M Mazeika, L Phan, X Yin, A Zou, Z Wang, N Mu, E Sakhaee, N Li, ...
arXiv preprint arXiv:2402.04249, 2024
Scaling out-of-distribution detection for real-world settings
S Basart, M Mazeika, M Mostajabi, J Steinhardt, D Song
International Conference on Machine Learning, 2022
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ...
arXiv preprint arXiv:2403.03218, 2024
Robustness Evaluation of Proxy Models against Adversarial Optimization
A Zou, L Phan, N Li, JS Chan, M Mazeika, A O'Gara, S Basart, J Ng, ...
Enhancing Neural Network Transparency through Representation Analysis
A Zou, L Phan, SL Chen, J Campbell, PH Guo, R Ren, A Pan, X Yin, ...
Evaluating Robustness to Unforeseen Adversarial Attacks
M Kaufmann, D Kang, Y Sun, X Yin, S Basart, M Mazeika, A Dziedzic, ...
Towards Robustness of Neural Networks
S Basart
The University of Chicago, 2021