Róbert Csordás
Title · Cited by · Year
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
R Csordás, K Irie, J Schmidhuber
EMNLP, 2021
125 · 2021
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks
R Csordás, S van Steenkiste, J Schmidhuber
International Conference on Learning Representations (ICLR), 2021
98 · 2021
Randomized Positional Encodings Boost Length Generalization of Transformers
A Ruoss, G Delétang, T Genewein, J Grau-Moya, R Csordás, M Bennani, ...
arXiv preprint arXiv:2305.16843, 2023
68 · 2023
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
K Irie, I Schlag, R Csordás, J Schmidhuber
Conference on Neural Information Processing Systems (NeurIPS), 2021
67 · 2021
A Generalist Neural Algorithmic Learner
B Ibarz, V Kurin, G Papamakarios, K Nikiforou, M Bennani, R Csordás, ...
Learning on Graphs Conference, 2:1-2:23, 2022
65 · 2022
Mindstorms in Natural Language-Based Societies of Mind
M Zhuge, H Liu, F Faccio, DR Ashley, R Csordás, A Gopalakrishnan, ...
arXiv preprint arXiv:2305.17066, 2023
59 · 2023
The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
R Csordás, K Irie, J Schmidhuber
International Conference on Learning Representations (ICLR), 2021
56 · 2021
Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control
R Csordás, J Schmidhuber
International Conference on Learning Representations (ICLR), 2019
45 · 2019
A Modern Self-Referential Weight Matrix That Learns to Modify Itself
K Irie, I Schlag, R Csordás, J Schmidhuber
Deep RL Workshop NeurIPS 2021, 2021
37 · 2021
The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention
K Irie*, R Csordás*, J Schmidhuber
International Conference on Machine Learning (ICML), 2022
33 · 2022
Method and apparatus for generating a displacement map of an input dataset pair
R Csordás, Á Kis-Benedek, B Szalkai
US Patent 10,380,753, 2019
31 · 2019
Approximating Two-Layer Feedforward Networks for Efficient Transformers
R Csordás, K Irie, J Schmidhuber
arXiv preprint arXiv:2310.10837, 2023
15 · 2023
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
R Csordás, K Irie, J Schmidhuber
EMNLP, 2022
12 · 2022
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
R Csordás, P Piękos, K Irie, J Schmidhuber
arXiv preprint arXiv:2312.07987, 2023
6 · 2023
Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
R Csordás, C Potts, CD Manning, A Geiger
arXiv preprint arXiv:2408.10920, 2024
3 · 2024
Automating Continual Learning
K Irie, R Csordás, J Schmidhuber
2 · 2023
Improving Baselines in the Wild
K Irie, I Schlag, R Csordás, J Schmidhuber
NeurIPS 2021 Workshop on Distribution Shifts: Connecting Methods and …, 2021
2 · 2021
Self-organising Neural Discrete Representation Learning à la Kohonen
K Irie, R Csordás, J Schmidhuber
International Conference on Artificial Neural Networks, 343-362, 2024
1* · 2024
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
J Kallini, S Murty, CD Manning, C Potts, R Csordás
arXiv preprint arXiv:2410.20771, 2024
2024
MoEUT: Mixture-of-Experts Universal Transformers
R Csordás, K Irie, J Schmidhuber, C Potts, CD Manning
arXiv preprint arXiv:2405.16039, 2024
2024