Title | Authors | Venue | Cited by | Year |
Llama: Open and efficient foundation language models | H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... | arXiv preprint arXiv:2302.13971, 2023 | 5802 | 2023 |
Llama 2: Open foundation and fine-tuned chat models | H Touvron, L Martin, K Stone, P Albert, A Almahairi, Y Babaei, ... | arXiv preprint arXiv:2307.09288, 2023 | 4433 | 2023 |
Poly-encoders: Transformer architectures and pre-training strategies for fast and accurate multi-sentence scoring | S Humeau, K Shuster, MA Lachaux, J Weston | arXiv preprint arXiv:1905.01969, 2019 | 522 | 2019 |
CCNet: Extracting high quality monolingual datasets from web crawl data | G Wenzek, MA Lachaux, A Conneau, V Chaudhary, F Guzmán, A Joulin, ... | arXiv preprint arXiv:1911.00359, 2019 | 488 | 2019 |
Unsupervised translation of programming languages | MA Lachaux, B Roziere, L Chanussot, G Lample | arXiv preprint arXiv:2006.03511, 2020 | 333* | 2020 |
Mistral 7B | AQ Jiang, A Sablayrolles, A Mensch, C Bamford, DS Chaplot, D Casas, ... | arXiv preprint arXiv:2310.06825, 2023 | 183 | 2023 |
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages | MA Lachaux, B Roziere, M Szafraniec, G Lample | Advances in Neural Information Processing Systems 34, 2021 | 120* | 2021 |
Mixtral of experts | AQ Jiang, A Sablayrolles, A Roux, A Mensch, B Savary, C Bamford, ... | arXiv preprint arXiv:2401.04088, 2024 | 73 | 2024 |
Hypertree proof search for neural theorem proving | G Lample, T Lacroix, MA Lachaux, A Rodriguez, A Hayat, T Lavril, ... | Advances in Neural Information Processing Systems 35, 26337-26349, 2022 | 69 | 2022 |
LLaMA: open and efficient foundation language models. arXiv | H Touvron, T Lavril, G Izacard, X Martinet, MA Lachaux, T Lacroix, ... | arXiv preprint arXiv:2302.13971, 2023 | 53 | 2023 |
Target conditioning for one-to-many generation | MA Lachaux, A Joulin, G Lample | arXiv preprint arXiv:2009.09758, 2020 | 12 | 2020 |