The Sketch Engine: ten years on A Kilgarriff, V Baisa, J Bušta, M Jakubíček, V Kovář, J Michelfeit, P Rychlý, ... Lexicography 1 (1), 7-36, 2014 | 4173 | 2014 |
The TenTen corpus family M Jakubíček, A Kilgarriff, V Kovář, P Rychlý, V Suchomel 7th international corpus linguistics conference CL, 125-127, 2013 | 679 | 2013 |
HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation. O Bojar, V Diatka, P Rychlý, P Straňák, V Suchomel, A Tamchyna, ... LREC, 3550-3555, 2014 | 153 | 2014 |
Efficient web crawling for large text corpora V Suchomel, J Pomikálek Proceedings of the seventh Web as Corpus Workshop (WAC7), 39-43, 2012 | 149* | 2012 |
SkELL: Web Interface for English Language Learning. V Baisa, V Suchomel RASLAN, 63-70, 2014 | 114 | 2014 |
Finding terms in corpora for many languages with the Sketch Engine M Jakubíček, A Kilgarriff, V Kovář, P Rychlý, V Suchomel Proceedings of the Demonstrations at the 14th Conference of the European …, 2014 | 78 | 2014 |
arTenTen: Arabic corpus and word sketches T Arts, Y Belinkov, N Habash, A Kilgarriff, V Suchomel Journal of King Saud University-Computer and Information Sciences 26 (4 …, 2014 | 75 | 2014 |
Text Tokenisation Using unitok. J Michelfeit, J Pomikálek, V Suchomel RASLAN, 71-75, 2014 | 57 | 2014 |
csTenTen17, a Recent Czech Web Corpus. V Suchomel RASLAN, 111-123, 2018 | 29 | 2018 |
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0 T Erjavec, M Ogrodniczuk, P Osenova, N Ljubešić, K Simov, V Grigorova, ... CLARIN ERIC, 2021 | 28* | 2021 |
Large corpora for Turkic languages and unsupervised morphological analysis V Baisa, V Suchomel Proceedings of the Eighth conference on International Language Resources and …, 2012 | 27 | 2012 |
Recent Czech Web Corpora. V Suchomel RASLAN, 77-83, 2012 | 24 | 2012 |
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages M Bañón, M Esplà-Gomis, ML Forcada, C García-Romero, T Kuzman, ... 23rd Annual Conference of the European Association for Machine Translation …, 2022 | 19 | 2022 |
Current challenges in web corpus building M Jakubíček, V Kovář, P Rychlý, V Suchomel Proceedings of the 12th Web as Corpus Workshop, 1-4, 2020 | 18 | 2020 |
Better web corpora for corpus linguistics and NLP V Suchomel Masaryk University, 2020 | 15 | 2020 |
Annotated Amharic corpora P Rychlý, V Suchomel Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Brno …, 2016 | 15 | 2016 |
arTenTen: a new, vast corpus for Arabic Y Belinkov, N Habash, A Kilgarriff, N Ordan, R Roth, V Suchomel Proceedings of WACL 20, 2013 | 15 | 2013 |
Terminology extraction for academic Slovene using Sketch Engine D Fišer, V Suchomel, M Jakubíček Tenth Workshop on Recent Advances in Slavonic Natural Language Processing …, 2016 | 14 | 2016 |
Building a 50M Corpus of Tajik Language. G Dovudov, J Pomikálek, V Suchomel, P Smerk RASLAN, 89-95, 2011 | 11 | 2011 |
HindMonoCorp 0.5 O Bojar, V Diatka, P Rychlý, P Straňák, V Suchomel, A Tamchyna, ... Charles University, Faculty of Mathematics and Physics, Institute of Formal …, 2014 | 9 | 2014 |