From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource

Basilio Calderone, Nabil Hathout and Franck Sajous (2014)
In: Proceedings of the 16th EURALEX International Congress, Bolzano, Italy, pp. 431-446. Edited by Andrea Abel, Chiara Vettori and Natascia Ralli. EURAC research. ISBN 978-88-88906-97-3.

Citation: F. Sajous, N. Hathout and B. Calderone (2014). From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource. Proceedings of the 16th EURALEX International Congress, pp. 431-446, Bolzano, Italy.

Keywords: French lexicon, lexical resource for psycholinguistic studies, Wiktionary

Abstract: In this paper, we present two French lexical resources, GLÀFF and PsychoGLÀFF. The former, automatically extracted from the collaborative online dictionary Wiktionary, is a large-scale, versatile lexicon that can be used in Natural Language Processing applications and linguistic studies. The latter, based on GLÀFF, is a lexicon specifically designed for psycholinguistic research. With more than 1.4 million entries, GLÀFF is of unprecedented size. It provides lemmas, main syntactic categories, inflectional features and phonemic transcriptions. PsychoGLÀFF contains additional information related to formal aspects of the lexicon and its distribution. It contains about 340,000 entries (120,000 lemmas), all of which are attested in corpora. We explain how the resources were created and compare them to other known resources in terms of coverage and quality. For PsychoGLÀFF, the comparison shows that it offers an exceptionally large repertoire while maintaining comparable quality.