Semi-Automatic Enrichment of Crowdsourced Synonymy Networks: The WISIGOTH system applied to Wiktionary

Franck Sajous, Emmanuel Navarro, Bruno Gaume, Laurent Prévot, and Yannick Chudy
2013 (published online 5 November 2011)
Language Resources and Evaluation 47(1), pp. 63-96
Editors: Iryna Gurevych and Torsten Zesch
DOI: 10.1007/s10579-011-9168-6

Citation: F. Sajous, E. Navarro, B. Gaume, L. Prévot and Y. Chudy (2011). Semi-Automatic Enrichment of Crowdsourced Synonymy Networks: The WISIGOTH system applied to Wiktionary. Language Resources and Evaluation, 47(1), pp. 63-96.

Keywords: Synonymy Networks, Semantic Relatedness, Collaboratively Constructed Resources, Wiktionary, Semi-Automatic Enrichment, Random Walks, Small Worlds

Abstract: Semantic lexical resources are a mainstay of various Natural Language Processing applications. However, comprehensive and reliable resources are rare and often not freely available. Handcrafted resources are too costly to be a general solution, while automatically built resources need to be validated by experts or at least thoroughly evaluated. In this paper, we give a picture of the current situation with regard to lexical resources, their construction and their evaluation. We give an in-depth description of Wiktionary, a freely available and collaboratively built multilingual dictionary, which we present as a promising raw resource for NLP. We propose a semi-automatic approach based on random walks for enriching Wiktionary's synonymy network, using both endogenous and exogenous data. We take advantage of the wiki infrastructure to propose a validation “by crowds”. Finally, we present an implementation called WISIGOTH, which supports our approach.
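
To make the random-walk idea in the abstract concrete, here is a minimal Python sketch of how a short random walk over a synonymy graph can surface candidate synonyms for later human validation. It is not the WISIGOTH implementation: the toy graph, the function names `walk_distribution` and `candidate_synonyms`, and the choice of a plain two-step walk over synonymy edges only are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy undirected synonymy graph (hypothetical data, not taken from Wiktionary).
edges = [
    ("big", "large"), ("large", "huge"), ("huge", "enormous"),
    ("big", "great"), ("small", "tiny"),
]
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def walk_distribution(graph, start, steps):
    """Probability of standing on each node after `steps` uniform random steps from `start`."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, p in dist.items():
            neighbours = graph[node]
            for nb in neighbours:
                # Each neighbour receives an equal share of the current mass.
                nxt[nb] += p / len(neighbours)
        dist = dict(nxt)
    return dist


def candidate_synonyms(graph, word, steps=2, top=3):
    """Rank words reachable by the walk that are not already listed as synonyms of `word`."""
    dist = walk_distribution(graph, word, steps)
    known = graph[word] | {word}
    ranked = sorted((-p, w) for w, p in dist.items() if w not in known)
    return [w for _, w in ranked[:top]]


print(candidate_synonyms(graph, "big"))
```

In a semi-automatic setting, the ranked list would only be a set of suggestions: a contributor decides which candidates are actually added to the resource, which is where the validation "by crowds" comes in.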