Semi-Automatic Enrichment of Crowdsourced Synonymy Networks: The WISIGOTH system applied to Wiktionary
Franck Sajous, Emmanuel Navarro, Bruno Gaume, Laurent Prévot, and Yannick Chudy
2013 (published online 5 November 2011)
Language Resources and Evaluation
47
1
63-96
Iryna Gurevych and Torsten Zesch
10.1007/s10579-011-9168-6
F. Sajous, E. Navarro, B. Gaume, L. Prévot and Y. Chudy (2011).
Semi-Automatic Enrichment of Crowdsourced Synonymy Networks: The WISIGOTH system applied to Wiktionary
Language Resources and Evaluation, 47(1), pp. 63-96.
Synonymy Networks, Semantic Relatedness, Collaboratively Constructed Resources, Wiktionary, Semi-Automatic Enrichment, Random Walks, Small Worlds
Semantic lexical resources are a mainstay of various Natural Language Processing applications. However, comprehensive and reliable resources are rare and often not freely available. Handcrafted resources are too costly to be a general solution, while automatically built resources need to be validated by experts or at least thoroughly evaluated. In this paper, we survey the current situation with regard to lexical resources, their construction, and their evaluation. We give an in-depth description of Wiktionary, a freely available and collaboratively built multilingual dictionary, which we present as a promising raw resource for NLP. We propose a semi-automatic approach based on random walks for enriching the Wiktionary synonymy network, using both endogenous and exogenous data. We take advantage of the wiki infrastructure to propose a validation “by crowds”. Finally, we present WISIGOTH, an implementation that supports our approach.
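To give a rough idea of what a random-walk approach to synonymy enrichment can look like, the sketch below ranks candidate synonyms for a word by the probability that a short random walk starting at that word ends on each vertex that is not already its neighbour. It is a generic illustration only and should not be read as the WISIGOTH algorithm: the toy edges, the function names, the self-loops, and the walk length of 3 are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy synonymy edges, invented for illustration only.
TOY_EDGES = [
    ("big", "large"), ("large", "huge"), ("huge", "enormous"),
    ("big", "great"), ("great", "huge"), ("small", "tiny"),
]

def build_graph(edges):
    """Build an undirected synonymy graph as an adjacency dict."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    for v in list(graph):
        graph[v].add(v)  # self-loop: an assumption that keeps the walk aperiodic
    return dict(graph)

def walk_distribution(graph, start, steps):
    """Probability of standing on each vertex after `steps` uniform steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for v, p in dist.items():
            share = p / len(graph[v])  # move to each neighbour with equal chance
            for u in graph[v]:
                nxt[u] += share
        dist = dict(nxt)
    return dist

def suggest_synonyms(graph, word, steps=3, top=3):
    """Rank vertices not yet linked to `word` by random-walk probability."""
    dist = walk_distribution(graph, word, steps)
    candidates = [(u, p) for u, p in dist.items() if u not in graph[word]]
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:top]

if __name__ == "__main__":
    graph = build_graph(TOY_EDGES)
    for candidate, score in suggest_synonyms(graph, "big"):
        print(f"{candidate}\t{score:.3f}")
```

In a semi-automatic setting such as the one the abstract describes, a ranked list like this would only be a set of suggestions; accepting or rejecting each candidate would be left to human contributors.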