Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary
Emmanuel Navarro, Franck Sajous, Bruno Gaume, Laurent Prévot and Yannick Chudy
2010
Advances in Natural Language Processing - Proceedings of the 7th International Conference on NLP (IceTAL 2010)
Lecture Notes in Computer Science
6233
H. Loftsson, E. Rögnvaldsson, S. Helgadóttir
Springer
Berlin, Heidelberg
332-344
978-3-642-14770-8
10.1007/978-3-642-14770-8_37
E. Navarro, F. Sajous, B. Gaume, L. Prévot and Y. Chudy.
Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary
In: H. Loftsson, E. Rögnvaldsson, S. Helgadóttir (eds).
Proceedings of the 7th International Conference on NLP (IceTAL 2010): Advances in Natural Language Processing, Lecture Notes in Computer Science, vol 6233, pp. 332-344. Springer, Berlin/Heidelberg.
Collaboratively Constructed Lexical Resources, Endogenous Enrichment, Crowdsourcing, Wiktionary, Random Walks
The lack of large-scale, freely available and durable lexical
resources, and its consequences for NLP, are widely acknowledged, but
attempts to cope with the usual bottlenecks preventing their development often
result in dead ends. This article introduces a language-independent,
semi-automatic and endogenous method for enriching lexical resources,
based on collaborative editing and random walks through existing lexical
relationships, and shows how this approach enables us to overcome recurrent
impediments. It compares the impact of using different data sources
and similarity measures on the task of improving synonymy networks.
Finally, it defines an architecture for applying the presented method to
Wiktionary and explains how it has been implemented.
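
To make the idea of enriching a synonymy network through random walks more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the toy word pairs, the function names, the self-loop on every node, the three-step walk length and the uniform transition probabilities are all assumptions chosen for illustration. The sketch scores words that are not yet linked to a target word by the probability that a short random walk starting at the target ends on them, and proposes the best-scoring ones as candidate synonyms for a contributor to validate or reject.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected graph from synonym pairs.

    Every node also gets a self-loop (an assumption of this sketch),
    so the walker may stay in place at each step.
    """
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].update((a, b))
        graph[b].update((a, b))
    return dict(graph)


def walk_distribution(graph, start, steps):
    """Probability of being on each node after `steps` uniform random steps."""
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, prob in dist.items():
            share = prob / len(graph[node])
            for neighbour in graph[node]:
                nxt[neighbour] += share
        dist = dict(nxt)
    return dist


def candidate_synonyms(graph, word, steps=3, top=5):
    """Rank words not yet linked to `word` by random-walk probability."""
    dist = walk_distribution(graph, word, steps)
    # graph[word] contains `word` itself (self-loop), so it is excluded too.
    scored = [(w, p) for w, p in dist.items() if w not in graph[word]]
    scored.sort(key=lambda wp: (-wp[1], wp[0]))
    return scored[:top]


# Hypothetical toy synonymy network, not data from the paper.
toy_pairs = [
    ("big", "large"),
    ("large", "huge"),
    ("big", "great"),
    ("huge", "enormous"),
    ("great", "grand"),
]
graph = build_graph(toy_pairs)

for candidate, score in candidate_synonyms(graph, "big"):
    print(f"{candidate}\t{score:.3f}")
```

In a semi-automatic setting, the ranked list would only be a suggestion: a human contributor decides which candidates are actually added to the resource.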