Acquisition and enrichment of morphological and morphosemantic knowledge from the French Wiktionary

Nabil Hathout, Franck Sajous and Basilio Calderone (2014). Acquisition and enrichment of morphological and morphosemantic knowledge from the French Wiktionary. Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing (LG-LP 2014), pp. 65-74, Dublin, Ireland.

Editors: Jorge Baptista, Pushpak Bhattacharyya, Christiane Fellbaum, Mikel Forcada, Chu-Ren Huang, Svetla Koeva, Cvetana Krstev, Eric Laporte
Publisher: Association for Computational Linguistics and Dublin City University
DOI: 10.3115/v1/W14-5809
ISBN: 78-1-873769-44-7

We present two approaches to automatically acquire morphologically related words from Wiktionary. Starting with related words explicitly mentioned in the dictionary, we propose a method based on orthographic similarity to detect new derived words from the entries' definitions with an overall accuracy of 93.5%. Using word pairs from the initial lexicon as patterns of formal analogies to filter new derived words enables us to raise the accuracy to 99%, while extending the lexicon's size by 56%. In a final experiment, we show that it is possible to semantically type the morphological definitions, focusing on the detection of process nominals.
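
The two acquisition steps described in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation: the similarity measure (difflib's SequenceMatcher ratio), the 0.6 threshold, the suffix-only view of a formal analogy, and the function names candidates, suffix_pattern and analogy_filter are all assumptions made for illustration. The abstract does not specify any of these details.

```python
import re
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Orthographic similarity in [0, 1]; stand-in for the paper's measure."""
    return SequenceMatcher(None, a, b).ratio()


def candidates(headword: str, definition: str, threshold: float = 0.6) -> list[str]:
    """Words of a definition that look morphologically related to the headword.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    words = re.findall(r"\w+", definition.lower())
    return [w for w in words
            if w != headword and similarity(headword, w) >= threshold]


def suffix_pattern(a: str, b: str) -> tuple[str, str]:
    """Strip the longest common prefix and return the two remaining endings."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[i:], b[i:]


def analogy_filter(pairs, seed_pairs):
    """Keep only pairs whose ending alternation is attested in the seed lexicon.

    This simplifies formal analogy to suffix alternations (assumption).
    """
    patterns = {suffix_pattern(a, b) for a, b in seed_pairs}
    return [(a, b) for a, b in pairs if suffix_pattern(a, b) in patterns]


if __name__ == "__main__":
    # Step 1: scan a definition for words similar to the headword.
    found = candidates("nettoyage", "Action de nettoyer")
    print(found)

    # Step 2: filter candidate pairs with patterns from known pairs.
    seeds = [("laver", "lavage")]
    new_pairs = [("nettoyer", "nettoyage"), ("nettoyer", "nettoyeur")]
    print(analogy_filter(new_pairs, seeds))
```

In this toy setting, the seed pair laver/lavage licenses the er/age alternation, so nettoyer/nettoyage passes the filter while nettoyer/nettoyeur is rejected until a seed pair with the same alternation is available. A filter of this kind trades coverage for precision, which matches the direction of the accuracy gain reported in the abstract.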