How to create order in large closed subsets of WordNet-type dictionaries

Ahti Lohk, Ottokar Tilk, Leo Võhandu

Abstract


This article presents a new two-step method for handling and studying large closed subsets of WordNet-type dictionaries, with the goal of finding possible structural inconsistencies. The notion of a closed subset is explained using a WordNet tree. A novel and very fast method for ordering large relational systems is described and compared with some other fast methods. All the presented methods have been tested on the largest closed sets of Estonian WordNet and Princeton WordNet.

DOI: http://dx.doi.org/10.5128/ERYa9.10


Keywords


thesaurus, closed set, seriation, Power Iteration Clustering (PIC), reducing number of crossings, WordNet








Copyright (c) 2013 Ahti Lohk, Ottokar Tilk, Leo Võhandu

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

ISSN 1736-2563 (print)
ISSN 2228-0677 (online)
DOI 10.5128/ERYa.1736-2563