Published June 19, 2020 | Version v1
Dataset Open

SemMyv - Semantic Database for Erzya

Description

This SQLite database contains Erzya lemmas and their frequencies in a large corpus. The lemmas are linked to each other by the syntactic relations in which they co-occurred in the corpus, and the frequency of each syntactic relation between two words is also recorded. This makes it possible to see, for example, how often the word for "dog" has appeared as the subject of the verb for "bark". The database was translated from SemFi using Giellatekno XML dictionaries.

For a detailed description of the structure, see https://www.kaggle.com/mikahama/semfi-finnish-semantics-with-syntactic-relations

An easy programmatic interface is provided in UralicNLP: https://github.com/mikahama/uralicNLP/wiki/Semantics-(SemFi,-SemUr)

Cite as: Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15).
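To make the idea of lemma and relation frequencies concrete, the following is a minimal sketch of how such a lookup might work in SQLite. It does not use the real SemMyv file: the table names (`words`, `relations`), the column names, the relation label `nsubj`, and every frequency value are hypothetical and chosen only for illustration, and the English strings stand in for Erzya lemmas. The actual schema is described at the Kaggle link above and may differ.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a HYPOTHETICAL schema and toy data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE words (
            id INTEGER PRIMARY KEY,
            word TEXT,
            pos TEXT,
            frequency INTEGER
        );
        CREATE TABLE relations (
            word1_id INTEGER,
            word2_id INTEGER,
            relation TEXT,
            frequency INTEGER
        );
        """
    )
    # Toy rows; the strings are placeholders for Erzya lemmas,
    # and the counts are invented for the example.
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [(1, "dog", "N", 120), (2, "bark", "V", 45), (3, "run", "V", 300)],
    )
    conn.executemany(
        "INSERT INTO relations VALUES (?, ?, ?, ?)",
        [(1, 2, "nsubj", 17), (1, 3, "nsubj", 9)],
    )
    conn.commit()
    return conn


def relation_frequency(conn, word1, word2, relation):
    """Return how often word1 and word2 co-occurred in the given relation (0 if never)."""
    row = conn.execute(
        """
        SELECT r.frequency
        FROM relations AS r
        JOIN words AS w1 ON w1.id = r.word1_id
        JOIN words AS w2 ON w2.id = r.word2_id
        WHERE w1.word = ? AND w2.word = ? AND r.relation = ?
        """,
        (word1, word2, relation),
    ).fetchone()
    return row[0] if row else 0


conn = build_toy_db()
print(relation_frequency(conn, "dog", "bark", "nsubj"))  # prints 17
```

With the real database, the same kind of join would be run against the downloaded file by passing its path to `sqlite3.connect`, after adjusting the table and column names to the documented structure.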

Files

Files (539.9 MB)

Checksum: md5:e141d27378c77bc7320d8d06094733ac

PID: http://hdl.handle.net/11304/30359bcb-80e0-429b-96ba-57f5afa87930

Additional details

Identifiers

b2rec
4688e43db55c4ed7b2517418c646c2ca

CLARIN metadata

Language Code
myv
Resource Type
Other