Published June 19, 2020
Version
v1
Dataset
Open
SemMyv - Semantic Database for Erzya
Creators
Description
This SQLite database contains Erzya lemmas and their frequencies in a large corpus. The lemmas are linked to one another by the syntactic relations in which they co-occurred in the corpus, and the frequency of each syntactic relation between two words is also recorded. This makes it possible to see, for example, how often the word for "dog" has appeared as the subject of the verb for "bark".
This database was translated from SemFi using Giellatekno XML dictionaries.
For a detailed description of the structure, see https://www.kaggle.com/mikahama/semfi-finnish-semantics-with-syntactic-relations
An easy programmatic interface is provided in UralicNLP: https://github.com/mikahama/uralicNLP/wiki/Semantics-(SemFi,-SemUr)
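As a rough illustration of the kind of lookup described above, the sketch below builds a tiny in-memory SQLite database and queries how often one word occurs in a given syntactic relation with another. The table names (`words`, `relations`), the column names, the relation label `nsubj`, and the English placeholder words are all assumptions made for this example; they are not taken from the actual SemMyv schema, which is documented at the Kaggle link above.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with a HYPOTHETICAL schema (not the real SemMyv one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE words (
            id INTEGER PRIMARY KEY,
            word TEXT,
            pos TEXT,
            frequency INTEGER
        );
        CREATE TABLE relations (
            word1_id INTEGER,
            word2_id INTEGER,
            relation TEXT,
            frequency INTEGER
        );
        """
    )
    # Placeholder rows: English stand-ins, not real Erzya lemmas or counts.
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [(1, "dog", "N", 120), (2, "bark", "V", 45), (3, "cat", "N", 200)],
    )
    conn.executemany(
        "INSERT INTO relations VALUES (?, ?, ?, ?)",
        [(1, 2, "nsubj", 30), (3, 2, "nsubj", 1)],
    )
    return conn


def relation_frequency(conn, word1, word2, relation):
    """Return how often word1 stands in `relation` to word2 (0 if never seen)."""
    row = conn.execute(
        """
        SELECT r.frequency
        FROM relations AS r
        JOIN words AS w1 ON w1.id = r.word1_id
        JOIN words AS w2 ON w2.id = r.word2_id
        WHERE w1.word = ? AND w2.word = ? AND r.relation = ?
        """,
        (word1, word2, relation),
    ).fetchone()
    return row[0] if row else 0


conn = build_demo_db()
print(relation_frequency(conn, "dog", "bark", "nsubj"))
```

To run a similar query against the downloaded file, the in-memory connection would be replaced with `sqlite3.connect(path_to_file)` and the table and column names adjusted to the real schema.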
Cite as
Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15)
Files
(539.9 MB)
| Checksum | PID | Size |
|---|---|---|
| md5:e141d27378c77bc7320d8d06094733ac | http://hdl.handle.net/11304/30359bcb-80e0-429b-96ba-57f5afa87930 | 539.9 MB |
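The MD5 checksum listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the function names are introduced here for illustration, and the path passed in is whatever local file name the download was saved under (the record does not show one).

```python
import hashlib

# Checksum published on this record page.
EXPECTED_MD5 = "e141d27378c77bc7320d8d06094733ac"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """Return True if the file at `path` has the checksum listed on this page."""
    return md5_of_file(path) == EXPECTED_MD5
```

Reading in chunks keeps memory use low, which matters for a file of roughly 540 MB.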
Additional details
Identifiers
- b2rec: 4688e43db55c4ed7b2517418c646c2ca
CLARIN metadata
- Language Code: myv
- Resource Type: Other