Published June 19, 2020 | Version v1
Dataset
Open
SemKpv - Semantic Database for Komi-Zyrian
Creators
Description
This SQLite database contains Komi-Zyrian lemmas and their frequencies in a large corpus. The lemmas are linked to each other by the syntactic relations in which they have occurred together in the corpus, and the frequency of each syntactic relation between two words is recorded. This makes it possible to see, for example, how frequently the word for "dog" has appeared as the subject of the verb for "bark".
This database was translated from SemFi using the Giellatekno XML dictionaries.
For a detailed description of the structure, see https://www.kaggle.com/mikahama/semfi-finnish-semantics-with-syntactic-relations
An easy programmatic interface is provided in UralicNLP: https://github.com/mikahama/uralicNLP/wiki/Semantics-(SemFi,-SemUr)
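Since the table layout is documented externally, a quick way to confirm what a downloaded copy actually contains is to inspect it with Python's built-in `sqlite3` module. The following is a minimal sketch, not the UralicNLP interface: the file name `semkpv.db` and the helper `describe_sqlite` are hypothetical, and no table or column names are assumed in advance.

```python
import os
import sqlite3


def describe_sqlite(path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        # sqlite_master lists every object stored in the database file.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        schema = {}
        for table in tables:
            # Quote the identifier so unusual table names do not break the PRAGMA.
            quoted = '"{}"'.format(table.replace('"', '""'))
            columns = conn.execute("PRAGMA table_info({})".format(quoted)).fetchall()
            # Each row is (cid, name, type, notnull, dflt_value, pk).
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()


if __name__ == "__main__":
    db_path = "semkpv.db"  # hypothetical local file name for the download
    if os.path.exists(db_path):
        for table, columns in describe_sqlite(db_path).items():
            print(table, columns)
    else:
        print("Database file not found:", db_path)
```

The printed table and column names can then be compared against the SemFi structure description linked above before writing any queries.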
Cite as
Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15)
Files
- Size: 547.2 MB
- Checksum: md5:529e5aa351c863deeafe2c5fd9e77725
- PID: http://hdl.handle.net/11304/84e35dca-8643-4740-afdb-c3142f72f440
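Because the file is large, it may be worth checking a downloaded copy against the MD5 checksum listed for it. A minimal sketch using Python's standard `hashlib` follows; the helper names and the local file name `semkpv.db` are hypothetical, and only the expected digest is taken from this record.

```python
import hashlib

# MD5 digest published on this record for the database file.
EXPECTED_MD5 = "529e5aa351c863deeafe2c5fd9e77725"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the published one."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    db_path = "semkpv.db"  # hypothetical local file name for the download
    try:
        print("Checksum OK" if matches_published_checksum(db_path) else "Checksum mismatch")
    except FileNotFoundError:
        print("Database file not found:", db_path)
```

Reading in chunks keeps memory use flat regardless of the file size.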
Additional details
Identifiers
- b2rec: 25c92ff35816466a94aeb8f0f42f508d
CLARIN metadata
- Language Code: kpv
- Resource Type: Other