Published June 19, 2020 | Version v1
Dataset Open

SemKpv - Semantic Database for Komi-Zyrian

Description

This SQLite database contains Komi-Zyrian lemmas and their frequencies in a large corpus. The lemmas are linked to each other by the syntactic relations in which they occurred in the corpus, and the frequency of each syntactic relation between two words is also recorded. This makes it possible to see, for example, how often the word for "dog" has appeared as the subject of the verb for "bark".

This database was translated from SemFi using Giellatekno XML dictionaries. For a detailed description of the structure, see https://www.kaggle.com/mikahama/semfi-finnish-semantics-with-syntactic-relations

An easy programmatic interface is provided in UralicNLP: https://github.com/mikahama/uralicNLP/wiki/Semantics-(SemFi,-SemUr)

Cite as: Hämäläinen, Mika. (2018). Extracting a Semantic Database with Syntactic Relations for Finnish to Boost Resources for Endangered Uralic Languages. In The Proceedings of Logic and Engineering of Natural Language Semantics 15 (LENLS15).
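To make the idea of lemmas linked by counted syntactic relations concrete, here is a minimal sketch using Python's built-in sqlite3 module. It does not reproduce the real SemKpv schema: the table names (words, relations), the column names, the relation label "subj", the English placeholder lemmas, and all counts are assumptions invented for illustration. The actual structure is described at the Kaggle link above.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a HYPOTHETICAL schema.

    The real SemKpv tables and columns may differ; see the linked
    structure description before querying the actual file.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE words (
            id INTEGER PRIMARY KEY,
            lemma TEXT,
            pos TEXT,
            frequency INTEGER
        );
        CREATE TABLE relations (
            word1_id INTEGER,
            word2_id INTEGER,
            relation TEXT,
            frequency INTEGER
        );
        """
    )
    # Toy rows: English placeholders stand in for Komi-Zyrian lemmas,
    # and the counts are made up, not taken from the dataset.
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [(1, "dog", "N", 120), (2, "bark", "V", 45), (3, "cat", "N", 200)],
    )
    conn.executemany(
        "INSERT INTO relations VALUES (?, ?, ?, ?)",
        [(1, 2, "subj", 30), (3, 2, "subj", 1)],
    )
    conn.commit()
    return conn


def relation_frequency(conn, lemma1, lemma2, relation):
    """Return how often lemma1 stood in `relation` to lemma2 (0 if never)."""
    row = conn.execute(
        """
        SELECT r.frequency
        FROM relations AS r
        JOIN words AS w1 ON w1.id = r.word1_id
        JOIN words AS w2 ON w2.id = r.word2_id
        WHERE w1.lemma = ? AND w2.lemma = ? AND r.relation = ?
        """,
        (lemma1, lemma2, relation),
    ).fetchone()
    return row[0] if row else 0


conn = build_toy_db()
print(relation_frequency(conn, "dog", "bark", "subj"))
```

With the toy rows above, the query looks up the stored count for "dog" as the subject of "bark". Against the real file, the same join pattern would need the table and column names from the published structure description.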

Files

Files (547.2 MB)

Name Size
Checksum: md5:529e5aa351c863deeafe2c5fd9e77725

PID: http://hdl.handle.net/11304/84e35dca-8643-4740-afdb-c3142f72f440
547.2 MB
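The MD5 checksum listed above can be used to confirm that the 547.2 MB download arrived intact. The following is a minimal sketch using Python's standard hashlib module. The function names are illustrative, and the file name in the usage note is a placeholder, since this record does not state the file's name.

```python
import hashlib

# MD5 value copied from the "Checksum" line of this record.
EXPECTED_MD5 = "529e5aa351c863deeafe2c5fd9e77725"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that a large file does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, calling matches_record("path/to/downloaded_file") with the local path of the download should return True for an uncorrupted copy. The path shown is hypothetical.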

Additional details

Identifiers

b2rec
25c92ff35816466a94aeb8f0f42f508d

CLARIN metadata

Language Code
kpv
Resource Type
Other