Published December 28, 2017 | Version v1
Dataset Open

Hate speech dataset annotated for Portuguese

Creators

Description

Portuguese Hate Speech Twitter Dataset is a dataset of Twitter messages manually annotated for hate speech using a hierarchical structure of classes. The 5,668 messages, collected from 1,156 distinct users, were classified for hate speech with a multiclass, multilabel approach. The dataset is provided in two formats, together with the class hierarchy. The text of the tweets is omitted to comply with the terms and conditions of the Twitter API.
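To illustrate what a hierarchical, multilabel annotation can look like in practice, here is a minimal Python sketch. The class names, the `PARENT` mapping, and the rule that a child label implies all of its ancestors are assumptions made for the example; the actual classes and their relations are defined by the hierarchy file distributed with the dataset.

```python
# Hypothetical parent links: child class -> parent class (None marks the root).
# These names are illustrative only; the real hierarchy ships with the dataset.
PARENT = {
    "hate_speech": None,
    "sexism": "hate_speech",
    "women": "sexism",
    "racism": "hate_speech",
}


def with_ancestors(labels):
    """Return the given labels plus every ancestor implied by PARENT."""
    expanded = set()
    for label in labels:
        node = label
        # Walk up the tree; stop early if this branch was already visited.
        while node is not None and node not in expanded:
            expanded.add(node)
            node = PARENT[node]
    return expanded


def to_indicator_row(labels, classes):
    """Encode a message's labels as a 0/1 vector, one slot per class."""
    expanded = with_ancestors(labels)
    return [1 if name in expanded else 0 for name in classes]


# Fixed column order for the indicator vectors.
CLASSES = sorted(PARENT)  # ['hate_speech', 'racism', 'sexism', 'women']

# A single message may carry several labels at once (multilabel).
row = to_indicator_row({"women", "racism"}, CLASSES)
print(dict(zip(CLASSES, row)))
```

Under these assumptions a message tagged only with a leaf class is also counted under every class above it, which keeps per-class totals consistent across the levels of the hierarchy.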

Files

annotator_classes.csv

Files (1.3 MB)

Size: 157.8 kB
Checksum: md5:8b5c94dfcf4619ae91ff12111556ade5
PID: http://hdl.handle.net/11304/9e66772d-a0ab-458d-bc0f-8b51565fb5c3

Size: 1.0 MB
Checksum: md5:36ad29d776da80bf693e3e1881d80b59
PID: http://hdl.handle.net/11304/45041b9c-0773-4f4a-b0de-74e98543ebf7

Size: 2.5 kB
Checksum: md5:0d4801f5c0ddb8a87ccabbff1706e963
PID: http://hdl.handle.net/11304/2401e993-417a-4eee-802d-566d3e7f7641

Size: 41.9 kB
Checksum: md5:11991393203c1ba13146e2d56deecd00
PID: http://hdl.handle.net/11304/ea42d8e3-1007-475a-891c-d41c7497a62a

Size: 7.9 kB
Checksum: md5:6a9df9a165ddd9cdd22b6a4cc22db193
PID: http://hdl.handle.net/11304/12590949-ac5e-40a4-a66a-09a4a2d9f31f

Size: 10.0 kB
Checksum: md5:e6bbebb2ca8b9e9df27a8033034f1d40
PID: http://hdl.handle.net/11304/50951d54-e96f-4fd1-b521-216bf32daf08

Size: 118.9 kB
Checksum: md5:6b6037b3ef77b8311298a0ae665d4ff8
PID: http://hdl.handle.net/11304/f8f7469f-193c-4acc-a6c6-347e7aa1fc96

Size: 2.1 kB
Checksum: md5:d76f13def8517a60f6299160d2a93157
PID: http://hdl.handle.net/11304/e286ae95-1b7a-4968-8e03-55229a005052
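
Each file above is listed with an MD5 checksum in the form `md5:<hex digest>`. As a minimal sketch, a downloaded file could be checked against its listed value with Python's standard `hashlib` module. The helper names and the local file path are hypothetical and not part of the dataset.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a local file against a checksum written as 'md5:<hex>'."""
    algorithm, _, expected = listed.partition(":")
    if algorithm != "md5" or not expected:
        raise ValueError(f"expected a value like 'md5:<hex>', got {listed!r}")
    return md5_of_file(path) == expected.strip().lower()
```

For example, `matches_listed_checksum("downloaded_file.csv", "md5:8b5c94dfcf4619ae91ff12111556ade5")` returns `True` only if the local copy (the path here is a placeholder) has the same digest as the first entry listed above. MD5 is adequate for detecting a corrupted download, but it is not a safeguard against deliberate tampering.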

Additional details

Identifiers

b2rec
9005efe2d6be4293b63c3cffd4cf193e