L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language

dc.contributor.author Mulki, Hala
dc.contributor.author Haddad, Hatem
dc.contributor.author Ali, Chedi Bechikh
dc.contributor.author Alshabani, Halima
dc.date.accessioned 2021-12-13T10:34:35Z
dc.date.available 2021-12-13T10:34:35Z
dc.date.issued 2019
dc.description 3rd Workshop on Abusive Language Online -- AUG 01, 2019 -- Florence, ITALY en_US
dc.description.abstract Hate speech and abusive language have become a common phenomenon on Arabic social media. Automatic hate speech and abusive language detection systems can facilitate the prohibition of toxic textual content. The complexity, informality, and ambiguity of the Arabic dialects have hindered the provision of the resources needed for Arabic abusive/hate speech detection research. In this paper, we introduce the first publicly available Levantine Hate Speech and Abusive (L-HSAB) Twitter dataset, intended as a benchmark dataset for the automatic detection of online Levantine toxic content. We further provide a detailed review of the data collection steps and of how we designed the annotation guidelines so that the dataset annotation is reliable. This reliability was then confirmed through a comprehensive evaluation of the annotations: the agreement metrics Cohen's Kappa (k) and Krippendorff's alpha (alpha) indicated that the annotations are consistent. en_US
dc.description.sponsorship UCLA, Google, Facebook, Element AI, Aylien en_US
dc.identifier.isbn 978-1-950737-43-7
dc.identifier.uri https://hdl.handle.net/20.500.13091/1014
dc.language.iso en en_US
dc.publisher ASSOC COMPUTATIONAL LINGUISTICS-ACL en_US
dc.relation.ispartof THIRD WORKSHOP ON ABUSIVE LANGUAGE ONLINE en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject AGREEMENT en_US
dc.title L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language en_US
dc.type Conference Object en_US
dspace.entity.type Publication
gdc.author.id haddad, hatem/0000-0003-3599-7229
gdc.author.wosid haddad, hatem/ABD-1530-2021
gdc.coar.access metadata only access
gdc.coar.type text::conference output
gdc.description.department Faculties, Faculty of Engineering and Natural Sciences, Department of Computer Engineering en_US
gdc.description.endpage 118 en_US
gdc.description.publicationcategory Conference Item - International - Institutional Faculty Member en_US
gdc.description.scopusquality N/A
gdc.description.startpage 111 en_US
gdc.description.wosquality N/A
gdc.identifier.wos WOS:000538480400012
gdc.index.type WoS
gdc.wos.citedcount 103

Files

Original bundle

Name: W19-3512.pdf
Size: 1.46 MB
Format: Adobe Portable Document Format