Syntax-Ignorant N-Gram Embeddings for Dialectal Arabic Sentiment Analysis

dc.contributor.author Mulki, Hala
dc.contributor.author Haddad, Hatem
dc.contributor.author Gridach, Mourad
dc.contributor.author Babaoglu, İsmail
dc.date.accessioned 2021-12-13T10:32:20Z
dc.date.available 2021-12-13T10:32:20Z
dc.date.issued 2021
dc.description.abstract Arabic sentiment analysis models have recently employed compositional paragraph or sentence embedding features to represent informal Arabic dialectal content. These embeddings are mostly composed via ordered, syntax-aware composition functions and learned within deep neural network architectures. Because syntactic structure and word order differ among the Arabic dialects, a sentiment analysis system developed for one dialect might not be efficient for the others. Here we present syntax-ignorant, sentiment-specific n-gram embeddings for sentiment analysis of several Arabic dialects. The novelty of the proposed model lies in its features and architecture: the sentiment is expressed by embeddings composed via the unordered additive composition function and learned within a shallow neural architecture. To evaluate the generated embeddings, we compared them with state-of-the-art word/paragraph embeddings. This involved investigating their efficiency as expressive sentiment features, based on visualisation maps constructed for our n-gram embeddings and for word2vec/doc2vec. In addition, using several Eastern/Western Arabic datasets of single-dialect and multi-dialectal content, the ability of our embeddings to recognise sentiment was investigated against word/paragraph embedding-based models. This comparison was performed within both shallow and deep neural network architectures and with two unordered composition functions. The results revealed that the introduced syntax-ignorant embeddings could efficiently represent both single dialects and combinations of different dialects: our shallow sentiment analysis model, trained with the proposed n-gram embeddings, outperformed the word2vec/doc2vec models and rivalled deep neural architectures while consuming remarkably less training time. en_US
dc.identifier.doi 10.1017/S135132492000008X
dc.identifier.issn 1351-3249
dc.identifier.issn 1469-8110
dc.identifier.scopus 2-s2.0-85082685504
dc.identifier.uri https://doi.org/10.1017/S135132492000008X
dc.identifier.uri https://hdl.handle.net/20.500.13091/1011
dc.language.iso en en_US
dc.publisher CAMBRIDGE UNIV PRESS en_US
dc.relation.ispartof NATURAL LANGUAGE ENGINEERING en_US
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject n-gram embeddings en_US
dc.subject Unordered compositionality en_US
dc.subject Arabic dialects en_US
dc.subject Sentiment analysis en_US
dc.title Syntax-Ignorant N-Gram Embeddings for Dialectal Arabic Sentiment Analysis en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.id haddad, hatem/0000-0003-3599-7229
gdc.author.scopusid 57200388232
gdc.author.scopusid 22734490100
gdc.author.scopusid 50161532700
gdc.author.scopusid 23097339300
gdc.author.wosid haddad, hatem/ABD-1530-2021
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.description.department Fakülteler, Mühendislik ve Doğa Bilimleri Fakültesi, Bilgisayar Mühendisliği Bölümü en_US
gdc.description.endpage 338 en_US
gdc.description.issue 3 en_US
gdc.description.publicationcategory Makale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı en_US
gdc.description.scopusquality Q2
gdc.description.startpage 315 en_US
gdc.description.volume 27 en_US
gdc.description.wosquality Q1
gdc.identifier.openalex W3010964828
gdc.identifier.wos WOS:000656232400003
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.diamondjournal false
gdc.oaire.impulse 2.0
gdc.oaire.influence 2.7157243E-9
gdc.oaire.isgreen false
gdc.oaire.popularity 3.0088576E-9
gdc.oaire.publicfunded false
gdc.oaire.sciencefields 0202 electrical engineering, electronic engineering, information engineering
gdc.oaire.sciencefields 02 engineering and technology
gdc.openalex.collaboration International
gdc.openalex.fwci 0.44057865
gdc.openalex.normalizedpercentile 0.68
gdc.opencitations.count 2
gdc.plumx.crossrefcites 1
gdc.plumx.mendeley 26
gdc.plumx.scopuscites 4
gdc.scopus.citedcount 4
gdc.virtual.author Babaoğlu, İsmail
gdc.wos.citedcount 2
relation.isAuthorOfPublication 871b6e10-080d-4f91-8bf5-c78453b0d57d
relation.isAuthorOfPublication.latestForDiscovery 871b6e10-080d-4f91-8bf5-c78453b0d57d
