Browsing by Author "Haddad, Hatem"

Now showing 1 - 5 of 5
  • Article
    Citation - WoS: 1
    Citation - Scopus: 1
    Empirical Evaluation of Leveraging Named Entities for Arabic Sentiment Analysis
    (ZARKA PRIVATE UNIV, 2020) Mulki, Hala; Haddad, Hatem; Gridach, Mourad; Babaoglu, İsmail
    Social media reflects the attitudes of the public towards specific events. Events are often related to persons, locations or organizations, the so-called Named Entities (NEs). NEs can therefore be regarded as sentiment-bearing components. In this paper, we go beyond NE recognition to the exploitation of sentiment-annotated NEs in Arabic sentiment analysis. To that end, we develop an algorithm to detect the sentiment of NEs based on the majority of attitudes towards them. This enabled tagging NEs with proper tags and, thus, including them in a sentiment analysis framework of two models: supervised and lexicon-based. Both models were applied to datasets of multi-dialectal content. The results revealed that NEs have no considerable impact on the supervised model, while employing NEs in the lexicon-based model improved the classification performance and outperformed most of the baseline systems.
  • Conference Object
    Citation - WoS: 103
    L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language
    (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2019) Mulki, Hala; Haddad, Hatem; Ali, Chedi Bechikh; Alshabani, Halima
    Hate speech and abusive language have become a common phenomenon on Arabic social media. Automatic hate speech and abusive language detection systems can facilitate the prohibition of toxic textual content. The complexity, informality and ambiguity of the Arabic dialects have hindered the provision of the resources needed for Arabic abusive/hate speech detection research. In this paper, we introduce the first publicly available Levantine Hate Speech and Abusive (L-HSAB) Twitter dataset, intended as a benchmark dataset for automatic detection of online Levantine toxic content. We further provide a detailed review of the data collection steps and of how we designed the annotation guidelines to ensure reliable dataset annotation. This was later confirmed through a comprehensive evaluation of the annotations, as the annotation agreement metrics of Cohen's Kappa (κ) and Krippendorff's alpha (α) indicated the consistency of the annotations.
  • Article
    Citation - WoS: 2
    Citation - Scopus: 4
    Syntax-Ignorant N-Gram Embeddings for Dialectal Arabic Sentiment Analysis
    (CAMBRIDGE UNIV PRESS, 2021) Mulki, Hala; Haddad, Hatem; Gridach, Mourad; Babaoglu, İsmail
    Arabic sentiment analysis models have recently employed compositional paragraph or sentence embedding features to represent informal Arabic dialectal content. These embeddings are mostly composed via ordered, syntax-aware composition functions and learned within deep neural network architectures. Given the differences in syntactic structure and word order among the Arabic dialects, a sentiment analysis system developed for one dialect might not be efficient for the others. Here we present syntax-ignorant, sentiment-specific n-gram embeddings for sentiment analysis of several Arabic dialects. The novelty of the proposed model is illustrated through its features and architecture. In the proposed model, the sentiment is expressed by embeddings composed via an unordered additive composition function and learned within a shallow neural architecture. To evaluate the generated embeddings, they were compared with state-of-the-art word/paragraph embeddings. This involved investigating their efficiency, as expressive sentiment features, based on the visualisation maps constructed for our n-gram embeddings and word2vec/doc2vec. In addition, using several Eastern/Western Arabic datasets of single-dialect and multi-dialectal content, the ability of our embeddings to recognise sentiment was investigated against word/paragraph embedding-based models. This comparison was performed within both shallow and deep neural network architectures and with two unordered composition functions employed. The results revealed that the introduced syntax-ignorant embeddings could represent single dialects and combinations of different dialects efficiently, as our shallow sentiment analysis model, trained with the proposed n-gram embeddings, could outperform the word2vec/doc2vec models and rival deep neural architectures while consuming remarkably less training time.
  • Conference Object
    Citation - WoS: 7
    Citation - Scopus: 10
    Syntax-Ignorant N-Gram Embeddings for Sentiment Analysis of Arabic Dialects
    (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2019) Mulki, Hala; Haddad, Hatem; Gridach, Mourad; Babaoglu, İsmail
    Arabic sentiment analysis models have employed compositional embedding features to represent Arabic dialectal content. These embeddings are usually composed via ordered, syntax-aware composition functions and learned within deep neural frameworks. Given the free word order and the varying syntax across the different Arabic dialects, a sentiment analysis system developed for one dialect might not be efficient for the others. Here we present syntax-ignorant n-gram embeddings to be used in sentiment analysis of several Arabic dialects. The proposed embeddings were composed and learned using an unordered composition function and a shallow neural model. Five datasets of different dialects were used to evaluate the produced embeddings in the sentiment analysis task. The obtained results revealed that our syntax-ignorant embeddings could outperform the word2vec model and both doc2vec variants, in addition to hand-crafted system baselines, while competitive performance was observed against baseline systems that adopted more complicated neural architectures.
  • Conference Object
    Citation - WoS: 42
    Citation - Scopus: 74
    T-HSAB: A Tunisian Hate Speech and Abusive Dataset
    (SPRINGER INTERNATIONAL PUBLISHING AG, 2019) Haddad, Hatem; Mulki, Hala; Oueslati, Asma
    Since the Jasmine Revolution in 2011, Tunisia has entered a new era of ultimate freedom of expression with full access to social media. This has been associated with an unrestricted spread of toxic content such as Abusive and Hate speech. Considering the psychological harm, let alone the potential hate crimes that might be caused by such toxic content, automatic Abusive and Hate speech detection systems have become a necessity. This evokes the need for Tunisian benchmark datasets to evaluate Abusive and Hate speech detection models. Because Tunisian is an underrepresented dialect, no Abusive or Hate speech datasets had previously been provided for it. In this paper, we introduce the first publicly available Tunisian Hate and Abusive speech (T-HSAB) dataset, intended as a benchmark dataset for automatic detection of online Tunisian toxic content. We provide a detailed review of the data collection steps and of how we designed the annotation guidelines to ensure reliable dataset annotation. This was later confirmed through a comprehensive evaluation of the annotations, as the annotation agreement metrics of Cohen's Kappa (k) and Krippendorff's alpha (alpha) indicated the consistency of the annotations.
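
Note on the agreement metric: the L-HSAB and T-HSAB abstracts above both report Cohen's Kappa as evidence of annotation consistency. As a rough illustration only, the standard two-annotator form of the statistic, kappa = (p_o - p_e) / (1 - p_e), can be sketched as follows. This is not the authors' evaluation code; the function name `cohen_kappa` and the example labels are hypothetical and were chosen for this sketch.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Hypothetical helper for illustration; not taken from the listed papers.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # p_o: observed agreement, the share of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # p_e: agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined (0/0), and this sketch reports full agreement instead.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Illustrative labels only (assumed for this example, not drawn from either dataset).
annotator_1 = ["toxic", "toxic", "normal", "normal"]
annotator_2 = ["toxic", "normal", "normal", "normal"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 3 of 4 items (p_o = 0.75), chance agreement is p_e = 0.5, and kappa is 0.5. Values near 1 indicate agreement well beyond chance, while values near 0 indicate agreement no better than chance.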