ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset

arXiv · November 1, 2020 · Notable

Summary

Researchers introduce ASAD, a large-scale, high-quality Arabic Sentiment Analysis Dataset of 95K tweets labeled positive, negative, or neutral. The dataset launches alongside a KAUST-sponsored competition offering a total of 17,000 USD in prizes. Baseline models are implemented and their results reported as a reference for competition participants.

Keywords

Arabic Sentiment Analysis · Dataset · ASAD · Twitter · KAUST

Related

Overview of the Arabic Sentiment Analysis 2021 Competition at KAUST

arXiv · Sep 29

KAUST organized an Arabic Sentiment Analysis Challenge where participants developed ML models to classify tweets as positive, negative, or neutral. The competition used the ASAD dataset with 55K tweets for training, 20K for validation, and 20K for final evaluation. The full dataset of 100K labeled tweets has been released for public use.

ADAB: Arabic Dataset for Automated Politeness Benchmarking -- A Large-Scale Resource for Computational Sociopragmatics

arXiv · Feb 14

The paper introduces ADAB (Arabic Dataset for Automated Politeness Benchmarking), a new annotated Arabic dataset for politeness detection collected from online platforms. The dataset covers Modern Standard Arabic and multiple dialects (Gulf, Egyptian, Levantine, and Maghrebi). It contains 10,000 samples across 16 politeness categories, with substantial inter-annotator agreement (kappa = 0.703). Why it matters: Arabic-language resources for politeness detection remain scarce, and such resources are crucial for culturally aware NLP systems.
