GCC AI Research

Results for "WiDS"

Lifting up female scientists

KAUST ·

KAUST hosted a regional Women in Data Science (WiDS) conference, part of a global event led by Stanford University and held at over 100 regional institutions. The KAUST event featured exclusively female speakers and aimed to highlight data science research and applications. KAUST is launching a 'Women in Data Sciences and Technology' initiative to support women's education and careers in the field. Why it matters: This initiative can help address the underrepresentation of women in data science in Saudi Arabia and the broader region.

ArabJobs: A Multinational Corpus of Arabic Job Ads

arXiv ·

The ArabJobs dataset is a new corpus of over 8,500 Arabic job advertisements collected from Egypt, Jordan, Saudi Arabia, and the UAE. The dataset contains over 550,000 words and captures linguistic, regional, and socio-economic variation in the Arab labor market. It is available on GitHub and can be used for fairness-aware Arabic NLP and labor market research.

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task

arXiv ·

The third Nuanced Arabic Dialect Identification Shared Task (NADI 2022) focused on advancing Arabic NLP through country-level dialect identification and sentiment analysis. A total of 21 teams participated, with the winning team achieving an F1 score of 27.06 on dialect identification and 75.16 on sentiment analysis. The task highlights the challenges in Arabic dialect processing and motivates further research. Why it matters: Standardized evaluations like NADI are crucial for benchmarking progress and fostering innovation in Arabic NLP, especially for dialectal variations.

KAUST “Dear AI” campaign targets gender bias in AI, profiles Saudi women in tech

KAUST ·

KAUST is launching the "Dear AI" campaign and hackathon to address gender bias and the under-representation of women and Saudi/Arab people in AI, after finding that only 1% of the images AI tools return for prompts like "imagine entrepreneur" show women. The campaign calls for accurate representation in AI datasets from Saudi Arabia and beyond. KAUST notes that 47% of the graduates of its AI academy are women. Why it matters: This campaign highlights the need for more inclusive AI training data and addresses gender imbalances in STEM fields in Saudi Arabia.

NADI 2024: The Fifth Nuanced Arabic Dialect Identification Shared Task

arXiv ·

The fifth Nuanced Arabic Dialect Identification shared task (NADI 2024) aimed to advance Arabic NLP through dialect identification and dialect-to-MSA machine translation. Of the 51 teams that registered, 12 participated, making 76 valid submissions across three subtasks. The winning teams achieved 50.57 F1 for multi-label dialect identification, 0.1403 RMSE for dialectness level identification, and 20.44 BLEU for dialect-to-MSA translation. Why it matters: The results highlight the continued challenges in Arabic dialect processing and provide a benchmark for future research in this area.
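
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an F1 score and an RMSE can be computed. It is not the official NADI scorer. It assumes macro-averaged F1 over single-label predictions (a simplification of the multi-label subtask), a 0 to 1 dialectness scale, and made-up toy labels and values; the function names `macro_f1` and `rmse` are illustrative only.

```python
import math


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (assumption: macro averaging, single-label)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def rmse(gold, pred):
    """Root mean squared error between gold and predicted numeric scores."""
    return math.sqrt(sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold))


# Toy data (hypothetical, not drawn from any NADI dataset).
gold_dialects = ["EG", "EG", "SA", "JO"]
pred_dialects = ["EG", "SA", "SA", "JO"]
print(round(100 * macro_f1(gold_dialects, pred_dialects), 2))  # F1 on a 0-100 scale

gold_levels = [0.0, 0.5, 1.0]
pred_levels = [0.1, 0.5, 0.8]
print(round(rmse(gold_levels, pred_levels), 4))
```

Higher F1 and BLEU are better, while lower RMSE is better, which is why the dialectness result is reported as a small decimal alongside the larger F1 and BLEU figures.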

NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task

arXiv ·

The fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023) aimed to advance Arabic NLP through dialect identification and dialect-to-MSA machine translation. Of the 58 teams that registered, 18 participated across three subtasks: dialect identification, dialect-to-MSA translation, and another translation task. The winning teams achieved 87.27 F1 in dialect identification, 14.76 BLEU in one translation task, and 21.10 BLEU in the other. Why it matters: NADI provides valuable benchmarks and datasets for Arabic dialect processing, encouraging further research in this challenging area.