GCC AI Research

ArabJobs: A Multinational Corpus of Arabic Job Ads

arXiv · Notable

Summary

ArabJobs is a new corpus of more than 8,500 Arabic job advertisements collected from Egypt, Jordan, Saudi Arabia, and the UAE. It contains over 550,000 words and captures linguistic, regional, and socio-economic variation in the Arab labor market. The corpus is available on GitHub and can be used for fairness-aware Arabic NLP and labor market research.
