ArabJobs: A Multinational Corpus of Arabic Job Ads

arXiv · September 26, 2025 · Notable

Summary

ArabJobs is a new corpus of over 8,500 Arabic job advertisements collected from Egypt, Jordan, Saudi Arabia, and the UAE. It contains over 550,000 words and captures linguistic, regional, and socio-economic variation in the Arab labor market. The dataset is available on GitHub and can be used for fairness-aware Arabic NLP and labor market research.

Keywords

Arabic · job advertisements · corpus · dataset · labor market
