JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media
arXiv
Researchers have introduced JobArabi, a large-scale corpus of 20,528 Arabic job announcements collected from X between January 2024 and October 2025. The dataset was compiled with a linguistically informed query framework that covers a range of Arabic recruitment expressions, and it includes metadata such as timestamps and geolocation to support detailed analysis. Quantitative analysis of the corpus reveals sociolinguistic patterns, including persistent gendered hiring language, regional variation in occupational demand, and emotional framing in recruitment messages.

Why it matters: The corpus is a resource for research in Arabic NLP, computational social science, and digital labor studies, and it offers insight into labor market communication and linguistic change in the Arab world.
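To make the idea of a query framework built on recruitment expressions concrete, here is a minimal Python sketch. It is an illustration only, not the authors' pipeline: the phrase list, the function names `build_query` and `matches_recruitment`, and the use of quoted phrases joined with `OR` are all assumptions, since the summary does not list the actual expressions or tooling.

```python
# Hypothetical seed expressions; the corpus's real query list is not given
# in this summary, so these are illustrative examples only.
RECRUITMENT_PHRASES = [
    "فرصة عمل",       # "job opportunity"
    "وظيفة شاغرة",    # "job vacancy"
    "مطلوب موظف",     # "employee wanted"
    "نبحث عن موظف",   # "we are looking for an employee"
]


def build_query(phrases):
    """Join quoted phrases with OR, a common search-query convention (assumed here)."""
    return " OR ".join(f'"{p}"' for p in phrases)


def matches_recruitment(text, phrases=RECRUITMENT_PHRASES):
    """Return True if the post text contains any of the seed expressions."""
    return any(p in text for p in phrases)


if __name__ == "__main__":
    print(build_query(RECRUITMENT_PHRASES))
    sample_post = RECRUITMENT_PHRASES[0] + " في الرياض"  # "... in Riyadh"
    print(matches_recruitment(sample_post))
```

Plain substring matching is used to keep the sketch short. A real collection effort would likely also need to handle Arabic orthographic variation (for example diacritics and alternate letter forms), which this sketch does not attempt.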