GCC AI Research

AraBERT

🇱🇧 AUB · Lebanon · Arabic PLM

Pre-trained BERT model for Arabic NLP from the American University of Beirut (AUB). One of the most widely used Arabic NLP models in research.

AraBERT, released in 2020, was one of the first Arabic-specific transformer models. Trained on large Arabic text corpora, it established strong baselines across Arabic NLP tasks, including sentiment analysis, named entity recognition, question answering, and text classification. AraBERTv2 improved on the original with larger pre-training data and better tokenization. It remains widely cited and serves as a foundation for Arabic NLP research.

Sizes: Base, Large
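As a minimal sketch of how the model is typically used, the snippet below loads an AraBERT tokenizer through the Hugging Face `transformers` library and tokenizes an Arabic sentence. The model id `aubmindlab/bert-base-arabertv2` is an assumption here (it is the common Hub name for AraBERTv2, but check the official repository for the variant you need).

```python
from transformers import AutoTokenizer

# Hub model id is an assumption; see the aubmindlab organization for variants.
tok = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv2")

# "Hello world" in Arabic; the tokenizer adds [CLS]/[SEP] special tokens.
ids = tok("مرحبا بالعالم")["input_ids"]
print(tok.convert_ids_to_tokens(ids))
```

For downstream tasks such as sentiment analysis or NER, the same model id would be passed to `AutoModelForSequenceClassification` or `AutoModelForTokenClassification` and fine-tuned on task data.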
