MBZUAI has released Jais and Jais-chat, two new open generative large language models (LLMs) with a focus on Arabic. The 13-billion-parameter models are based on the GPT-3 architecture and pretrained on Arabic, English, and code. The reported evaluation shows state-of-the-art Arabic knowledge and reasoning, with competitive English performance.
G42's Core42 has released Jais, a new Arabic large language model. Jais has 13 billion parameters and was trained on a dataset of 126B tokens, including 43B Arabic tokens. According to the developers, Jais achieves state-of-the-art results on Arabic benchmarks and competitive performance on English benchmarks. Why it matters: Jais represents a significant step forward for Arabic NLP, providing a powerful new tool for researchers and developers in the region.
G42's Inception has open-sourced Jais, a 13-billion-parameter Arabic large language model (LLM). Jais was trained on a 395-billion-token Arabic and English dataset and outperforms existing Arabic models. The model is a collaboration between Inception, MBZUAI, and Cerebras Systems, and was trained on the Condor Galaxy supercomputer. Why it matters: This release establishes a new standard for Arabic language AI, giving more than 400 million Arabic speakers access to generative AI and fostering innovation in the region.
MBZUAI and Core42 have launched Jais Climate, the first bilingual (Arabic/English) LLM dedicated to climate and sustainability. It is fine-tuned with 1.4 million climate-related instructions and trained on the ClimaInstruct dataset. Jais Climate is built on Jais 13B and incorporates technology from Vicuna, an open-source LLM. Why it matters: This model provides accessible climate data to a wide audience, including decision-makers and the general public, and highlights the UAE's focus on using AI for sustainability.
Inception, Cerebras, and MBZUAI have released Jais 2, a 70-billion-parameter open-weight Arabic LLM. Jais 2 is trained on an Arabic-first dataset and features a redesigned architecture for stronger reasoning and fluency across Arabic dialects and English. It integrates a safety-first framework and demonstrates capabilities in understanding Arabic poetry, culture, and social media tone. Why it matters: Jais 2 addresses the historical underrepresentation of Arabic in AI by providing a culturally and linguistically faithful model, potentially accelerating innovation across the region.
MBZUAI is highlighting five key AI innovations for UAE Innovation Month, including the Jais Arabic LLM developed with Core42 and Cerebras Systems. It also highlights the AI Operating System (AIOS) for reducing AI computing energy costs. Additionally, MBZUAI received a US patent for an AI-based handwriting generation technology. Why it matters: MBZUAI's focus on Arabic NLP and efficient AI computing positions the UAE as a leader in responsible and inclusive AI development.
This paper explores multilingual satire detection methods in English and Arabic using zero-shot and chain-of-thought (CoT) prompting. It compares the performance of Jais-chat (13B) and LLaMA-2-chat (7B) on distinguishing satire from truthful news. Results show that CoT prompting significantly improves Jais-chat's performance, with the model achieving an F1-score of 80% in English. Why it matters: This demonstrates the potential of Arabic LLMs like Jais to handle nuanced language tasks such as satire detection, which is critical for combating misinformation in the region.
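For readers unfamiliar with the two prompting styles and the metric mentioned in the satire-detection item, the following is a minimal Python sketch. The prompt wording, the function names, and the choice of "satire" as the positive class are all illustrative assumptions; the summary does not give the paper's actual prompts or evaluation code.

```python
from typing import List


def zero_shot_prompt(article: str) -> str:
    # Hypothetical zero-shot template: ask directly for a one-word label.
    return (
        "Is the following news article satire or truthful? "
        "Answer with one word: satire or truthful.\n\n"
        f"Article: {article}\nAnswer:"
    )


def cot_prompt(article: str) -> str:
    # Hypothetical chain-of-thought template: ask for reasoning first,
    # then a final label on a clearly marked line.
    return (
        "Is the following news article satire or truthful? "
        "Think step by step about tone, exaggeration, and plausibility, "
        "then finish with 'Answer: satire' or 'Answer: truthful'.\n\n"
        f"Article: {article}\nReasoning:"
    )


def f1_score(gold: List[str], pred: List[str], positive: str = "satire") -> float:
    # Binary F1: harmonic mean of precision and recall for the positive class.
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, each prompt would be sent to a chat model, the returned label parsed from the response, and the parsed labels compared against gold labels with `f1_score`. How the paper itself parsed model outputs or averaged scores is not stated in the summary.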