GCC AI Research

G42 Releases Nanda 87B, Opening New Frontiers in Hindi-English Language AI

G42 · Significant research

Summary

G42 has launched Nanda 87B, an open-source Hindi-English LLM developed by MBZUAI in collaboration with Inception and Cerebras. Nanda 87B is built on Llama-3.1 70B and was trained on a dataset containing more than 65 billion Hindi tokens. The model is engineered for real-world use: it is fluent in formal Hindi, casual speech, and Hinglish, and it is designed for translation, summarization, instruction-following, and transliteration tasks. Why it matters: This release marks a major advance in building inclusive AI technology tailored to one of the world's largest linguistic communities.

Keywords

G42 · MBZUAI · Nanda 87B · Hindi · LLM

Related

Meet Jais, The World’s Most Advanced Arabic LLM - G42

Inception

G42's Core42 has released Jais, a new Arabic large language model. Jais has 13 billion parameters and was trained on a dataset of 126B tokens, including 43B Arabic tokens. According to the developers, Jais achieves state-of-the-art results on Arabic benchmarks and competitive performance on English benchmarks. Why it matters: Jais represents a significant step forward for Arabic NLP, providing a powerful new tool for researchers and developers in the region.

Meet Jais, The World’s Most Advanced Arabic LLM - G42

Inception

G42's Core42 has released Jais, a collection of Arabic large language models, including a 13B parameter version. Jais-13B is trained on a 395B token dataset containing Arabic and English text. According to the blog post, Jais-13B achieves state-of-the-art results on Arabic NLP benchmarks. Why it matters: This release establishes a new benchmark for Arabic language AI, potentially enabling more sophisticated and culturally relevant applications.