GCC AI Research

Results for "Hindi"

G42 Releases Nanda 87B, Opening New Frontiers in Hindi-English Language AI

G42 ·

G42 has launched Nanda 87B, an open-source Hindi-English LLM developed by MBZUAI in collaboration with Inception and Cerebras. Nanda 87B is built on Llama-3.1 70B and trained on a dataset with over 65 billion Hindi tokens. The model is engineered for real-world use: it is fluent in formal Hindi, casual speech, and Hinglish, and is designed for translation, summarization, instruction following, and transliteration. Why it matters: This release marks a major advancement in creating inclusive AI technology tailored for one of the world's largest linguistic communities.

UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu

arXiv ·

The UrduFake@FIRE2021 shared task focused on fake news detection in the Urdu language, framed as a binary classification problem. Of the 34 teams that registered, 18 submitted results and 11 provided technical reports, covering a diverse range of approaches. The top-performing system used the stochastic gradient descent (SGD) algorithm and achieved an F-score of 0.679.
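For readers unfamiliar with the metric, the sketch below shows one common way an F-score is computed for a binary classifier: the F1 score, the harmonic mean of precision and recall on the positive class. The summary above does not say which F-score variant the task used, so this is an illustration only. The function name `f1_score`, the `"fake"`/`"real"` labels, and the toy predictions are all assumptions, not data from the shared task.

```python
def f1_score(y_true, y_pred, positive):
    """Binary F1: harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and system predictions for eight articles.
y_true = ["fake", "fake", "fake", "real", "real", "real", "fake", "real"]
y_pred = ["fake", "real", "fake", "real", "fake", "real", "fake", "real"]

print(f1_score(y_true, y_pred, positive="fake"))
```

In this toy case the system finds three of the four fake articles and raises one false alarm, so precision and recall are both 0.75 and the F1 score is 0.75 as well.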

LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content

arXiv ·

Researchers have introduced LlamaLens, a specialized multilingual LLM designed for analyzing news and social media content. The model addresses domain specificity and multilinguality, with a focus on news and social media in Arabic, English, and Hindi. LlamaLens was evaluated on 18 tasks represented by 52 datasets, outperforming the state of the art on 23 test sets. Why it matters: This work contributes a valuable resource for multilingual NLP research, particularly in the context of analyzing news and social media content across diverse languages.