GCC AI Research


An Empirical Study of Pre-trained Transformers for Arabic Information Extraction

arXiv

This paper introduces GigaBERT, a customized bilingual BERT model pre-trained for Arabic NLP and English-to-Arabic zero-shot transfer learning. The study evaluates GigaBERT's performance on four information extraction tasks: named entity recognition, part-of-speech tagging, argument role labeling, and relation extraction. Results show that GigaBERT outperforms mBERT, XLM-RoBERTa, and AraBERT in both supervised and zero-shot transfer settings. Why it matters: GigaBERT advances Arabic NLP by providing a high-performing, publicly available model tailored for the complexities of the Arabic language and cross-lingual applications.

AraBERT: Transformer-based Model for Arabic Language Understanding

arXiv

Researchers at the American University of Beirut (AUB) have released AraBERT, a BERT model pre-trained specifically for Arabic language understanding. The model was trained on a large Arabic corpus and evaluated against multilingual BERT and other state-of-the-art methods. AraBERT achieved state-of-the-art performance on the Arabic NLP tasks tested, including sentiment analysis, named entity recognition, and question answering. Why it matters: This release provides the Arabic NLP community with a high-performing, open-source language model, facilitating further research and development.

G42 Releases Nanda 87B, Opening New Frontiers in Hindi-English Language AI

G42

G42 has launched Nanda 87B, an open-source Hindi-English LLM developed by MBZUAI in collaboration with Inception and Cerebras. Nanda 87B is built upon Llama-3.1 70B and trained on a dataset with over 65 billion Hindi tokens. The model is engineered for real-world use, handling formal Hindi, casual speech, and Hinglish, and is designed for translation, summarization, instruction-following, and transliteration tasks. Why it matters: This release marks a major advancement in creating inclusive AI technology tailored for one of the world's largest linguistic communities.

Technology Innovation Institute Introduces World’s Most Powerful Open LLM: Falcon 180B

TII

Technology Innovation Institute (TII) in the UAE has launched Falcon 180B, an open-access large language model with 180 billion parameters trained on 3.5 trillion tokens. Falcon 180B ranks first on the Hugging Face Open LLM Leaderboard for pretrained LLMs, outperforming Meta's LLaMA 2 and nearing the performance of OpenAI's GPT-4 and Google's PaLM 2. The model is available for research and commercial use under the 'Falcon 180B TII License', based on Apache 2.0. Why it matters: This release strengthens the UAE's position in AI development and promotes open access to advanced AI technology, fostering innovation and collaboration.

Self-supervised DNA models and scalable sequence processing with memory augmented transformers

MBZUAI

Dr. Mikhail Burtsev of the London Institute for Mathematical Sciences presented research on GENA-LM, a suite of transformer-based DNA language models. The talk addressed the challenge of scaling transformers to genomic sequences, proposing recurrent memory augmentation to handle long inputs efficiently. This approach improves language modeling performance and holds promise for memory-intensive applications in bioinformatics. Why it matters: This research can significantly advance AI's capabilities in genomics by enabling the processing of much larger DNA sequences, with potential breakthroughs in understanding and treating diseases.
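The recurrent memory idea can be illustrated with a minimal sketch: a long sequence is split into fixed-size segments, learned memory vectors are prepended to each segment's input, and the encoder outputs at the memory positions are carried forward to the next segment. Note that the "encoder" below is a stand-in (a fixed random linear map with a crude token-mixing step), not a real transformer, and all names and sizes are illustrative assumptions rather than GENA-LM's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_mem, seg_len = 16, 4, 8  # illustrative sizes, not GENA-LM's

# Fixed random weights stand in for a trained transformer block.
W = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

def encoder(x):
    # Stand-in for a transformer layer: crude token mixing + projection.
    mixed = x.mean(axis=0, keepdims=True) + x
    return np.tanh(mixed @ W)

def process_long_sequence(tokens):
    """Process a long sequence segment by segment, carrying memory forward."""
    memory = np.zeros((n_mem, d_model))
    outputs = []
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        # Prepend memory tokens to the segment's input.
        h = encoder(np.concatenate([memory, segment], axis=0))
        # Outputs at the memory positions become the next segment's memory.
        memory, seg_out = h[:n_mem], h[n_mem:]
        outputs.append(seg_out)
    return np.concatenate(outputs, axis=0), memory

long_seq = rng.standard_normal((40, d_model))  # 5 segments of 8 tokens
outs, final_mem = process_long_sequence(long_seq)
print(outs.shape, final_mem.shape)  # (40, 16) (4, 16)
```

The key property is that only the `n_mem` memory vectors, not all past tokens, are kept between segments, so a fixed-window model can propagate information across arbitrarily long inputs.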

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

arXiv

Researchers from MBZUAI have released MobiLlama, a fully transparent, open-source 0.5-billion-parameter Small Language Model (SLM). MobiLlama targets resource-constrained devices, aiming for strong performance with reduced memory and compute demands. The full training data pipeline, code, model weights, and checkpoints are available on GitHub.

Technology Innovation Institute Announces Launch of NOOR, the World’s Largest Arabic NLP Model

TII

Technology Innovation Institute (TII) in Abu Dhabi, in collaboration with LightOn, has launched NOOR, a 10 billion parameter Arabic natural language processing (NLP) model. The model was trained on a large, high-quality cross-domain Arabic dataset including web data, books, poetry, news, and technical information. It enables applications in automated summarization, chatbots, and personalized marketing. Why it matters: NOOR represents a significant advancement in Arabic NLP, potentially enabling more sophisticated AI applications tailored to the Arabic language and regional needs.