G42 and Cerebras, in partnership with MBZUAI and C-DAC, will deploy an 8-exaflop AI supercomputer in India. The system will operate under India's governance frameworks, with all data remaining within national jurisdiction to meet sovereign security and compliance requirements. The supercomputer will be accessible to Indian researchers, startups, and government entities under the India AI Mission.
MBZUAI researchers introduce PG-Video-LLaVA, a large multimodal model with pixel-level grounding capabilities for videos that also integrates audio cues for enhanced understanding. The model uses an off-the-shelf tracker and grounding module to localize objects in videos based on user prompts. PG-Video-LLaVA is evaluated on video question-answering and grounding benchmarks, and the benchmarking uses the open-source Vicuna rather than the proprietary GPT-3.5 so that results are reproducible.
Researchers introduce PALO, a polyglot large multimodal model with visual reasoning capabilities in 10 major languages including Arabic. A semi-automated translation approach was used to adapt the multimodal instruction dataset from English to the target languages. The models are trained across three scales (1.7B, 7B and 13B parameters) and a multilingual multimodal benchmark is proposed for evaluation.
Researchers at MBZUAI have introduced QRAFT, an LLM-based framework designed to automate the generation of fact-checking articles. The system mimics the writing workflow of human fact-checkers, aiming to bridge the gap between automated fact-checking systems and public dissemination. While QRAFT outperforms existing text-generation methods, it still falls short of expert-written articles, highlighting areas for further research.
MBZUAI has released Jais and Jais-chat, two new open generative large language models (LLMs) with a focus on Arabic. The 13-billion-parameter models are based on the GPT-3 architecture and pretrained on Arabic, English, and code. Evaluation shows state-of-the-art Arabic knowledge and reasoning, with competitive English performance.
This paper proposes a smart dome model for mosques that uses AI to control dome movements based on weather conditions and overcrowding. The model uses the Congested Scene Recognition Network (CSRNet) for crowd estimation and fuzzy logic, both implemented in Python, to determine when to open and close the domes to maintain fresh air and sunlight. The goal is to automatically manage dome operation based on real-time data, specifying the duration for which the domes should remain open each hour.
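To make the fuzzy-logic step concrete, here is a minimal sketch of how a crowd estimate and a weather reading could be combined into an hourly open duration. It is not the paper's rule base: the function names, the use of temperature as the only weather input, every threshold, and the four rules are assumptions chosen for illustration, and the defuzzification is a simple weighted average (zero-order Sugeno style) rather than whatever method the authors used.

```python
def falling(x, lo, hi):
    """Membership that is 1 below lo, 0 above hi, and linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Complement of falling: 0 below lo, 1 above hi."""
    return 1.0 - falling(x, lo, hi)


def minutes_open_per_hour(crowd_count, temperature_c):
    """Return how many minutes per hour the dome stays open (0 to 60).

    crowd_count is assumed to come from a crowd counter such as CSRNet.
    All thresholds and rule outputs below are hypothetical.
    """
    # Fuzzify the inputs (hypothetical breakpoints).
    sparse = falling(crowd_count, 200, 600)
    crowded = rising(crowd_count, 200, 600)
    comfortable = falling(temperature_c, 28, 38)
    hot = rising(temperature_c, 28, 38)

    # Each rule: (firing strength using min as fuzzy AND, minutes open).
    rules = [
        (min(crowded, comfortable), 60.0),  # full hall, pleasant air: stay open
        (min(crowded, hot), 20.0),          # full hall, hot air: brief venting
        (min(sparse, comfortable), 30.0),   # few people, pleasant air
        (min(sparse, hot), 0.0),            # few people, hot air: keep closed
    ]

    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return 0.0  # defensive default; not reachable with these memberships
    # Weighted-average defuzzification.
    return sum(weight * minutes for weight, minutes in rules) / total


print(minutes_open_per_hour(400, 33))  # prints 27.5
```

With a half-full hall and a borderline temperature, all four rules fire equally and the output is the plain average of the rule outputs. A real controller would likely add further weather inputs such as rain or dust, which this sketch omits.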
Researchers in Saudi Arabia have developed a deep learning framework for automated counting and geolocation of palm trees using aerial images. The system uses a Faster R-CNN model trained on a dataset of 10,000 palm tree instances collected in the Kharj region using DJI drones. The system achieved a geolocation accuracy of 2.8 m using geotagged metadata and photogrammetry techniques.
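The geolocation step can be illustrated with a small sketch that maps the pixel centre of a detected tree to latitude and longitude using the image geotag. This is not the authors' pipeline: it assumes a nadir (straight-down) image whose top edge points north, flat terrain, a known ground sampling distance, and a local flat-earth approximation. The function name, parameters, and sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def pixel_to_latlon(px, py, width, height, center_lat, center_lon, gsd_m):
    """Convert a pixel position to (lat, lon) in degrees.

    Assumptions (not from the source): the geotag marks the image centre,
    the camera points straight down, the image top faces north, and
    gsd_m is the ground sampling distance in metres per pixel.
    """
    # Offset from the image centre in metres; image y grows downward.
    east_m = (px - width / 2.0) * gsd_m
    north_m = (height / 2.0 - py) * gsd_m

    # Small-offset (equirectangular) conversion from metres to degrees.
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east_m / (EARTH_RADIUS_M * math.cos(math.radians(center_lat)))
    )
    return center_lat + dlat, center_lon + dlon


# Illustrative values only: a 4000x3000 image at 5 cm per pixel.
lat, lon = pixel_to_latlon(2000, 500, 4000, 3000, 24.15, 47.30, 0.05)
```

In this example the detection sits 1000 pixels above the image centre, which at 5 cm per pixel places it 50 m north of the geotag. Real imagery would also need corrections for drone heading, camera tilt, and terrain elevation, which is where photogrammetry comes in.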