GCC AI Research

Results for "Model size"

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

arXiv ·

Researchers from MBZUAI have released MobiLlama, a fully transparent open-source 0.5 billion parameter Small Language Model (SLM). MobiLlama is designed for resource-constrained devices, emphasizing enhanced performance with reduced resource demands. The full training data pipeline, code, model weights, and checkpoints are available on GitHub.

How computer vision model architecture and training affect performance

MBZUAI ·

MBZUAI researchers found that ImageNet performance isn't always indicative of real-world task performance for computer vision models. The study analyzed four popular model configurations, revealing variations in behavior on specific image types despite similar overall ImageNet accuracy. It indicates that certain model configurations are better suited for particular tasks, even with lower ImageNet scores. Why it matters: This challenges the reliance on ImageNet as a sole benchmark and highlights the need for task-specific evaluations in computer vision.

The head of Abu Dhabi’s AI university wants to defuse a tech ‘atomic bomb’

MBZUAI ·

MBZUAI's president Eric Xing warns against the unchecked pursuit of increasingly large AI models, drawing an analogy to an "atomic bomb" due to the unpredictability of their behavior. He argues that the field lacks sufficient understanding of what these models learn and whether their outputs are reliable, advocating for more efficient models. Xing emphasizes the need for debuggability and error tracking in AI, similar to established engineering practices. Why it matters: The piece highlights growing concerns within the AI community about the scalability and potential risks associated with increasingly complex AI models, particularly regarding transparency and control.

Knowledge distillation and the greening of LLMs

MBZUAI ·

Researchers from MBZUAI, University of British Columbia, and Monash University have created LaMini-LM, a collection of small language models distilled from ChatGPT. LaMini-LM is trained on a dataset of 2.58M instructions and can be deployed on consumer laptops and mobile devices. The smaller models perform almost as well as larger counterparts while addressing security concerns. Why it matters: This work enables the deployment of LLMs in resource-constrained environments and enhances data security by reducing reliance on cloud-based LLMs.

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

arXiv ·

The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method keeps relative accuracy loss, measured in perplexity, below 1% for LLaMA and Falcon LLMs and enables a 33B parameter LLM to run on a single 24GB consumer GPU. Why it matters: This enables near-lossless compression of LLMs, making powerful models accessible on resource-constrained devices and accelerating inference without significant accuracy degradation.
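
The idea in the SpQR summary above, keeping a few outlier weights in full precision and squeezing the rest into a handful of bits, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the magnitude-based outlier rule, the `outlier_fraction` parameter, and the plain min-max quantizer are all assumptions chosen for brevity. The last helper only does the back-of-the-envelope storage arithmetic for dense weights and ignores outliers, scales, activations, and cache memory.

```python
from typing import Dict, List, Tuple


def quantize_group(
    weights: List[float], bits: int = 3, outlier_fraction: float = 0.1
) -> Tuple[List[int], float, float, Dict[int, float]]:
    """Quantize one group of weights, keeping the largest ones in full precision.

    Assumption: outliers are picked by magnitude, which is a simplification
    made for this sketch and not the criterion described in the paper.
    """
    # Always leave at least one weight in the dense (quantized) part.
    n_outliers = min(int(len(weights) * outlier_fraction), len(weights) - 1)
    by_magnitude = sorted(
        range(len(weights)), key=lambda i: abs(weights[i]), reverse=True
    )
    outliers = {i: weights[i] for i in by_magnitude[:n_outliers]}

    # Min-max uniform quantization of the remaining weights.
    dense = [w for i, w in enumerate(weights) if i not in outliers]
    lo, hi = min(dense), max(dense)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [
        0 if i in outliers else round((w - lo) / scale)
        for i, w in enumerate(weights)
    ]
    return codes, scale, lo, outliers


def dequantize_group(
    codes: List[int], scale: float, lo: float, outliers: Dict[int, float]
) -> List[float]:
    """Rebuild the weights: exact values for outliers, decoded values otherwise."""
    return [outliers.get(i, lo + c * scale) for i, c in enumerate(codes)]


def weights_gigabytes(n_params: float, bits_per_weight: float) -> float:
    """Storage for the dense weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    group = [0.1, -0.2, 0.05, 8.0, 0.15, -0.1, 0.0, 0.2, -0.15, 0.12]
    codes, scale, lo, outliers = quantize_group(group, bits=3, outlier_fraction=0.1)
    restored = dequantize_group(codes, scale, lo, outliers)
    print("outliers kept exactly:", outliers)
    print("max error on the rest:", max(abs(a - b) for a, b in zip(group, restored)))
    print("33B params at 4 bits:", weights_gigabytes(33e9, 4), "GB")
```

Without the outlier set, the single large weight would stretch the min-max range and leave the 3-bit grid too coarse for the small weights, which is the intuition behind storing outliers separately. The storage helper also suggests why a 33B-parameter model at 3-4 bits per weight can fit within a 24GB budget.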

Alumni Spotlight: Making AI accessible for all

MBZUAI ·

MBZUAI alumnus Ahmed Sharshar is developing smaller AI models to make the technology more accessible, especially in resource-constrained environments like Egypt. His master's thesis involved creating an app that assesses lung health using mobile phone video analysis, eliminating the need for traditional medical devices. Sharshar is pursuing his Ph.D. at MBZUAI, focusing on lightweight and energy-efficient models for various applications. Why it matters: Democratizing AI through smaller, efficient models can enable broader applications and innovation across diverse sectors in the Middle East and beyond.

K2-V2: Full Openness Finally Meets Real Performance

MBZUAI ·

IFM has released K2-V2, a 70B-class LLM that takes a "360-open" approach by making its weights, data, training details, checkpoints, and fine-tuning recipes publicly available. K2-V2 matches leading open-weight model performance while offering full transparency, contrasting with proprietary and semi-open Chinese models. Independent evaluations show K2-V2 as a high-performance, fully open-source alternative in the AI landscape. Why it matters: K2-V2 provides developers with a transparent and reproducible foundation model, fostering trust and enabling customization without sacrificing performance, which is crucial for sensitive applications in the region.

Parameter-Efficient Fine-Tuning for NLP Models

MBZUAI ·

The article discusses parameter-efficient fine-tuning methods for large NLP models, highlighting their importance due to the increasing size and computational demands of state-of-the-art language models. It provides an overview of these methods, presenting them in a unified view to emphasize their similarities and differences. Indraneil, a PhD candidate at TU Darmstadt's UKP Lab, is researching parameter-efficient fine-tuning, sparsity, and conditional computation methods to improve LLM performance in multilingual, multi-task settings. Why it matters: Efficient fine-tuning techniques are crucial for democratizing access to and accelerating the deployment of large language models in the region and beyond.