Knowledge distillation and the greening of LLMs

MBZUAI · Significant research

Summary

Researchers from MBZUAI, the University of British Columbia, and Monash University have created LaMini-LM, a collection of small language models distilled from ChatGPT. The models are trained on a dataset of 2.58M instructions and are small enough to be deployed on consumer laptops and mobile devices. They perform almost as well as their larger counterparts while easing data security concerns, because they can run locally. Why it matters: This work enables the deployment of LLMs in resource-constrained environments and enhances data security by reducing reliance on cloud-based LLMs.
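
To make the distillation idea concrete, here is a minimal Python sketch of the data-generation step: a large teacher model answers each instruction, and the resulting instruction-response pairs become training data for a smaller student. This is an illustration under stated assumptions, not the LaMini-LM pipeline itself. The names `build_distillation_set` and `toy_teacher` are hypothetical, and the teacher is a stub rather than a real ChatGPT call.

```python
from typing import Callable, Dict, List


def build_distillation_set(
    instructions: List[str],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each unique, non-empty instruction with the teacher's response."""
    pairs: List[Dict[str, str]] = []
    seen = set()
    for instruction in instructions:
        prompt = instruction.strip()
        # Skip blank prompts and exact duplicates so the teacher is not queried twice.
        if not prompt or prompt in seen:
            continue
        seen.add(prompt)
        pairs.append({"instruction": prompt, "response": teacher(prompt)})
    return pairs


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a query to a large teacher model.
    return f"[teacher answer to: {prompt}]"


examples = build_distillation_set(
    [
        "Summarize the water cycle.",
        "   ",
        "Summarize the water cycle.",
        "Translate 'hello' to French.",
    ],
    toy_teacher,
)

for example in examples:
    print(example["instruction"], "->", example["response"])
```

In practice the pairs would then be used to fine-tune the student model with ordinary supervised learning; that step is omitted here.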

Keywords

LaMini-LM · knowledge distillation · ChatGPT · MBZUAI · security

Related

Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation

arXiv · Dec 15

Researchers at MBZUAI have demonstrated a method called "Data Laundering" to artificially boost language model benchmark scores using knowledge distillation. The technique covertly transfers benchmark-specific knowledge, leading to inflated accuracy without genuine improvements in reasoning. The study highlights a vulnerability in current AI evaluation practices and calls for more robust benchmarks.

Efficient and inclusive NLP: An instruction-based approach to improve language models

MBZUAI

MBZUAI Assistant Professor Alham Fikri Aji is presenting research at EACL 2024 on efficient NLP for low-resource languages. The study uses knowledge distillation, transferring knowledge from a larger model (ChatGPT) to a smaller one using synthetic instruction data. The goal is to achieve similar performance with fewer computational resources, focusing on underrepresented languages. Why it matters: This work addresses the need for more accessible and inclusive NLP technologies, especially for languages lacking extensive datasets and computational resources.

Climate conscious computing

MBZUAI

MBZUAI's Qirong Ho and colleagues are developing an Artificial Intelligence Operating System (AIOS) for decarbonization, aiming to reduce energy waste in AI development. The AIOS focuses on improving communication efficiency between machines during AI model training, as inefficient communication leads to prolonged tasks and increased energy consumption. This system addresses the high computing power demands of large language models like ChatGPT and LLaMA-2. Why it matters: By optimizing energy usage in AI development, the AIOS could significantly reduce the carbon footprint of AI technologies in the region and globally.

Vicuna, Altman, and the importance of green AI

MBZUAI

MBZUAI President Eric Xing led a global collaboration to develop Vicuna, an LLM alternative to GPT-3 that addresses the unsustainable costs of training LLMs. OpenAI CEO Sam Altman acknowledged Abu Dhabi's role in the global AI conversation, a role built on achievements such as Vicuna. Xing and colleagues are publishing research at MLSys 2023 on "cross-mesh resharding" to improve communication between computers in deep learning, aiming for low-carbon, affordable, and miniaturized AI. Why it matters: This research signals a push towards sustainable AI development in the region, emphasizing efficiency and reduced environmental impact.
