June 2023

11 articles

Top Stories

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

arXiv · Jun 13 · NLP LLM

MBZUAI researchers introduce XrayGPT, a conversational medical vision-language model for analyzing chest radiographs and answering open-ended questions. The model aligns a medical visual encoder (MedClip) with a fine-tuned large language model (Vicuna) using a linear transformation. To enhance performance, the LLM was fine-tuned using 217k interactive summaries generated from radiology reports.

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

arXiv · Jun 8 · CV LLM

Video-ChatGPT is a new multimodal model that combines a video-adapted visual encoder with a large language model (LLM) to enable detailed video understanding and conversation. The authors introduce a new dataset of 100,000 video-instruction pairs for training the model. They also develop a quantitative evaluation framework for video-based dialogue models.

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

arXiv · Jun 5 · LLM Research

The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method achieves less than 1% accuracy loss in perplexity for LLaMA and Falcon LLMs and enables a 33B parameter LLM to run on a single 24GB consumer GPU. Why it matters: This enables near-lossless compression of LLMs, making powerful models accessible on resource-constrained devices and accelerating inference without significant accuracy degradation.

June 2023

Top Stories

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

More This Month

Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models

Neural Bayes estimators for censored inference with peaks-over-threshold models

Suzana Nunes named L'Oréal-UNESCO For Women in Science Laureate

From the seeds of discovery to improved crops

KAUST researchers integrate two-dimensional materials into silicon microchips

538 organizations forge new alliances at TME’s first “Global Sustainable Development Congress” at KAUST

Unscented Autoencoder

N-Shot Benchmarking of Whisper on Diverse Arabic Speech Recognition