Skip to content
GCC AI Research

Search

Results for "Vicuna"

Collaboration releases Vicuna – environmentally friendly, cost-effective rival to ChatGPT

MBZUAI ·

Researchers from MBZUAI, UC Berkeley, CMU, Stanford, and UC San Diego collaborated to create Vicuna, an open-source chatbot that costs $300 to train, unlike ChatGPT which costs over $4 million. Vicuna achieves 90% of ChatGPT's subjective language quality while being far more energy-efficient and can run on a single GPU. It was fine-tuned from Meta AI’s LLaMA model using user-shared conversations and has gained significant traction on GitHub. Why it matters: This research demonstrates that high-quality chatbots can be developed at a fraction of the cost and environmental impact, opening up new possibilities for sustainable AI development in the region.

Vicuna, Altman, and the importance of green AI

MBZUAI ·

MBZUAI President Eric Xing led a global collaboration to develop Vicuna, an LLM alternative to GPT-3 addressing the unsustainable costs of training LLMs. OpenAI CEO Sam Altman acknowledged Abu Dhabi's role in the global AI conversation, building off of achievements like Vicuna. Xing and colleagues are publishing research at MLSys 2023 on "cross-mesh resharding" to improve computer communication in deep learning, aiming for low-carbon, affordable, and miniaturized AI. Why it matters: This research signals a push towards sustainable AI development in the region, emphasizing efficiency and reduced environmental impact.

Llama 2: a global release of local importance

MBZUAI ·

MBZUAI is a global partner in Meta's release of Llama 2, joining organizations like IBM, AWS, Microsoft, and NVIDIA. MBZUAI will provide early feedback and help build the software as a global community. MBZUAI is working on large language models, developing a sustainable LLM named Vicuna, and strengthening infrastructure for LLM-chat evaluation. Why it matters: MBZUAI's involvement promises to bring about a new generation of UAE-born AI advancements built around the Llama 2 ecosystem and fact-checking capabilities.

CamelEval: Advancing Culturally Aligned Arabic Language Models and Benchmarks

arXiv ·

The paper introduces Juhaina, a 9.24B parameter Arabic-English bilingual LLM trained with an 8,192 token context window. It identifies limitations in the Open Arabic LLM Leaderboard (OALL) and proposes a new benchmark, CamelEval, for more comprehensive evaluation. Juhaina outperforms models like Llama and Gemma in generating helpful Arabic responses and understanding cultural nuances. Why it matters: This culturally-aligned LLM and associated benchmark could significantly advance Arabic NLP and democratize AI access for Arabic speakers.

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

arXiv ·

MBZUAI researchers introduce XrayGPT, a conversational medical vision-language model for analyzing chest radiographs and answering open-ended questions. The model aligns a medical visual encoder (MedClip) with a fine-tuned large language model (Vicuna) using a linear transformation. To enhance performance, the LLM was fine-tuned using 217k interactive summaries generated from radiology reports.

MBZUAI is changing the landscape of large language models in the region.

MBZUAI ·

MBZUAI has been actively involved in developing AI and generative models, contributing to models like Llama 2, Jais, Vicuna, and LaMini. Professor Preslav Nakov notes Llama 2's improvements in size and carbon footprint over Llama 1. MBZUAI aims to tackle challenges like information accuracy, economic costs, and the scarcity of Arabic online content. Why it matters: MBZUAI's work helps address the limitations of current LLMs, particularly for Arabic, and promotes sustainable AI development in the region.

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

arXiv ·

MBZUAI researchers introduce PG-Video-LLaVA, a large multimodal model with pixel-level grounding capabilities for videos, integrating audio cues for enhanced understanding. The model uses an off-the-shelf tracker and grounding module to localize objects in videos based on user prompts. PG-Video-LLaVA is evaluated on video question-answering and grounding benchmarks, using Vicuna instead of GPT-3.5 for reproducibility.

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

arXiv ·

Video-ChatGPT is a new multimodal model that combines a video-adapted visual encoder with a large language model (LLM) to enable detailed video understanding and conversation. The authors introduce a new dataset of 100,000 video-instruction pairs for training the model. They also develop a quantitative evaluation framework for video-based dialogue models.