GCC AI Research

Results for "RAG"

RIRAG: Regulatory Information Retrieval and Answer Generation

arXiv

Researchers introduce a new task of generating question-passage pairs to support the development of regulatory question-answering (QA) systems. They present the ObliQA dataset, which comprises 27,869 questions derived from Abu Dhabi Global Market (ADGM) financial regulations. They also design a baseline Regulatory Information Retrieval and Answer Generation (RIRAG) system and evaluate it using the RePASs metric.

QU-NLP at QIAS 2025 Shared Task: A Two-Phase LLM Fine-Tuning and Retrieval-Augmented Generation Approach for Islamic Inheritance Reasoning

arXiv

The QU-NLP team presented its approach to the QIAS 2025 shared task on Islamic Inheritance Reasoning: fine-tuning the Fanar-1-9B model with LoRA and integrating it into a RAG pipeline. The system achieved an accuracy of 0.858 on the final test, outperforming models such as GPT-4.5, LLaMA, and Mistral evaluated in zero-shot settings. It performed especially well on advanced reasoning, reaching 97.6% accuracy. Why it matters: The result suggests that domain-specific fine-tuning combined with retrieval augmentation can be effective for Arabic LLMs on complex reasoning tasks, to the point of outperforming zero-shot frontier models.

Cross-Document Topic-Aligned Chunking for Retrieval-Augmented Generation

arXiv

This paper introduces Cross-Document Topic-Aligned (CDTA) chunking to address knowledge fragmentation in Retrieval-Augmented Generation (RAG) systems. CDTA identifies topics across documents, maps segments to topics, and synthesizes them into unified chunks. Experiments on HotpotQA and UAE legal texts show that CDTA improves faithfulness and citation accuracy compared to existing chunking methods, especially for complex queries requiring multi-hop reasoning.
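The three steps named in the summary (identify topics across documents, map segments to topics, synthesize unified chunks) can be illustrated with a toy sketch. This is not the paper's implementation: the keyword table `TOPIC_KEYWORDS`, the helper names, and the sample documents are all hypothetical, topic identification is reduced to keyword overlap, and "synthesis" is plain concatenation with source tags.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical topic vocabulary. A real system would discover topics
# from the corpus rather than hard-code keywords.
TOPIC_KEYWORDS = {
    "licensing": {"license", "licence", "permit"},
    "penalties": {"fine", "penalty", "sanction"},
}


def assign_topic(segment: str) -> Optional[str]:
    """Return the topic with the largest keyword overlap, or None if nothing matches."""
    words = {w.strip(".,;:").lower() for w in segment.split()}
    best_topic, best_overlap = None, 0
    for topic, keywords in TOPIC_KEYWORDS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_topic, best_overlap = topic, overlap
    return best_topic


def topic_aligned_chunks(documents: Dict[str, List[str]]) -> Dict[str, str]:
    """Group segments from all documents by topic and merge each group into one chunk."""
    grouped = defaultdict(list)
    for doc_id, segments in documents.items():
        for segment in segments:
            topic = assign_topic(segment)
            if topic is not None:
                # Keep the source id so the chunk remains citable.
                grouped[topic].append(f"[{doc_id}] {segment}")
    # Assumption: synthesis is simple concatenation in this sketch.
    return {topic: " ".join(parts) for topic, parts in grouped.items()}


documents = {
    "doc_a": ["A license is required to operate.", "Late filing incurs a fine."],
    "doc_b": ["The penalty doubles for repeat offences.", "Each permit lasts one year."],
}
chunks = topic_aligned_chunks(documents)
for topic, chunk in chunks.items():
    print(topic, "->", chunk)
```

Each resulting chunk draws on both documents, which is the property the summary contrasts with chunking methods that split each document in isolation.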

BRIQA: Balanced Reweighting in Image Quality Assessment of Pediatric Brain MRI

arXiv

This paper introduces BRIQA, a method for automated assessment of artifact severity in pediatric brain MRI, where image quality affects diagnostic accuracy. BRIQA uses gradient-based loss reweighting and a rotating batching scheme to handle class imbalance across artifact severity levels. Experiments show that BRIQA improves the average macro F1 score from 0.659 to 0.706, with the improvement most pronounced for Noise, Zipper, Positioning, and Contrast artifacts.
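For readers unfamiliar with the reported metric, the following minimal sketch shows how macro F1 is computed and why it suits imbalanced severity labels: every class contributes equally to the average, so a model that ignores rare classes scores poorly even when its accuracy looks high. The labels and predictions are made-up illustrative values, not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical severity labels: class 0 dominates, classes 1 and 2 are rare.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10  # a model that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}, macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this example the always-majority predictor reaches 80% accuracy but a macro F1 below 0.3, because the two rare classes each contribute an F1 of zero.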

Retrieval Augmentation as a Shortcut to the Training Data

MBZUAI

This article discusses retrieval augmentation in text generation, where information retrieved from an external source is used to condition predictions. It references recent work on retrieval-augmented image captioning, showing that model size can be greatly reduced when training data is available through retrieval. The author intends to continue this work focusing on the intersection of retrieval augmentation and in-context learning, and controllable image captioning for language learning materials. Why it matters: This research direction has the potential to improve transfer learning in vision-language models, which could be especially relevant for downstream applications in Arabic NLP and multimodal tasks.
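The core idea, conditioning a prediction on items retrieved from an external source, can be sketched in a few lines. This is a toy illustration rather than the method from the referenced work: it retrieves text by word-overlap (Jaccard) similarity and prepends the results to a prompt, and the datastore, query, prompt layout, and function names are all hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve(query: str, datastore: list, k: int = 2) -> list:
    """Return the k datastore entries most similar to the query."""
    return sorted(datastore, key=lambda item: jaccard(query, item), reverse=True)[:k]


def build_prompt(query: str, datastore: list, k: int = 2) -> str:
    """Prepend retrieved entries so a generator can condition on them."""
    context = "\n".join(f"- {entry}" for entry in retrieve(query, datastore, k))
    return f"Retrieved examples:\n{context}\nInput: {query}\nOutput:"


# Hypothetical datastore standing in for captions drawn from training data.
datastore = [
    "a dog runs on the beach",
    "a cat sleeps on a sofa",
    "two dogs play in the park",
]
query = "a dog on the sand"
print(build_prompt(query, datastore))
```

Because relevant training examples arrive through the prompt at inference time, the generator does not need to store them in its parameters, which is the intuition behind the claim that model size can be reduced.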