GCC AI Research

Results for "long documents"

NLP for Long, Structured Documents

MBZUAI ·

Jan Buchmann from TU Darmstadt presented research on NLP for long, structured documents at MBZUAI. The research addresses two gaps: making use of document structure and improving the verifiability of language model (LM) responses. Experiments showed that models learn to represent document structure during pre-training, and that larger models can cite sources well. Why it matters: This research helps make NLP more effective for complex documents such as scientific articles and legal texts, which is crucial for information accessibility.

Cross-Document Topic-Aligned Chunking for Retrieval-Augmented Generation

arXiv ·

This paper introduces Cross-Document Topic-Aligned (CDTA) chunking to address knowledge fragmentation in Retrieval-Augmented Generation (RAG) systems. CDTA identifies topics across documents, maps segments to topics, and synthesizes them into unified chunks. Experiments on HotpotQA and UAE legal texts show that CDTA improves faithfulness and citation accuracy compared to existing chunking methods, especially for complex queries requiring multi-hop reasoning.
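The three steps named in the summary (identify topics, map segments to topics, merge them into unified chunks) can be illustrated with a minimal sketch. This is not the paper's implementation: the keyword table `TOPIC_KEYWORDS`, the helper names, and the sample documents are all hypothetical, keyword matching stands in for real topic identification, and plain concatenation stands in for the synthesis step.

```python
from collections import defaultdict

# Hypothetical topic keywords (an assumption for illustration);
# a real system would discover topics across the corpus automatically.
TOPIC_KEYWORDS = {
    "termination": ("terminate", "termination", "notice"),
    "leave": ("leave", "vacation"),
}


def assign_topics(segment: str) -> list[str]:
    """Return every topic whose keywords appear in the segment (case-insensitive)."""
    text = segment.lower()
    return [
        topic
        for topic, keywords in TOPIC_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


def build_topic_chunks(documents: dict[str, list[str]]) -> dict[str, str]:
    """Group segments from all documents by topic and join them into one chunk per topic."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for doc_id, segments in documents.items():
        for segment in segments:
            for topic in assign_topics(segment):
                # Keep the source document id so each chunk stays citable.
                grouped[topic].append(f"[{doc_id}] {segment}")
    # Concatenation is a stand-in for the synthesis step described in the summary.
    return {topic: "\n".join(parts) for topic, parts in grouped.items()}


# Made-up sample documents, already split into segments.
docs = {
    "law_a": ["An employer must give 30 days notice.", "Annual leave is 30 days."],
    "law_b": ["Termination without notice requires cause."],
}
chunks = build_topic_chunks(docs)
print(chunks["termination"])
```

In this toy example the "termination" chunk gathers segments from both documents, so a retriever would return related material from separate sources as a single unit instead of as scattered fragments.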

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv ·

LongShOTBench is a new benchmark for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. It addresses the limitations of existing datasets by combining temporal length with multimodal richness and by using human-validated samples. The paper also presents LongShOTAgent, an agentic system for analyzing long videos. Together, the benchmark and the agent highlight the challenges that long videos pose for state-of-the-art multimodal large language models (MLLMs).