GCC AI Research

Results for "NLP framework"

An NLP-Driven Framework for Curriculum-Labor Market Alignment: Schema-Constrained LLM Extraction, ESCO-Anchored Semantic Matching, and Multi-Dimensional Gap Quantification

arXiv

Researchers proposed a four-stage NLP framework combining schema-constrained LLM extraction, Sentence-BERT (SBERT) alignment with ESCO, an adjudication protocol, and a verification mechanism for curriculum-labor market alignment. The framework was instantiated for the ABET-accredited BSc Computer Science program at the United Arab Emirates University (UAEU), extracting 400 competency records from the study plan and aligning them with 30 job postings. The extractor achieved a Cohen's kappa of 0.79 on the skill slot and surfaced interpretable supply-demand gaps in general, transversal, algorithms, and software engineering skills, with a minimal gap in AI and data science. Why it matters: This framework provides a robust, NLP-driven method to identify crucial skill gaps in higher education curricula, directly supporting quality assurance and workforce development initiatives in the region.
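For readers unfamiliar with the agreement statistic quoted above, here is a minimal sketch of how Cohen's kappa is computed from two parallel label lists. The function name `cohens_kappa` and the sample labels are hypothetical and are not taken from the paper; the summary does not say which two sets of labels were compared, so the example assumes an extractor checked against a human annotator.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical skill labels for ten records (illustrative data only).
extractor = ["python", "sql", "python", "ml", "sql", "ml", "python", "sql", "ml", "python"]
annotator = ["python", "sql", "python", "ml", "ml", "ml", "python", "sql", "sql", "python"]
kappa = cohens_kappa(extractor, annotator)
print(f"kappa = {kappa:.2f}")
```

In this toy data the raters agree on 8 of 10 items and chance agreement is 0.34, so kappa is (0.80 - 0.34) / (1 - 0.34), about 0.70.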

Modeling Text as a Living Object

MBZUAI

The InterText project, funded by the European Research Council, aims to advance NLP by developing a framework for modeling fine-grained relationships between texts. This approach enables tracing the origin and evolution of texts and ideas. Iryna Gurevych from the Technical University of Darmstadt presented the intertextual approach to NLP, covering data modeling, representation learning, and practical applications. Why it matters: This research could enable a new generation of AI tools for text work and critical reading, with potential uses in collaborative knowledge construction and document revision assistance.

LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking

arXiv

Researchers have introduced LLMeBench, a customizable framework for evaluating large language models (LLMs) across diverse NLP tasks and languages. The framework features generic dataset loaders, multiple model providers, and pre-implemented evaluation metrics, supporting in-context learning with zero- and few-shot settings. LLMeBench was tested on 31 unique NLP tasks using 53 datasets across 90 experimental setups with 296K data points, and the code has been open-sourced. Why it matters: The framework's flexibility and ease of customization should accelerate LLM benchmarking, especially for Arabic and other low-resource languages.
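To illustrate the difference between the zero-shot and few-shot settings mentioned above, the following is a minimal sketch of in-context prompt construction. It does not use or reproduce LLMeBench's actual API; `build_prompt`, its parameters, and the sample sentiment data are hypothetical names chosen for illustration only.

```python
def build_prompt(instruction, examples, query, n_shots=0):
    """Assemble a prompt with n_shots labeled demonstrations before the query.

    n_shots=0 gives a zero-shot prompt (instruction and query only);
    n_shots>0 prepends that many (text, label) pairs as in-context examples.
    """
    if n_shots < 0:
        raise ValueError("n_shots must be non-negative")
    parts = [instruction.strip()]
    for text, label in examples[:n_shots]:
        parts.append(f"Input: {text}\nLabel: {label}")
    # The query is left unlabeled so the model completes the final "Label:".
    parts.append(f"Input: {query}\nLabel:")
    return "\n\n".join(parts)


# Hypothetical sentiment task data (illustrative only).
instruction = "Classify the sentiment of the input as positive or negative."
examples = [
    ("The service was excellent.", "positive"),
    ("The food arrived cold.", "negative"),
    ("I would gladly come back.", "positive"),
]
query = "The room was noisy all night."

zero_shot = build_prompt(instruction, examples, query)
few_shot = build_prompt(instruction, examples, query, n_shots=2)
print(few_shot)
```

The only difference between the two settings is how many labeled demonstrations precede the query, so a benchmarking run can switch between them by changing a single parameter.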

Uncovering Temporal Framing in the News

arXiv

Researchers from MBZUAI have proposed a new taxonomy of eight temporal frames and studied their persuasive use in news discourse. They built a multilingual dataset through expert annotation of 458 English and German news articles, which yielded over 2,000 temporally framed sentences and approximately 3,000 annotations. Their experiments demonstrated that temporal framing is learnable at the sentence level, with supervised models significantly outperforming zero-shot classification approaches. Why it matters: This research provides a dataset and methodology for understanding how time-related language shapes interpretation in news, which can advance NLP for media analysis and may help counter disinformation.