A new paper coauthored by researchers at The University of Melbourne and MBZUAI explores disagreement in human annotation for AI training. The paper treats disagreement, known as human label variation (HLV), as signal rather than noise, and proposes new evaluation metrics based on fuzzy set theory. These metrics adapt accuracy and F-score to cases where multiple labels may plausibly apply, evaluating model outputs against the full distribution of human judgments rather than a single gold label. Why it matters: This research addresses a key challenge in NLP by accounting for the inherent ambiguity of human language, potentially leading to more robust and human-aligned AI systems.
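A minimal sketch of how a fuzzy-set metric of this kind can be computed, assuming each label's membership degree is simply the fraction of annotators who chose it; the membership rule and function names here are illustrative assumptions, not the paper's exact definitions:

```python
# Illustrative fuzzy-set accuracy under human label variation (HLV).
# Assumption: a label's membership degree is the fraction of annotators
# who chose it, so a prediction earns partial credit for any label that
# some annotators found plausible. The paper's metrics may be defined
# differently; this only conveys the general idea.
from collections import Counter

def membership(annotations):
    """Map each label to its membership degree in [0, 1]."""
    counts = Counter(annotations)
    return {label: c / len(annotations) for label, c in counts.items()}

def fuzzy_accuracy(predictions, annotation_sets):
    """Average the membership degree of each predicted label."""
    credits = [
        membership(annots).get(pred, 0.0)
        for pred, annots in zip(predictions, annotation_sets)
    ]
    return sum(credits) / len(credits)

# Example: 3 of 4 annotators labeled an utterance "sarcastic", 1 said "neutral".
annotation_sets = [["sarcastic", "sarcastic", "sarcastic", "neutral"]]
print(fuzzy_accuracy(["sarcastic"], annotation_sets))  # 0.75
print(fuzzy_accuracy(["neutral"], annotation_sets))    # 0.25, not 0
```

Under hard accuracy the "neutral" prediction would score zero; here it keeps the partial credit that the minority of annotators gave it, which is the sense in which the metrics align model output with the distribution of human judgments.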
Researchers from MBZUAI developed "uncertainty quantification heads" (UQ heads) to detect hallucinations in language models by probing internal states and estimating the credibility of generated text. UQ heads leverage attention maps and logits to identify potential hallucinations without altering the model's generation process or relying on external knowledge. The team found that UQ heads achieved state-of-the-art performance in claim-level hallucination detection across different domains and languages. Why it matters: This approach offers a more efficient and accurate method for identifying hallucinations, improving the reliability and trustworthiness of language models in various applications.
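A rough sketch of the idea, assuming per-claim feature vectors have already been pooled from the frozen model's internals (attention statistics, token logits, and the like); the actual UQ-head architecture and training recipe in the paper may differ. The key point is that only a small head is trained, so generation itself is untouched:

```python
# Hypothetical uncertainty-quantification probe over a frozen LM's internals.
# The feature extraction, architecture, and names here are assumptions made
# for illustration, not the paper's exact design.
import torch
import torch.nn as nn

class UQHead(nn.Module):
    """A lightweight probe scoring claims from internal-state features."""
    def __init__(self, feature_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, claim_features: torch.Tensor) -> torch.Tensor:
        # claim_features: (batch, feature_dim), pooled over each claim's tokens
        return torch.sigmoid(self.net(claim_features)).squeeze(-1)

# Hypothetical training step: features stand in for attention/logit statistics
# extracted from the frozen model; labels come from claim-level annotations.
head = UQHead(feature_dim=32)
features = torch.randn(8, 32)               # placeholder extracted features
labels = torch.randint(0, 2, (8,)).float()  # 1 = hallucinated claim
loss = nn.functional.binary_cross_entropy(head(features), labels)
loss.backward()  # only the head's parameters receive gradients
```

Because the underlying model stays frozen and no external knowledge source is queried, a head like this adds little overhead at inference time, which is what makes the approach efficient.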
MBZUAI researchers have developed a new method called "Byzantine antidote" (Bant) to defend federated learning systems against Byzantine attacks, in which malicious nodes intentionally disrupt the training process. Bant uses trust scores and a trial function to dynamically filter out corrupted updates, even when a majority of nodes are compromised. The system can identify poorly labeled data while still training models effectively, addressing both honest mistakes and deliberate sabotage. The work was presented at the 40th Annual AAAI Conference on Artificial Intelligence. Why it matters: This research enhances the reliability and security of federated learning in sensitive sectors like healthcare and finance, enabling safer collaborative AI development.
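A toy sketch of trust-score filtering in this spirit, where the trial function is stood in by the loss on a small server-held trial batch and a client's trust decays whenever its update fails the trial; the trust-update rule and names here are illustrative assumptions, not Bant's actual algorithm:

```python
# Illustrative trust-score aggregation for federated learning.
# Assumption: the server holds a small trial batch and uses the change in
# trial loss as its trial function. Bant's real trial function and trust
# dynamics may differ; this only sketches the filtering idea.
import numpy as np

def trial_loss(weights, X_val, y_val):
    """Hypothetical trial function: squared error of a linear model."""
    return float(np.mean((X_val @ weights - y_val) ** 2))

def trust_filtered_aggregate(global_w, client_updates, trust, X_val, y_val, lr=0.5):
    """Down-weight clients whose updates worsen the trial loss."""
    base = trial_loss(global_w, X_val, y_val)
    for i, delta in enumerate(client_updates):
        improved = trial_loss(global_w + delta, X_val, y_val) < base
        trust[i] = (1 - lr) * trust[i] + lr * float(improved)  # decaying trust
    weights = np.array(trust)
    if weights.sum() == 0:
        return global_w, trust  # no update trusted this round
    weights /= weights.sum()
    step = sum(w * delta for w, delta in zip(weights, client_updates))
    return global_w + step, trust

# Toy round: one honest client, one Byzantine client sending a huge bad update.
rng = np.random.default_rng(0)
X_val, true_w = rng.normal(size=(16, 3)), np.array([1.0, -2.0, 0.5])
y_val = X_val @ true_w
updates = [0.5 * true_w, rng.normal(size=3) * 100.0]
global_w, trust = trust_filtered_aggregate(np.zeros(3), updates, [1.0, 1.0], X_val, y_val)
print(trust)  # honest client keeps trust 1.0; the attacker's trust drops
```

A compromised client whose updates repeatedly fail the trial step is driven toward zero aggregation weight over rounds, which is what lets filtering of this kind keep working even when honest clients are outnumbered.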
MBZUAI Professor Preslav Nakov is researching methods to identify and combat harmful uses of large language models in generating disinformation. He notes that disinformation differs from mere fake news in that it is weaponized: crafted with the intent to persuade, not merely to misinform. His research focuses on the linguistic differences between human-written and machine-generated disinformation, such as the use of rhetorical devices in human-authored propaganda. Why it matters: As AI-generated content becomes more prevalent, understanding and mitigating its potential for spreading disinformation is critical for maintaining trust and integrity in information ecosystems, especially during major election cycles.