The paper introduces FanarGuard, a bilingual moderation filter for Arabic and English language models that considers both safety and cultural alignment. A dataset of 468K prompt-response pairs was created and scored by LLM judges on harmlessness and cultural awareness to train the filter. The first benchmark targeting Arabic cultural contexts was developed to evaluate cultural alignment. Why it matters: FanarGuard advances context-sensitive AI safeguards by integrating cultural awareness into content moderation, addressing a critical gap in current alignment techniques.
Thamar Solorio from the University of Houston presented preliminary work at MBZUAI on multimodal representation learning for detecting objectionable content in videos. The research investigates two multimodal pretraining mechanisms, finding contrastive learning more effective than unimodal representation prediction. The study also assesses how useful common multimodal corpora are for this task. Why it matters: This research contributes to the development of AI techniques for content moderation, an important issue for online platforms in the Middle East and globally.
Zeerak Talat, an independent scholar, gave a talk at MBZUAI on automated content moderation and the impacts of machine learning on society. Talat's research considers how machine learning interacts with and impacts societies through content moderation technologies, drawing on NLP, privacy-preserving machine learning, science and technology studies, decolonial studies, and media studies. The talk highlighted research areas that offer productive directions at the intersection of machine learning and society. Why it matters: The talk contributes to the discussion of ethical AI development and deployment in the region, particularly regarding content moderation and its societal impacts.
MBZUAI Professor Preslav Nakov discusses Meta's shift to crowdsourced fact-checking via Community Notes, replacing third-party fact-checkers. Community Notes, originating from Twitter's Birdwatch, allows users to add context to potentially misleading posts, visible after community consensus. Research indicates this approach can reduce misinformation and lead to post retractions. Why it matters: The adoption of crowdsourcing for content moderation by major platforms like Meta could significantly impact online information quality for billions of users.
A new content improvement system has been developed to address issues of randomness and incorrectness in text generated by deep learning models like GPT-3. The system uses text mining to identify correct sentences and employs syntactic/semantic generalization to substitute problematic elements. The system can substantially improve the factual correctness and meaningfulness of raw content. Why it matters: Improving the quality of automatically generated content is crucial for ensuring reliability and trustworthiness across various AI applications.
A new methodology emulates the criteria used by professional fact-checkers to assess the factuality and bias of news outlets using LLMs. The approach prompts the models with questions derived from fact-checking criteria, then aggregates their responses into predictions. Experiments demonstrate improvements over baselines, with error analysis by media popularity and region; the dataset and code are released at https://github.com/mbzuai-nlp/llm-media-profiling.
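The elicit-and-aggregate step can be sketched as follows. The criteria list, the `query_llm` stub, and the canned scores are hypothetical placeholders for illustration, not the paper's actual prompts, model, or aggregation rule:

```python
# Sketch of eliciting per-criterion LLM judgments about a news outlet
# and aggregating them into a factuality prediction. All names, criteria,
# and scores below are illustrative assumptions, not the released code.
from statistics import mean

CRITERIA = [  # hypothetical examples of fact-checker-style criteria
    "Does the outlet clearly separate news from opinion?",
    "Does the outlet issue corrections for errors?",
    "Does the outlet disclose its ownership and funding?",
]

def query_llm(outlet: str, criterion: str) -> float:
    """Placeholder for an LLM call: in practice this would prompt a model
    to rate the outlet on the criterion (0 = fails, 1 = meets)."""
    canned = {  # stubbed responses so the sketch runs deterministically
        ("ExampleNews", CRITERIA[0]): 1.0,
        ("ExampleNews", CRITERIA[1]): 1.0,
        ("ExampleNews", CRITERIA[2]): 0.0,
    }
    return canned[(outlet, criterion)]

def profile_outlet(outlet: str, threshold: float = 0.5):
    """Aggregate per-criterion scores (here, by mean) into a label."""
    scores = [query_llm(outlet, c) for c in CRITERIA]
    agg = mean(scores)
    label = "high-factuality" if agg >= threshold else "low-factuality"
    return label, agg

label, score = profile_outlet("ExampleNews")
print(label, round(score, 2))
```

Averaging is just one possible aggregation; majority voting or a learned combiner over the per-criterion responses would fit the same skeleton.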
This paper introduces two shared tasks for abusive and threatening language detection in Urdu, a low-resource language with over 170 million speakers. The tasks involve binary classification of Urdu tweets as Abusive/Non-Abusive and Threatening/Non-Threatening, respectively. Datasets of 2,400/6,000 training tweets and 1,100/3,950 testing tweets were created and manually annotated, and logistic regression and BERT-based baselines were provided. Twenty-one teams participated; the best systems achieved F1-scores of 0.880 on the abusive-language task and 0.545 on the threatening-language task, with m-BERT performing best.
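For context on how the reported scores rank systems, the F1-score combines precision and recall for the positive class. A minimal computation for a binary task like Abusive/Non-Abusive, using made-up labels rather than the shared-task data, looks like this:

```python
# Computing F1 for binary classification, the metric used to rank
# shared-task systems. The gold labels and predictions below are
# illustrative toy data, not tweets from the Urdu datasets.
def f1_score(gold, pred, positive="Abusive"):
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:  # no true positives means precision or recall is zero
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = ["Abusive", "Abusive", "Non-Abusive", "Non-Abusive", "Abusive"]
pred = ["Abusive", "Non-Abusive", "Non-Abusive", "Abusive", "Abusive"]
# tp=2, fp=1, fn=1 -> precision = recall = 2/3, F1 ≈ 0.667
print(round(f1_score(gold, pred), 3))
```

Because F1 ignores true negatives, it rewards systems that actually find the rare abusive or threatening tweets rather than predicting the majority class.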