Results for "verification"

Is that person real? Zoom rolls out new tool to verify meeting participants - Gulf News

Gulf News · Apr 17

Gulf News reports that Zoom is rolling out a new tool to verify the identity of participants in online meetings, with the aim of improving the security and authenticity of interactions on its platform. The report's title does not say which technologies, such as AI or computer vision, the verification relies on. Why it matters: This feature could improve trust and security in virtual communication for businesses and individuals across the Middle East.

Fact-Checking Complex Claims with Program-Guided Reasoning

arXiv · May 22

This paper introduces ProgramFC, a fact-checking model that decomposes complex claims into simpler sub-tasks using a library of functions. An LLM generates a reasoning program for each claim, and the program is executed by delegating its sub-tasks to those functions, which improves explainability and data efficiency. In experiments on fact-checking datasets, ProgramFC outperforms baseline methods. The code and data are publicly available.
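The decompose-then-execute idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names `Question` and `Verify`, the step format, and the lookup tables are hypothetical stand-ins, and the program is written by hand where the summary describes an LLM generating it. It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# A step is (output variable, function name, argument template).
# Templates may reference earlier outputs with {variable} placeholders.
Step = Tuple[str, str, str]

# Hypothetical toy data standing in for the delegated sub-task solvers.
TOY_ANSWERS: Dict[str, str] = {
    "Who directed Inception?": "Christopher Nolan",
}
TOY_TRUE_CLAIMS = {
    "Christopher Nolan was born in London.",
}


def answer_question(question: str) -> str:
    """Sub-task solver: answer a simple question (here, a dictionary lookup)."""
    return TOY_ANSWERS.get(question, "unknown")


def verify_claim(claim: str) -> bool:
    """Sub-task solver: check a simple claim (here, a set membership test)."""
    return claim in TOY_TRUE_CLAIMS


# The "library of functions" that a reasoning program may call (names assumed).
FUNCTIONS: Dict[str, Callable[[str], object]] = {
    "Question": answer_question,
    "Verify": verify_claim,
}


def run_program(steps: List[Step]) -> Dict[str, object]:
    """Execute steps in order, substituting earlier results into later arguments."""
    env: Dict[str, object] = {}
    for out_var, fn_name, template in steps:
        argument = template.format(**env)
        env[out_var] = FUNCTIONS[fn_name](argument)
    return env


# Hand-written program for the complex claim:
# "The director of Inception was born in London."
PROGRAM: List[Step] = [
    ("director", "Question", "Who directed Inception?"),
    ("fact_1", "Verify", "{director} was born in London."),
]

if __name__ == "__main__":
    result = run_program(PROGRAM)
    print(result["director"], result["fact_1"])
```

Because each intermediate result is stored under a named variable, the execution trace doubles as an explanation of the final verdict, which is the kind of explainability the summary refers to.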

Truth from uncertainty: using AI’s internal signals to spot hallucinations

MBZUAI

Researchers from MBZUAI developed "uncertainty quantification heads" (UQ heads) to detect hallucinations in language models by probing internal states and estimating the credibility of generated text. UQ heads leverage attention maps and logits to identify potential hallucinations without altering the model's generation process or relying on external knowledge. The team found that UQ heads achieved state-of-the-art performance in claim-level hallucination detection across different domains and languages. Why it matters: This approach offers a more efficient and accurate method for identifying hallucinations, improving the reliability and trustworthiness of language models in various applications.

CAPTCHAs aren’t just annoying, they’re a reality check for AI agents

MBZUAI

MBZUAI researchers created Open CaptchaWorld, a new benchmark that tests how well AI agents solve CAPTCHAs. The benchmark includes 20 modern CAPTCHA types that require perception, reasoning, and interactive actions within a browser. Humans achieve 93.3% accuracy on the benchmark, while the best AI agent reaches only 40%. Why it matters: This research highlights a critical gap in current AI agent capabilities, as CAPTCHAs are gatekeepers to high-value web actions like e-commerce and secure logins.

Making LLM accuracy a matter of fact

MBZUAI

MBZUAI NLP master's graduate Hasan Iqbal developed OpenFactCheck, a framework for fact-checking and evaluating the factual accuracy of large language models. The framework consists of three modules: ResponseEvaluator, LLMEvaluator, and CheckerEvaluator. OpenFactCheck was published at EMNLP 2024 and accepted at NAACL 2025 and COLING 2025, with Iqbal playing an active role at COLING in Abu Dhabi. Why it matters: The development of automated fact-checking frameworks is crucial for ensuring the reliability and trustworthiness of information generated by increasingly prevalent LLMs, especially in the Arabic-speaking world.

Trustworthiness Assurance for Autonomous Software Systems in the AI Era

MBZUAI

Dr. Youcheng Sun from the University of Manchester presented on ensuring the trustworthiness of AI systems using formal verification, software testing, and explainable AI. He discussed applying these techniques to challenges like copyright protection for AI models. Dr. Sun's research has been funded by organizations including Google, Ethereum Foundation, and the UK’s Defence Science and Technology Laboratory. Why it matters: As AI adoption grows in the GCC, ensuring the safety, dependability, and trustworthiness of these systems is crucial for public trust and responsible innovation.