Search

Results for "Tamarin"

Formal Methods for Modern Payment Protocols

MBZUAI · Invalid Date

Researchers at ETH Zurich have formalized models of the EMV payment protocol using the Tamarin model checker. They discovered flaws allowing attackers to bypass PIN requirements for high-value purchases on EMV cards like Mastercard and Visa. The team also collaborated with an EMV consortium member to verify the improved EMV Kernel C-8 protocol. Why it matters: This research highlights the importance of formal methods in identifying critical vulnerabilities in widely used payment systems, potentially impacting financial security for consumers in the GCC region and worldwide.

Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models

arXiv · Nov 28

Researchers at MBZUAI have introduced Video-R2, a reinforcement learning approach to improve the consistency and visual grounding of reasoning in multimodal language models. Video-R2 combines timestamp-aware supervised fine-tuning with Group Relative Policy Optimization (GRPO) guided by a Temporal Alignment Reward (TAR). The model demonstrates higher Think Answer Consistency (TAC), Video Attention Score (VAS), and accuracy across multiple benchmarks, showing improved temporal alignment and reasoning coherence for video understanding.

Addressing NLP problems in low resource settings

MBZUAI · Invalid Date

Thamar Solorio from the University of Houston will discuss machine learning approaches for spontaneous human language processing. The talk will cover adapting multilingual transformers to code-switching data and using data augmentation for domain adaptation in sequence labeling tasks. Solorio will also provide an overview of other research projects at the RiTUAL lab, focusing on the scarcity of labeled data. Why it matters: This presentation addresses key challenges in Arabic NLP related to data scarcity, which is a persistent obstacle in developing effective AI applications for the region.

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv · Dec 18

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length and multimodal richness, using human-validated samples. LongShOTAgent, an agentic system, is also presented for analyzing long videos, with both the benchmark and agent demonstrating the challenges faced by state-of-the-art MLLMs.