GCC AI Research

Results for "jailbreak"

How jailbreak attacks work and a new way to stop them

MBZUAI ·

Researchers at MBZUAI and other institutions have published a study at ACL 2024 investigating how jailbreak attacks work on LLMs. The study used a dataset of 30,000 prompts and non-linear probing to interpret the effects of jailbreak attacks, finding that existing interpretations were inadequate. The researchers propose a new approach to improve LLM safety against such attacks by identifying the layers in neural networks where the behavior occurs. Why it matters: Understanding and mitigating jailbreak attacks is crucial for ensuring the responsible and secure deployment of LLMs, particularly in the Arabic-speaking world where these models are increasingly being used.

CRC Seminar Series - Cristofaro Mune, Niek Timmers

TII ·

Cristofaro Mune and Niek Timmers presented a seminar on bypassing unbreakable crypto using fault injection on Espressif ESP32 chips. The presentation detailed how the hardware-based Encrypted Secure Boot implementation of the ESP32 SoC was bypassed using a single EM glitch, without knowing the decryption key. This attack exploited multiple hardware vulnerabilities, enabling arbitrary code execution and extraction of plain-text data from external flash. Why it matters: The research highlights critical security vulnerabilities in embedded systems and the potential for fault injection attacks to bypass secure boot mechanisms, necessitating stronger hardware-level security measures.

Your voice can jailbreak a speech model – here’s how to stop it, without retraining

MBZUAI ·

A new paper from MBZUAI demonstrates that state-of-the-art speech models can be easily jailbroken using audio perturbations to generate harmful content, achieving success rates of 76-93% on models like Qwen2-Audio and LLaMA-Omni. The researchers adapted projected gradient descent (PGD) to the audio domain to optimize waveforms that push the model towards harmful responses. They propose a defense mechanism based on post-hoc activation patching that hardens models at inference time without retraining. Why it matters: This research highlights a critical vulnerability in speech-based LLMs and offers a practical solution, contributing to the development of more secure and trustworthy AI systems in the region and globally.
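For readers unfamiliar with PGD, the loop below is a minimal sketch of the general technique on a raw waveform, not the paper's implementation. The toy logistic "model", the weight vector `w`, and the `eps`, `step`, and `iters` values are all illustrative assumptions; a real attack would take gradients through the speech model itself.

```python
import numpy as np

def pgd_attack(waveform, grad_fn, eps=0.01, step=0.002, iters=20):
    """Targeted L-infinity PGD: lower a target loss while keeping the
    perturbation within eps of the clean waveform (hypothetical defaults)."""
    x_adv = waveform.copy()
    for _ in range(iters):
        g = grad_fn(x_adv)
        # Signed gradient step that decreases the target loss.
        x_adv = x_adv - step * np.sign(g)
        # Project back into the eps-ball around the clean audio.
        x_adv = np.clip(x_adv, waveform - eps, waveform + eps)
        # Keep samples inside the valid audio range.
        x_adv = np.clip(x_adv, -1.0, 1.0)
    return x_adv

# Toy stand-in for a model: logistic score over the samples (assumption).
rng = np.random.default_rng(0)
w = rng.normal(size=1600)

def target_loss(x):
    # -log(sigmoid(w . x)): low when the "target" response is likely.
    return np.logaddexp(0.0, -(w @ x))

def target_loss_grad(x):
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    return -(1.0 - p) * w

clean = 0.1 * np.sin(np.linspace(0.0, 40.0 * np.pi, 1600))
adv = pgd_attack(clean, target_loss_grad)
```

The projection step is what distinguishes PGD from plain gradient descent: every update is clipped so the adversarial audio stays within a small, fixed distance of the original.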

CRC Seminar Series - Conor McMenamin

TII ·

Conor McMenamin from Universitat Pompeu Fabra presented a seminar on State Machine Replication (SMR) without honest participants. The talk covered the limitations of current SMR protocols and introduced the ByRa model, a framework for player characterization free of honest participants. He then described FAIRSICAL, a sandbox SMR protocol, and discussed how the ideas could be extended to real-world protocols, with a focus on blockchains and cryptocurrencies. Why it matters: This research on SMR protocols and their incentive compatibility could lead to more robust and secure blockchain technologies in the region.

How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models

arXiv ·

A study compared the vulnerability of C programs generated by nine state-of-the-art Large Language Models (LLMs) using a zero-shot prompt. The researchers introduced FormAI-v2, a dataset of 331,000 C programs generated by these LLMs, and found that at least 62.07% of the generated programs contained vulnerabilities, detected via formal verification. The research highlights the need for risk assessment and validation when deploying LLM-generated code in production environments.

K2 Think Hackathon: could your idea turn into impact in 48 hours?

MBZUAI ·

MBZUAI is hosting the K2 Think Hackathon, challenging participants to develop applications using the K2 Think reasoning model developed with G42. The hackathon involves a global idea call followed by a 48-hour build challenge in Abu Dhabi for the top 10 teams. The winning feature will be integrated into the K2 Think application. Why it matters: This hackathon provides a valuable opportunity to test and shape a cutting-edge AI model, potentially leading to innovative applications in various sectors like finance and education within the UAE and beyond.

CRC Team Places 6th in Global Cyber Security Competition

TII ·

A team from the Cryptography Research Center (CRC) secured 6th place out of 210 teams in the 'Donjon CTF 2021: Capture the Fortress' cybersecurity competition. The competition featured jeopardy-style challenges covering cryptography, reverse engineering, and hardware security. The CRC team participated to improve visibility and assess team capabilities, particularly in hardware security. Why it matters: The CRC team's strong performance highlights the growing cybersecurity expertise in the UAE and its attractiveness for talent in this field.

Can AI stop the next pandemic? Scientists unveil vaccine breakthrough - Gulf News

Gulf News ·

No summary is available: the source article's content was empty, so the AI application, the reported vaccine breakthrough, and the researchers involved cannot be described.