How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models

arXiv · April 29, 2024 · Significant research

Summary

A study compared the vulnerability of C programs generated by nine state-of-the-art Large Language Models (LLMs) using a zero-shot prompt. The researchers introduced FormAI-v2, a dataset of 331,000 C programs generated by these LLMs, and found that at least 62.07% of the generated programs contained vulnerabilities, detected via formal verification. The research highlights the need for risk assessment and validation when deploying LLM-generated code in production environments.

Keywords

Large Language Models · LLMs · code generation · vulnerabilities · security

Read original article →

Get the weekly digest

Top AI stories from the GCC region, every week.

LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models

arXiv · Oct 7

The paper introduces LLMEffiChecker, a tool to test the computational efficiency robustness of LLMs by identifying vulnerabilities that can significantly degrade performance. LLMEffiChecker uses both white-box (gradient-guided perturbation) and black-box (causal inference-based perturbation) methods to delay the generation of the end-of-sequence token. Experiments on nine public LLMs demonstrate that LLMEffiChecker can substantially increase response latency and energy consumption with minimal input perturbations.

How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models

Summary

Keywords

Related

LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models