Dr. Thierry Lestable, Executive Director of AIDRC, was interviewed at Black Hat USA 2022, discussing AIDRC's projects in Cyber Reasoning Systems (CRS) and infrastructure security. He highlighted the growing role of AI in system design, advances in LLMs, and the impact of quantum computing on cybersecurity, and emphasized AIDRC's commitment to developing cybersecurity systems and software. Why it matters: The interview showcases AIDRC's contributions to cybersecurity research and development, highlighting the UAE's growing role in addressing global cybersecurity challenges through AI and advanced technologies.
MBZUAI researchers have developed K2 Think, an open-source AI reasoning system for interpretable energy decisions. K2 Think uses long chain-of-thought supervised fine-tuning and reinforcement learning to improve accuracy on multi-step reasoning in complex energy problems. The system breaks down challenges into smaller, auditable steps and uses test-time scaling for real-time adaptation. Why it matters: The open-source nature of K2 Think promotes transparency, trust, and compliance in critical energy environments while allowing secure deployment on sovereign infrastructure.
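One common form of test-time scaling is self-consistency: sample several reasoning chains for the same question and take a majority vote over their final answers, which also yields a rough confidence signal. The sketch below is a minimal illustration of that idea, not K2 Think's actual implementation; the `generate` callable stands in for any model that returns a final answer from one sampled chain of thought.

```python
from collections import Counter

def self_consistency(generate, question, n=8):
    """Sample n independent reasoning chains and majority-vote on the answer.

    generate: callable mapping a question to one sampled final answer.
    Returns the most common answer and the fraction of chains agreeing with it.
    """
    answers = [generate(question) for _ in range(n)]
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / n
```

The agreement fraction is what makes the decision auditable: a low value flags questions where the model's reasoning chains diverge and a human should review the output.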
Liangming Pan from UCSB presented research on building reliable generative AI agents by integrating symbolic representations with LLMs. The neuro-symbolic strategy combines the flexibility of language models with precise knowledge representation and verifiable reasoning. The work covers Logic-LM, ProgramFC, and learning from automated feedback, aiming to address LLM limitations in complex reasoning tasks. Why it matters: Improving the reliability of LLMs is crucial for high-stakes applications in finance, medicine, and law within the region and globally.
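In a Logic-LM-style pipeline, the LLM handles the flexible part (translating a natural-language question into symbolic facts and rules) while a deterministic solver handles the verifiable part (deriving conclusions). The solver half can be as simple as forward chaining over Horn rules; the toy facts and rules below are invented for illustration and do not come from the presented work.

```python
def forward_chain(facts, rules):
    """Derive everything entailed by Horn rules of the form (body, head).

    facts: set of ground atoms; rules: list of (list_of_body_atoms, head_atom).
    Applies rules until no new fact can be derived (a fixed point).
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived
```

Because every derived fact is justified by an explicit rule application, the reasoning trace can be checked step by step, which is the reliability property the neuro-symbolic strategy is after.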
The paper introduces a web-based expert system called RCSES for civil service regulations in Saudi Arabia. The system covers 17 regulations and uses XML for knowledge representation and ASP.NET for rule-based inference. RCSES was validated by domain experts and technical users, and compared favorably to other web-based expert systems.
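The general pattern of pairing an XML rule base with a rule-based inference engine can be sketched in a few lines: each rule carries a conclusion and a set of conditions, and the engine returns every conclusion whose conditions all hold for a given case. The rule and field names below are hypothetical examples, not RCSES's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule base; RCSES's real XML schema is not published in the summary.
RULEBASE = """
<rules>
  <rule conclusion="eligible_for_leave">
    <condition fact="service_years" op="ge" value="2"/>
    <condition fact="on_probation" op="eq" value="0"/>
  </rule>
</rules>
"""

OPS = {"ge": lambda a, b: a >= b, "eq": lambda a, b: a == b}

def infer(xml_rules, case):
    """Return the conclusions of every rule whose conditions all hold."""
    conclusions = []
    for rule in ET.fromstring(xml_rules).findall("rule"):
        if all(OPS[c.get("op")](case[c.get("fact")], float(c.get("value")))
               for c in rule.findall("condition")):
            conclusions.append(rule.get("conclusion"))
    return conclusions
```

Keeping the rules in XML rather than in code is what lets domain experts review and update the regulations without touching the inference engine.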
MBZUAI, G42, and Cerebras Systems have launched K2 Think V2, a 70-billion parameter reasoning system built on the K2-V2 base model. K2 Think V2 is fully open-source, from pre-training data to post-training alignment, ensuring transparency and reproducibility. It achieves leading results on complex reasoning benchmarks like AIME2025 and GPQA-Diamond. Why it matters: This release marks a significant advancement in the UAE's AI capabilities, demonstrating leadership in building globally accessible and fully sovereign AI systems focused on reasoning.
MBZUAI and G42 have launched K2 Think, an open-source AI system for advanced reasoning with 32 billion parameters. It outperforms reasoning models 20 times larger, employing techniques like long chain-of-thought fine-tuning and reinforcement learning. K2 Think will be available on Cerebras' platform, achieving 2,000 tokens per second, and ranks highly in math performance. Why it matters: This launch positions the UAE as a leader in AI innovation through public-private partnerships and open-source contributions, demonstrating that efficient AI design can rival larger models.
Niket Tandon from the Allen Institute for AI presented a talk at MBZUAI on enabling large language models to focus on human needs and continuously learn from interactions. He proposed a memory architecture inspired by the theory of recursive reminding to guide models in avoiding past errors. The talk addressed whom to ask, what to ask, when to ask, and how to apply the guidance obtained. Why it matters: The research explores how to align LLMs with human feedback, a key challenge for practical and ethical AI deployment.
MBZUAI researchers at the Institute of Foundation Models (IFM) investigated the role of reinforcement learning (RL) in improving reasoning abilities of language models. Their study found that RL acts as an 'elicitor' for reasoning in domains frequently encountered during pre-training (e.g., math, coding), while genuinely teaching new reasoning skills in underrepresented domains (e.g., logic, simulations). To support their analysis, they created a new dataset called GURU containing 92,000 examples across six domains. Why it matters: This research clarifies the impact of reinforcement learning on language model reasoning, paving the way for developing models with more generalizable reasoning abilities across diverse domains, an important direction for more capable AI systems.