Results for "Reasoning System"

Intelligent, sovereign, explainable energy decisions: powered by open-source AI reasoning

MBZUAI

MBZUAI researchers have developed K2 Think, an open-source AI reasoning system for interpretable energy decisions. K2 Think uses long chain-of-thought supervised fine-tuning and reinforcement learning to improve accuracy on multi-step reasoning in complex energy problems. The system breaks down challenges into smaller, auditable steps and uses test-time scaling for real-time adaptation. Why it matters: The open-source nature of K2 Think promotes transparency, trust, and compliance in critical energy environments while allowing secure deployment on sovereign infrastructure.

Empowering Large Language Models with Reliable Reasoning

MBZUAI

Liangming Pan from UCSB presented research on building reliable generative AI agents by integrating symbolic representations with LLMs. The neuro-symbolic strategy combines the flexibility of language models with precise knowledge representation and verifiable reasoning. The work covers Logic-LM, ProgramFC, and learning from automated feedback, aiming to address LLM limitations in complex reasoning tasks. Why it matters: Improving the reliability of LLMs is crucial for high-stakes applications in finance, medicine, and law within the region and globally.

MBZUAI and G42 Launch K2 Think: A Leading Open-Source System for Advanced AI Reasoning

MBZUAI

MBZUAI and G42 have launched K2 Think, a 32-billion-parameter open-source AI system for advanced reasoning. It outperforms reasoning models 20 times larger, using techniques such as long chain-of-thought fine-tuning and reinforcement learning. K2 Think will be available on Cerebras' platform, where it achieves 2,000 tokens per second, and it ranks highly in math performance. Why it matters: This launch positions the UAE as a leader in AI innovation through public-private partnerships and open-source contributions, demonstrating that efficient AI design can rival larger models.

Web-Based Expert System for Civil Service Regulations: RCSES

arXiv · Jan 12

The paper introduces RCSES, a web-based expert system for civil service regulations in Saudi Arabia. The system covers 17 regulations and uses XML for knowledge representation and ASP.NET for rule-based inference. RCSES was validated by domain experts and technical users, and it compared favorably to other web-based expert systems.
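
As a rough illustration of the general pattern described above (rules stored as XML, then evaluated by a rule-based inference step), here is a minimal Python sketch. It is not the RCSES implementation, which the summary says uses ASP.NET. The XML element names, the rule contents, and the helper functions `load_rules` and `forward_chain` are all hypothetical and invented for this example; they do not reflect actual civil service regulations.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule base. The schema (<rule>, <if>, <then>) and the facts
# are assumptions made for illustration, not taken from the RCSES paper.
RULES_XML = """
<rules>
  <rule id="r1">
    <if fact="years_of_service_ge_10"/>
    <then fact="eligible_for_long_service_leave"/>
  </rule>
  <rule id="r2">
    <if fact="eligible_for_long_service_leave"/>
    <if fact="no_disciplinary_record"/>
    <then fact="leave_request_approved"/>
  </rule>
</rules>
"""


def load_rules(xml_text):
    """Parse the XML into a list of (set of conditions, conclusion) pairs."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.findall("rule"):
        conditions = {cond.get("fact") for cond in rule.findall("if")}
        conclusion = rule.find("then").get("fact")
        rules.append((conditions, conclusion))
    return rules


def forward_chain(rules, facts):
    """Fire rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


rules = load_rules(RULES_XML)
derived = forward_chain(rules, {"years_of_service_ge_10", "no_disciplinary_record"})
print(sorted(derived))
```

Keeping the rules in XML rather than in code means, in a design like this, that domain experts could review or update the knowledge base without touching the inference loop.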

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv · Dec 18

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length with multimodal richness and by using human-validated samples. LongShOTAgent, an agentic system for analyzing long videos, is also presented. Results for both the benchmark and the agent highlight the challenges that state-of-the-art MLLMs face.