Mingyu Ding from UC Berkeley presented research on endowing robots with human-like commonsense and physical reasoning capabilities. The talk covered multimodal commonsense reasoning integrating vision, world models, and language-based task planners. It also discussed physical reasoning approaches for robots to infer dynamics and physical properties of objects. Why it matters: Enhancing robots with these capabilities can improve their ability to generalize across everyday tasks, leading to greater social benefits and impact.
MBZUAI researchers are working to improve computer vision models by incorporating common sense knowledge. They aim to address issues like the generation of unrealistic human features, such as hands with incorrect numbers of fingers. By integrating common-sense knowledge, like the fact that humans typically have five fingers per hand, they seek to make deep learning models more reliable. Why it matters: This research could improve the accuracy and trustworthiness of AI-generated content, making it more suitable for real-world applications.
A new dataset called ArabCulture is introduced to address the lack of culturally relevant commonsense reasoning resources in Arabic AI. The dataset covers 13 countries across the Gulf, Levant, North Africa, and the Nile Valley, spanning 12 daily life domains with 54 fine-grained subtopics. It was built from scratch by native speakers writing and validating culturally relevant questions. Why it matters: The dataset highlights the need for more culturally aware models and benchmarks tailored to the Arabic-speaking world, moving beyond machine-translated resources.
MBZUAI researchers have created ArabCulture, a new benchmark dataset to measure cultural commonsense reasoning capabilities in Arabic language models. The dataset was built by native Arabic speakers from 13 countries and is the largest of its kind. Testing 31 language models, the researchers found that many systems struggle with understanding cultural concepts across the Arab world. Why it matters: The new benchmark addresses a gap in AI, enabling development of culturally-aware AI systems tailored to the nuances of the Arabic-speaking world.
MBZUAI researchers created a new benchmark dataset called TextGames to evaluate the reasoning abilities of LLMs. The dataset uses simple, text-based games requiring skills like pattern recognition and logical thinking. LLMs struggled with the hardest questions, suggesting limitations in their reasoning capabilities despite advancements in language understanding. Why it matters: This research highlights the need for specialized reasoning models and benchmarks that go beyond memorization to truly test AI's problem-solving abilities.
Liangming Pan from UCSB presented research on building reliable generative AI agents by integrating symbolic representations with LLMs. The neuro-symbolic strategy combines the flexibility of language models with precise knowledge representation and verifiable reasoning. The work covers Logic-LM, ProgramFC, and learning from automated feedback, aiming to address LLM limitations in complex reasoning tasks. Why it matters: Improving the reliability of LLMs is crucial for high-stakes applications in finance, medicine, and law within the region and globally.
MBZUAI researchers introduce SocialMaze, a new benchmark for evaluating social reasoning capabilities in large language models (LLMs). SocialMaze includes six diverse tasks across social reasoning games, daily-life interactions, and digital community platforms, emphasizing deep reasoning, dynamic interaction, and information uncertainty. Experiments show that LLMs vary in handling dynamic interactions, degrade under uncertainty, but can be improved via fine-tuning on curated reasoning examples.
Mausam, head of Yardi School of AI at IIT Delhi and affiliate professor at University of Washington, will discuss Neuro-Symbolic AI. The talk will cover recent research threads with applications in NLP, probabilistic decision-making, and constraint satisfaction. Mausam's research explores neuro-symbolic machine learning, computer vision for radiology, NLP for robotics, multilingual NLP, and intelligent information systems. Why it matters: Neuro-Symbolic AI is gaining importance as it combines the strengths of neural and symbolic approaches, potentially leading to more robust and explainable AI systems.