MBZUAI Professor Monojit Choudhury co-authored a study on LLMs and their capacity for moral reasoning, with the study being presented at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL) in Malta. The study included contributions from Aditi Khandelwal, Utkarsh Agarwal, and Kumar Tanmay from Microsoft. The research explores AI alignment, ensuring AI systems align with human values, moral principles, and ethical considerations. Why it matters: The study provides insight into LLMs' capabilities regarding complex ethical issues, which is important for guiding the development of AI in a way that is consistent with human values.
MBZUAI researchers introduce SocialMaze, a new benchmark for evaluating social reasoning capabilities in large language models (LLMs). SocialMaze includes six diverse tasks across social reasoning games, daily-life interactions, and digital community platforms, emphasizing deep reasoning, dynamic interaction, and information uncertainty. Experiments show that LLMs vary in handling dynamic interactions, degrade under uncertainty, but can be improved via fine-tuning on curated reasoning examples.
This paper introduces rational counterfactuals, a method for identifying counterfactuals that maximize the attainment of a desired consequent. The approach aims to identify the antecedent that leads to a specific outcome for rational decision-making. The theory is applied to identify variable values that contribute to peace, such as Allies, Contingency, Distance, Major Power, Capability, Democracy, and Economic Interdependency. Why it matters: The research provides a framework for analyzing and promoting conditions conducive to peace using counterfactual reasoning.
A new survey paper provides a deep dive into post-training methodologies for Large Language Models (LLMs), analyzing their role in refining LLMs beyond pretraining. It addresses key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs, and highlights emerging directions in model alignment, scalable adaptation, and inference-time reasoning. The paper also provides a public repository to continually track developments in this fast-evolving field.
MBZUAI's Executive Program held a module on AI ethics, safety, and societal impacts, led by Professors Tom Mitchell and Justine Cassell. The session covered machine learning bias, privacy, AI's impact on jobs and education, and the ethical use of AI. Forty-two participants from ministerial leadership and top industry executives are part of the first cohort. Why it matters: This highlights MBZUAI and the UAE's commitment to ethical AI development as part of building a knowledge-based economy.
Researchers from MBZUAI and Monash University presented a study at EMNLP 2024 examining LLMs' ability to interpret empathy, emotion, and morality in written stories. The study builds on a framework for modeling empathic similarity between narratives, using the EmpathicStories dataset. They are exploring ways to improve LLMs' capabilities with complex concepts like empathy, especially for applications in fields like healthcare. Why it matters: Enhancing LLMs with empathic understanding could lead to more effective and human-centered AI applications, particularly in sensitive domains requiring nuanced communication.
Niket Tandon from the Allen Institute for AI presented a talk at MBZUAI on enabling large language models to focus on human needs and continuously learn from interactions. He proposed a memory architecture inspired by the theory of recursive reminding to guide models in avoiding past errors. The talk addressed who to ask, what to ask, when to ask and how to apply the obtained guidance. Why it matters: The research explores how to align LLMs with human feedback, a key challenge for practical and ethical AI deployment.
Liangming Pan from UCSB presented research on building reliable generative AI agents by integrating symbolic representations with LLMs. The neuro-symbolic strategy combines the flexibility of language models with precise knowledge representation and verifiable reasoning. The work covers Logic-LM, ProgramFC, and learning from automated feedback, aiming to address LLM limitations in complex reasoning tasks. Why it matters: Improving the reliability of LLMs is crucial for high-stakes applications in finance, medicine, and law within the region and globally.