GCC AI Research

Multimodal machine intelligence and its human-centered possibilities

MBZUAI · Notable

Summary

MBZUAI hosted a panel discussion in collaboration with the Manara Center for Coexistence and Dialogue. The discussion centered on the potential of multimodal machine intelligence for human-centered applications, particularly in health and wellbeing. USC Professor Shrikanth Narayanan spoke on creating trustworthy and inclusive AI that accounts for protected variables. Why it matters: The event signals MBZUAI's interest in ethical AI development and its applications for societal good, and could drive research and policy initiatives in the region.

Related

Foundations of Multisensory Artificial Intelligence

MBZUAI

Paul Liang from CMU presented on machine learning foundations for multisensory AI, discussing a theoretical framework for modality interactions. The talk covered cross-modal attention and multimodal transformer architectures, as well as applications in mental health, pathology, and robotics. Liang's research aims to enable AI systems to integrate and learn from diverse real-world sensory modalities. Why it matters: The talk highlights the growing importance of multimodal AI research and its potential for advancements across sectors in the region, including healthcare and robotics.

Multimodality for story-level understanding and generation of visual data

MBZUAI

Vicky Kalogeiton from École Polytechnique discussed the importance of multimodality for story-level recognition and generation using video, audio, text, masks, and clinical data. She presented work on multimodal video understanding using FunnyNet-W and the Short Film Dataset. She also showed examples of visual generation from text and other modalities (ET, CAD, DynamicGuidance). Why it matters: Multimodal AI research is growing globally, and this talk highlights the potential of combining different data types for better understanding and generation, with possible implications for applications relevant to the Middle East.

Humanizing Technology with Assistive Augmentations

MBZUAI

This article discusses a talk on "Assistive Augmentation," the design of human-computer interfaces that augment human abilities. Examples include 'AiSee' for blind users, 'Prospero' for memory training, and 'MuSS-Bits', which lets deaf users feel music. Suranga Nanayakkara from the National University of Singapore will present the talk, highlighting insights from psychology, human-centered machine learning, and design thinking. Why it matters: Such assistive technologies can significantly improve the quality of life for individuals with disabilities and extend human capabilities.

Making human-machine conversation more lifelike than ever at GITEX

MBZUAI

MBZUAI researchers demonstrated a low-latency, multilingual multimodal AI system at GITEX that integrates speech, text, and visual capabilities for more lifelike human-machine conversation. The demo, led by Dr. Hisham Cholakkal, includes a mobile app where users can point their camera at an object and ask questions, receiving spoken answers in multiple languages. The team is also integrating the model into a robot dog that can respond to voice commands. Why it matters: This work addresses key challenges in deploying LLMs in real-world applications in the Middle East, such as multilingual support and real-time responsiveness.