GCC AI Research

Results for "generative music"

Identifying bias in generative music models: A new study presented at NAACL

MBZUAI ·

MBZUAI researchers found that only 5.7% of the music in existing datasets used to train generative music systems comes from non-Western genres. They found that 94% of the music was Western, while music from Africa, the Middle East, and South Asia accounted for only 0.3%, 0.4%, and 0.9%, respectively. The team also tested whether parameter-efficient fine-tuning with adapters could improve generative music systems on underrepresented styles, and presented the findings at NAACL. Why it matters: This research highlights the critical need for more diverse datasets in AI music generation to better serve global musical traditions and audiences.

Learn to control

MBZUAI ·

Patrick van der Smagt, Director of AI Research at Volkswagen Group, discussed the use of generative machine learning models for predicting and controlling complex stochastic systems in robotics. The talk highlighted examples in robotics and beyond and addressed the challenges of achieving quality and trust in AI systems. He also mentioned his involvement in a European industry initiative on trust in AI and his membership in the AI Council of the State of Bavaria. Why it matters: Control in robotics and trust in AI are both key issues for the further development of autonomous systems, especially in industrial applications within the GCC region.

Peering into humanity through music

MBZUAI ·

MBZUAI Visiting Assistant Professor Gus Xia studies music to understand how AI can act more human-like in high-context activities. Xia analyzes and creates music with computers to explore the differences between human and machine perception. He aims to leverage music's abstract nature to study creative intelligence in AI. Why it matters: This research could lead to AI systems that interact more naturally with humans, particularly in creative fields.

Expanding artistic frontiers in artificial intelligence

KAUST ·

KAUST computer scientist Mohamed Elhoseiny and his VISION CAIR team developed Creative Walk Adversarial Networks (CWAN) for novel art generation. CWAN learns from existing art styles and deviates from them using 'random walk deviation' methods. Human evaluators preferred CWAN-generated art over art from other methods such as StyleGAN2. Why it matters: The research demonstrates AI's potential as a valuable tool for artists, enabling the creation of unique and meaningful art, and explores more effective emotional language in image captioning.

OmniGen: Unified Multimodal Sensor Generation for Autonomous Driving

arXiv ·

The paper introduces OmniGen, a unified framework for generating aligned multimodal sensor data for autonomous driving using a shared Bird's Eye View (BEV) space. It uses a novel generalizable multimodal reconstruction method (UAE) to jointly decode LiDAR and multi-view camera data through volume rendering. The framework incorporates a Diffusion Transformer (DiT) with a ControlNet branch to enable controllable multimodal sensor generation, demonstrating good performance and multimodal consistency.

Cross-modal understanding and generation of multimodal content

MBZUAI ·

Nicu Sebe from the University of Trento presented recent work on video generation, focusing on animating objects in a source image using external information like labels, driving videos, or text. He introduced a Learnable Game Engine (LGE) trained from monocular annotated videos, which maintains states of scenes, objects, and agents to render controllable viewpoints. Why it matters: This talk highlights advancements in cross-modal AI, potentially enabling new applications in gaming, simulation, and content creation within the region.

Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

arXiv ·

This paper introduces Diffusion-BBO, a new online black-box optimization (BBO) framework that uses a conditional diffusion model as an inverse surrogate model. The framework employs an Uncertainty-aware Exploration (UaE) acquisition function to propose scores in the objective space for conditional sampling. The approach is shown theoretically to achieve a near-optimal solution and empirically outperforms existing online BBO baselines across 6 scientific discovery tasks.