Researchers at MBZUAI introduce FissionFusion, a hierarchical model merging approach that improves medical image analysis performance. The method uses local and global aggregation of models trained under different hyperparameter configurations, along with a cyclical learning rate scheduler for efficient model generation. Experiments show FissionFusion outperforms standard model souping by approximately 6% on the HAM10000 and CheXpert datasets and improves out-of-distribution (OOD) performance.
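FissionFusion builds on model souping, i.e., averaging the weights of models fine-tuned under different hyperparameters. A minimal sketch of the standard uniform-soup baseline (not the paper's hierarchical variant), with checkpoints represented as illustrative parameter dicts:

```python
import numpy as np

def uniform_soup(checkpoints):
    """Element-wise average of parameter dicts (standard uniform model soup)."""
    keys = checkpoints[0].keys()
    return {k: np.mean([ckpt[k] for ckpt in checkpoints], axis=0) for k in keys}

# Two toy "fine-tuned" checkpoints sharing the same architecture (names are hypothetical)
ckpt_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
ckpt_b = {"w": np.array([3.0, 4.0]), "b": np.array([2.0])}

soup = uniform_soup([ckpt_a, ckpt_b])
# soup["w"] → [2.0, 3.0], soup["b"] → [1.0]
```

Hierarchical merging as described in the paper would apply such aggregation first locally (within hyperparameter groups) and then globally across groups.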
The paper introduces VENOM, a text-driven framework for generating high-quality unrestricted adversarial examples using diffusion models. VENOM unifies image content generation and adversarial synthesis into a single reverse diffusion process, enhancing both attack success rate and image quality. The framework incorporates an adaptive adversarial guidance strategy with momentum to ensure the generated adversarial examples align with the distribution of natural images.
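At a high level, adversarial guidance of this kind adds a gradient term to each reverse-diffusion step, smoothed by a momentum accumulator so the samples stay close to the natural-image distribution. A minimal numpy sketch under assumed interfaces (`denoise_step` and `adv_grad` are toy stand-ins, not VENOM's actual components):

```python
import numpy as np

def denoise_step(x, t):
    """Stand-in for one reverse-diffusion denoising step (toy dynamics)."""
    return 0.9 * x

def adv_grad(x):
    """Stand-in for the gradient of an attack loss w.r.t. the image."""
    return np.sign(x)

def guided_reverse_diffusion(x, steps=10, scale=0.05, beta=0.9):
    """Reverse diffusion with momentum-smoothed adversarial guidance."""
    m = np.zeros_like(x)
    for t in reversed(range(steps)):
        x = denoise_step(x, t)
        m = beta * m + (1 - beta) * adv_grad(x)  # momentum over adversarial gradients
        x = x + scale * m                        # nudge sample toward the attack objective
    return x

x_adv = guided_reverse_diffusion(np.ones(4))
```

The momentum term damps step-to-step oscillation in the adversarial signal, which is one way a guidance schedule can trade off attack strength against image quality.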
KAUST held its second hackathon and third NVIDIA workshop, where attendees heard lectures from international experts and worked on porting their scientific applications to GPU accelerators. Why it matters: Such events help build regional expertise in accelerated computing and attract international collaboration.
Researchers at MBZUAI have demonstrated a method called "Data Laundering" to artificially boost language model benchmark scores using knowledge distillation. The technique covertly transfers benchmark-specific knowledge, leading to inflated accuracy without genuine improvements in reasoning. The study highlights a vulnerability in current AI evaluation practices and calls for more robust benchmarks.
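The transfer mechanism here is standard knowledge distillation: a student is trained to match a teacher's temperature-softened output distribution, which can smuggle in benchmark-specific answers. A minimal sketch of the distillation loss (logit values are illustrative):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)

teacher = np.array([2.0, 0.5, -1.0])  # teacher logits (may encode benchmark answers)
student = np.array([1.5, 0.7, -0.5])  # student logits
loss = distillation_loss(student, teacher)
```

Minimizing this loss pushes the student toward the teacher's answer distribution even when the student never sees the benchmark questions' ground-truth labels, which is the vulnerability the study highlights.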
MBZUAI researchers found that ImageNet performance isn't always indicative of real-world task performance for computer vision models. The study analyzed four popular model configurations, revealing variations in behavior on specific image types despite similar overall ImageNet accuracy. It indicates that certain model configurations are better suited for particular tasks, even with lower ImageNet scores. Why it matters: This challenges the reliance on ImageNet as a sole benchmark and highlights the need for task-specific evaluations in computer vision.
The article discusses the rise of large language models like ChatGPT and Gemini. It highlights their role in driving the first wave of AI development. Why it matters: While lacking specifics, the article suggests ongoing interest in the impact and future of LLMs, a key area of AI research and development.
Yanwei Fu from Fudan University will present research on multimodal models, robotic grasping, and fMRI neural decoding. Topics include few-shot learning, object-centric self-supervised learning, image manipulation, and visual-language alignment. The research also covers Transformer compression and the application of large models combined with multi-view stereo (MVS) 3D modeling to robotic arm grasping. Why it matters: While the talk is not directly about Middle East AI, the topics covered are core to advancing AI research and applications in the region.