This article discusses domain shift in machine learning, where the test data distribution differs from the training distribution, and methods to mitigate it through domain adaptation and domain generalization. Domain adaptation uses labeled source data and unlabeled target data. Domain generalization uses labeled data from one or more source domains to generalize to unseen target domains. Why it matters: Research on mitigating domain shift improves the robustness and applicability of AI models in diverse real-world scenarios.
Researchers at MBZUAI have introduced MedMerge, a transfer learning technique that merges weights from independently initialized models to improve performance on medical imaging tasks. MedMerge learns kernel-level weights to combine features from different models into a single model. Experiments across various medical imaging tasks demonstrated performance gains of up to 7% in F1 score.
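The idea of kernel-level merging can be illustrated with a minimal sketch. The summary does not give MedMerge's actual formulation, so the convex-combination form, the function name `merge_kernels`, and the fixed `alphas` below are all assumptions made for illustration; in a real system the per-kernel weights would be learned rather than set by hand.

```python
# Hypothetical sketch: blend two models' kernels with one weight per kernel.
# The convex-combination form and every name here are assumptions for
# illustration, not the published MedMerge procedure.

def merge_kernels(kernels_a, kernels_b, alphas):
    """Blend corresponding kernels; alphas[k] is the weight on model A's k-th kernel."""
    if not (len(kernels_a) == len(kernels_b) == len(alphas)):
        raise ValueError("need exactly one alpha per kernel pair")
    merged = []
    for ka, kb, alpha in zip(kernels_a, kernels_b, alphas):
        # Element-wise convex combination of the two flattened kernels.
        merged.append([alpha * wa + (1.0 - alpha) * wb for wa, wb in zip(ka, kb)])
    return merged

# Toy layer with two flattened 2x2 kernels per model (made-up values).
model_a = [[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
model_b = [[3.0, 3.0, 3.0, 3.0], [4.0, 0.0, 4.0, 0.0]]
alphas = [0.5, 1.0]  # assumed fixed here; these would be learned in practice

merged = merge_kernels(model_a, model_b, alphas)
print(merged)
```

With one weight per kernel, the merged model can take some filters mostly from one source model and others mostly from the second, which a single global mixing coefficient cannot express.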
The study compares deep learning models trained via transfer learning from ImageNet (TII-models) against those trained solely on medical images (LMI-models) for disease segmentation. Results show that combining outputs from both model types can improve segmentation performance by up to 10% in certain scenarios. A repository of models, code, and over 10,000 medical images is available on GitHub to facilitate further research.
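One simple way to combine the outputs of two segmentation models is to average their per-pixel probabilities before thresholding. The sketch below assumes that scheme; the summary does not say how the study combines outputs. The names `ensemble_mask`, `probs_tii`, and `probs_lmi` are hypothetical, and the numbers are toy values chosen for illustration, not results from the study.

```python
# Hypothetical sketch: average two models' per-pixel probabilities, then
# threshold. The averaging rule is an assumption; the toy values below are
# made up and are not data from the study.

def dice(pred, target):
    """Dice coefficient between two binary masks given as flat lists."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * inter / total

def to_mask(probs, threshold=0.5):
    """Binarize per-pixel probabilities."""
    return [1 if p >= threshold else 0 for p in probs]

def ensemble_mask(probs_a, probs_b, threshold=0.5):
    """Average the two probability maps pixel by pixel, then binarize."""
    return to_mask([(pa + pb) / 2.0 for pa, pb in zip(probs_a, probs_b)], threshold)

target = [1, 1, 0, 0, 1, 0]                  # toy ground-truth mask
probs_tii = [0.9, 0.3, 0.2, 0.6, 0.8, 0.1]   # stand-in for a TII-model output
probs_lmi = [0.8, 0.8, 0.1, 0.2, 0.4, 0.3]   # stand-in for an LMI-model output

dice_tii = dice(to_mask(probs_tii), target)
dice_lmi = dice(to_mask(probs_lmi), target)
dice_ens = dice(ensemble_mask(probs_tii, probs_lmi), target)
print(dice_tii, dice_lmi, dice_ens)
```

In this toy case the two models err on different pixels, so averaging corrects both; when the models make the same mistakes, averaging gives no such benefit.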
This paper introduces a domain generalization (DG) method for Diabetic Retinopathy (DR) classification that maximizes mutual information using a large pretrained model. The method aims to address the challenge of domain shift in medical imaging caused by variations in data acquisition. Experiments on public datasets demonstrate that the proposed method outperforms state-of-the-art techniques, achieving a 5.25% improvement in average accuracy.
This paper introduces a method for quantifying the transferability of architectural components in Single Image Super-Resolution (SISR) models, termed "Universality," and proposes a Universality Assessment Equation (UAE). Guided by the UAE, the authors design optimized modules, Cycle Residual Block (CRB) and Depth-Wise Cycle Residual Block (DCRB), and demonstrate their effectiveness across various datasets and low-level tasks. Results show that networks using these modules outperform state-of-the-art methods, achieving improved PSNR or parameter reduction.
MBZUAI researchers presented a method for cross-cultural transfer learning to improve language models' understanding of diverse Arab cultures. They used in-context learning and demonstration-based reinforcement (DITTO) to transfer cultural knowledge between countries. Experiments showed up to 34% improvement in performance on cultural understanding benchmarks using only a few demonstrations. Why it matters: This research addresses the gap in cultural understanding of Arabic language models, especially for smaller Arab countries, and provides a novel transfer learning approach.
This article discusses retrieval augmentation in text generation, where information retrieved from an external source is used to condition predictions. It references recent work on retrieval-augmented image captioning, showing that model size can be greatly reduced when training data is available through retrieval. The author intends to continue this work with a focus on two areas: the intersection of retrieval augmentation and in-context learning, and controllable image captioning for language learning materials. Why it matters: This research direction has the potential to improve transfer learning in vision-language models, which could be especially relevant for downstream applications in Arabic NLP and multimodal tasks.
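The retrieve-then-condition pattern described here can be sketched in a few lines. This is a simplified illustration under stated assumptions: a text query stands in for the image representation that a captioning system would use, Jaccard token overlap stands in for a learned similarity, and the `datastore`, `retrieve`, and `build_prompt` names are hypothetical.

```python
# Hypothetical sketch of retrieval augmentation: fetch the most similar
# captions from a datastore and prepend them to the generator's input.
# Jaccard token overlap is an assumed stand-in for a learned retriever.

def tokenize(text):
    return set(text.lower().split())

def retrieve(query, datastore, k=2):
    """Return the k datastore captions with the highest Jaccard overlap with the query."""
    q = tokenize(query)

    def score(caption):
        c = tokenize(caption)
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    return sorted(datastore, key=score, reverse=True)[:k]

def build_prompt(query, retrieved):
    """Condition the generator by placing retrieved captions before the query."""
    lines = ["Similar captions:"] + ["- " + c for c in retrieved]
    lines.append("Describe: " + query)
    return "\n".join(lines)

datastore = [
    "a dog runs on the beach",
    "a cat sleeps on a sofa",
    "two dogs play on the beach",
    "a plane flies over mountains",
]
query = "dog on the beach"
top = retrieve(query, datastore, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

Because the relevant examples arrive through the prompt at inference time, the generator does not need to memorize them in its parameters, which is the intuition behind the reduced model size mentioned above.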
KAUST researchers in the Image and Video Understanding Lab are applying machine learning to computer vision for automated navigation, including self-driving cars and UAVs. They tested their algorithms on KAUST roads, aiming to replicate the brain's efficiency in tasks like activity and object recognition. The team is also exploring the possibility of creative algorithms that can transfer skills without direct training. Why it matters: This research, conducted in the GCC region, contributes to the advancement of autonomous systems and explores fundamental questions about replicating human intelligence in machines.