GCC AI Research


Results for "merger"

MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks

arXiv

Researchers at MBZUAI have introduced MedMerge, a transfer learning technique that merges weights from independently initialized models to improve performance on medical imaging tasks. MedMerge learns kernel-level weights to combine features from different models into a single model. Experiments across various medical imaging tasks demonstrated performance gains of up to 7% in F1 score.
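As a rough illustration of what combining models at the kernel level could look like, the sketch below blends two convolution weight tensors with one coefficient per kernel. It is an assumption-laden toy, not the MedMerge implementation: the function name `merge_conv_kernels`, the convex-combination form, and the fixed (rather than learned) `alpha` values are all hypothetical.

```python
import numpy as np

def merge_conv_kernels(w_a, w_b, alpha):
    """Blend two conv weight tensors of shape (out_ch, in_ch, kh, kw).

    alpha has shape (out_ch, in_ch): one coefficient in [0, 1] per kernel.
    In a learned setting alpha would be trained; here it is fixed (assumption).
    """
    alpha = np.asarray(alpha, dtype=w_a.dtype)[:, :, None, None]
    return alpha * w_a + (1.0 - alpha) * w_b

rng = np.random.default_rng(0)
w_a = rng.normal(size=(4, 3, 3, 3))  # stand-in for weights of model A
w_b = rng.normal(size=(4, 3, 3, 3))  # stand-in for weights of model B
alpha = np.full((4, 3), 0.5)         # start from an even blend
alpha[0, 0] = 1.0                    # take kernel (0, 0) entirely from model A
merged = merge_conv_kernels(w_a, w_b, alpha)
```

Each kernel in `merged` is a weighted mix of the corresponding kernels in the two source tensors, so the result is a single set of weights with the same shape as either input.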

FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis

arXiv

Researchers at MBZUAI introduce FissionFusion, a hierarchical model merging approach to improve medical image analysis performance. The method aggregates models locally and then globally based on their hyperparameter configurations, and uses a cyclical learning rate scheduler to generate models efficiently. Experiments show FissionFusion outperforms standard model souping by approximately 6% on the HAM10000 and CheXpert datasets and improves out-of-distribution (OOD) performance.
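The local-then-global aggregation can be pictured with a minimal sketch. It assumes uniform weight averaging at both levels and that models are grouped by a shared hyperparameter setting; the helper names `average_weights` and `hierarchical_soup` and the toy weights are hypothetical, and the paper's actual aggregation rule may differ.

```python
def average_weights(models):
    """Uniformly average models given as {param_name: [floats]} dicts."""
    n = len(models)
    return {
        name: [sum(vals) / n for vals in zip(*(m[name] for m in models))]
        for name in models[0]
    }

def hierarchical_soup(groups):
    """Average within each group (local), then across the group soups (global)."""
    local_soups = [average_weights(group) for group in groups]
    return average_weights(local_soups)

groups = [
    [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}],  # runs sharing one hyperparameter setting
    [{"w": [5.0, 6.0]}],                     # a second setting with a single run
]
soup = hierarchical_soup(groups)
flat = average_weights([m for group in groups for m in group])
```

Here `soup` is `{"w": [3.5, 4.5]}` while the flat average is `{"w": [3.0, 4.0]}`: the hierarchical version gives each group equal say regardless of how many runs it contains.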

DynaMMo: Dynamic Model Merging for Efficient Class Incremental Learning for Medical Images

arXiv

Researchers at MBZUAI have developed DynaMMo, a dynamic model merging method for efficient class incremental learning on medical images. DynaMMo merges multiple networks at different training stages using lightweight learnable modules, reducing computational overhead. Evaluated on three datasets, DynaMMo achieved a 10-fold reduction in GFLOPS compared to existing dynamic methods, at the cost of an average accuracy drop of 2.76.

Interpretable and synergistic deep learning for visual explanation and statistical estimations of segmentation of disease features from medical images

arXiv

The study compares deep learning models trained via transfer learning from ImageNet (TII-models) against those trained solely on medical images (LMI-models) for disease segmentation. Results show that combining outputs from both model types can improve segmentation performance by up to 10% in certain scenarios. A repository of models, code, and over 10,000 medical images is available on GitHub to facilitate further research.
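The summary does not say how the two model types' outputs are combined. One common option, shown here purely as an assumption, is to average the per-pixel foreground probabilities and threshold the result; `combine_masks` and the sample probability maps are hypothetical.

```python
def combine_masks(prob_a, prob_b, threshold=0.5):
    """Average two per-pixel probability maps and threshold into a binary mask."""
    return [
        [1 if (a + b) / 2 >= threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prob_a, prob_b)
    ]

# Toy 2x2 foreground probabilities from each model type (made-up values).
prob_tii = [[0.9, 0.2], [0.4, 0.7]]
prob_lmi = [[0.7, 0.1], [0.8, 0.2]]
mask = combine_masks(prob_tii, prob_lmi)
```

With these values the bottom-left pixel is kept because one model is confident enough to pull the mean above the threshold, while the bottom-right pixel is dropped.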

DaringFed: A Dynamic Bayesian Persuasion Pricing for Online Federated Learning under Two-sided Incomplete Information

arXiv

This paper introduces DaringFed, a novel dynamic Bayesian persuasion pricing mechanism for online federated learning (OFL) that addresses the challenge of two-sided incomplete information (TII) regarding resources. It formulates the interaction between the server and clients as a dynamic signaling and pricing allocation problem within a Bayesian persuasion game, demonstrating the existence of a unique Bayesian persuasion Nash equilibrium. Evaluations on real and synthetic datasets demonstrate that DaringFed optimizes accuracy and convergence speed and improves the server's utility.

Distillation Policy Optimization

arXiv

The paper introduces a novel actor-critic framework called Distillation Policy Optimization that combines on-policy and off-policy data for reinforcement learning. It incorporates variance reduction mechanisms such as a unified advantage estimator (UAE) and a residual baseline. The empirical results demonstrate improved sample efficiency for on-policy algorithms, narrowing the gap with off-policy methods.

SlimPajama-DC: Understanding Data Combinations for LLM Training

arXiv

Researchers at MBZUAI release SlimPajama-DC, an empirical analysis of data combinations for pretraining LLMs using the SlimPajama dataset. The study examines the impact of global vs. local deduplication and the proportions of highly-deduplicated multi-source datasets. Results show that increased data diversity after global deduplication is crucial, with the best configuration outperforming models trained on RedPajama.
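The difference between local and global deduplication can be shown on a toy corpus. The sketch assumes exact-match comparison of documents for simplicity, whereas large-scale pipelines typically use fuzzy matching; the function names and sample data are hypothetical.

```python
def dedup_local(sources):
    """Remove duplicates within each source independently."""
    return {name: list(dict.fromkeys(docs)) for name, docs in sources.items()}

def dedup_global(sources):
    """Remove duplicates across all sources, keeping the first occurrence."""
    seen = set()
    result = {}
    for name, docs in sources.items():
        kept = []
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                kept.append(doc)
        result[name] = kept
    return result

sources = {
    "web": ["a", "b", "a"],
    "code": ["b", "c"],
}
local = dedup_local(sources)
global_ = dedup_global(sources)
```

Local deduplication leaves document `"b"` in both sources, while global deduplication keeps it only in the first source that contains it, so the combined corpus has no cross-source overlap.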

Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance

arXiv

The paper introduces a framework for camel farm monitoring that combines automated annotation with fine-tune distillation. The Unified Auto-Annotation framework uses GroundingDINO and SAM to automatically annotate surveillance video data. The Fine-Tune Distillation framework then fine-tunes student models such as YOLOv8 by transferring knowledge from a larger teacher model, using data from Al-Marmoom Camel Farm in Dubai.