Researchers from MBZUAI have developed MMRINet, a Mamba-based neural network for efficient brain tumor segmentation in MRI scans. The model uses Dual-Path Feature Refinement and Progressive Feature Aggregation to achieve high accuracy with only 2.5M parameters, making it suitable for low-resource clinical environments. MMRINet achieves a Dice score of 0.752 and HD95 of 12.23 on the BraTS-Lighthouse SSA 2025 benchmark.
MBZUAI researchers introduce TerraFM, a scalable self-supervised learning model for Earth observation that uses Sentinel-1 and Sentinel-2 imagery. The model unifies radar and optical inputs through modality-specific patch embeddings and adaptive cross-attention fusion. TerraFM achieves strong generalization on classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.
Researchers from MBZUAI introduced RP-SAM2, a method to improve surgical instrument segmentation by refining point prompts for more stable results. RP-SAM2 uses a novel shift block and compound loss function to reduce sensitivity to point prompt placement, improving segmentation accuracy in data-constrained settings. Experiments on the Cataract1k and CaDIS datasets show that RP-SAM2 enhances segmentation accuracy and reduces variance compared to SAM2, with code available on GitHub.
Researchers introduce SALT, a parameter-efficient fine-tuning method for medical image segmentation that combines singular value adaptation with low-rank transformation. SALT selectively adapts influential singular values and complements this with a low-rank update for the remaining subspace. Experiments on five medical datasets show SALT outperforms state-of-the-art PEFT methods by 2-5% in Dice score with only 3.9% trainable parameters.
Researchers at MBZUAI introduce FissionFusion, a hierarchical model merging approach to improve medical image analysis performance. The method uses local and global aggregation of models based on hyperparameter configurations, along with a cyclical learning rate scheduler for efficient model generation. Experiments show FissionFusion outperforms standard model souping by approximately 6% on HAM10000 and CheXpert datasets and improves OOD performance.
Researchers at MBZUAI have introduced MedMerge, a transfer learning technique that merges weights from independently initialized models to improve performance on medical imaging tasks. MedMerge learns kernel-level weights to combine features from different models into a single model. Experiments across various medical imaging tasks demonstrated performance gains of up to 7% in F1 score.
This paper introduces a new Single Domain Generalization (SDG) method called ConDiSR for medical image classification, using channel-wise contrastive disentanglement and reconstruction-based style regularization. The method is evaluated on multicenter histopathology image classification, achieving a 1% improvement in average accuracy compared to state-of-the-art SDG baselines. Code is available at https://github.com/BioMedIA-MBZUAI/ConDiSR.
Researchers from MBZUAI have developed XReal, a diffusion model for generating realistic chest X-ray images with precise control over anatomy and pathology location. The model utilizes an Anatomy Controller and a Pathology Controller to introduce spatial control in a pre-trained Text-to-Image Diffusion Model without fine-tuning. XReal outperforms existing X-ray diffusion models in realism, as evaluated by quantitative metrics and radiologists' ratings, and the code/weights are available.
Video-ChatGPT is a new multimodal model that combines a video-adapted visual encoder with a large language model (LLM) to enable detailed video understanding and conversation. The authors introduce a new dataset of 100,000 video-instruction pairs for training the model. They also develop a quantitative evaluation framework for video-based dialogue models.
Researchers from MBZUAI have developed EchoCoTr, a novel spatiotemporal deep learning method for estimating left ventricular ejection fraction (LVEF) from echocardiograms. EchoCoTr combines CNNs and vision transformers to overcome the limitations of each when applied to medical video data. The method achieves state-of-the-art results on the EchoNet-Dynamic dataset, demonstrating improved accuracy compared to existing approaches, with code available on GitHub.
This paper introduces a self-supervised contrastive learning method for segmenting the left ventricle in echocardiography images when limited labeled data is available. The approach uses contrastive pretraining to improve the performance of UNet and DeepLabV3 segmentation networks. Experiments on the EchoNet-Dynamic dataset show the method achieves a Dice score of 0.9252, outperforming existing approaches, with code available on Github.
A new brain tumor segmentation method based on convolutional neural networks is proposed for the BraTS-GoAT challenge. The method employs the MedNeXt architecture and model ensembling to segment tumors in brain MRI scans from diverse populations. Experiments on the unseen validation set demonstrate promising results with an average DSC of 85.54%.