GCC AI Research

Results for "Image Representation"

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

arXiv ·

The paper introduces the Prism Hypothesis, which posits a correspondence between an encoder's feature spectrum and its functional role, with semantic encoders capturing low-frequency components and pixel encoders retaining high-frequency information. Based on this, the authors propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details using a frequency-band modulator. Experiments on ImageNet and MS-COCO demonstrate that UAE effectively unifies semantic abstraction and pixel-level fidelity, achieving state-of-the-art performance.
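To make the low-frequency versus high-frequency distinction concrete, here is a minimal sketch of splitting a grayscale image into two frequency bands with a Fourier mask. It is only an illustration of the idea of frequency bands, not the paper's frequency-band modulator; the function name `split_frequency_bands` and the `cutoff` parameter are hypothetical.

```python
import numpy as np

def split_frequency_bands(image, cutoff=0.25):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    `cutoff` is an assumed radius in cycles per pixel (0.5 is Nyquist);
    it is an illustrative parameter, not a value taken from the paper.
    """
    h, w = image.shape
    # Frequency coordinates of every FFT bin, in cycles per pixel.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx**2 + fy**2)

    # Bins inside the radius form the low band; the rest form the high band.
    low_mask = radius <= cutoff
    spectrum = np.fft.fft2(image)

    # The mask is symmetric in frequency, so both inverse transforms are
    # real up to floating-point error; .real drops the residual.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high
```

Because the two masks partition the spectrum, `low + high` reconstructs the input, which is a convenient sanity check when experimenting with different cutoffs.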

Unifying Vision Representation

MBZUAI ·

This seminar explores vision systems through self-supervised representation learning, covering the challenges facing mainstream self-supervised vision methods and possible solutions. It discusses developing versatile representations across modalities, tasks, and architectures to advance vision foundation models. Tong Zhang of EPFL, whose background includes Beihang University, New York University, and Australian National University, will lead the talk. Why it matters: Advancing vision foundation models is crucial for expanding AI applications, especially in the Middle East, where computer vision can address challenges in areas like urban planning, agriculture, and environmental monitoring.

UAE: Universal Anatomical Embedding on Multi-modality Medical Images

arXiv ·

Researchers propose a universal anatomical embedding (UAE) framework for medical image analysis to learn appearance, semantic, and cross-modality anatomical embeddings. UAE incorporates semantic embedding learning with prototypical contrastive loss, a fixed-point-based matching strategy, and an iterative approach for cross-modality embedding learning. The framework was evaluated on landmark detection, lesion tracking and CT-MRI registration tasks, outperforming existing state-of-the-art methods.

CTRL: Closed-Loop Data Transcription via Rate Reduction

MBZUAI ·

A talk introduces a computational framework for learning a compact, structured representation of real-world datasets that is both discriminative and generative. It proposes learning a closed-loop transcription between the distribution of a high-dimensional multi-class dataset and an arrangement of multiple independent subspaces, known as a linear discriminative representation (LDR). The optimality of the closed-loop transcription can be characterized in closed form by an information-theoretic measure known as rate reduction. Why it matters: The framework unifies the concepts and benefits of auto-encoding and GANs and generalizes them to learning a representation of multi-class visual data that is both discriminative and generative.
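As a rough illustration of the rate reduction measure mentioned above, the sketch below computes it for a labeled feature matrix, assuming the commonly used coding-rate form R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T). The function names, the `eps` default, and the column-per-sample layout are assumptions made for this example; the talk's closed-loop objective builds on this quantity but is not reproduced here.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Coding rate of features Z (shape d x n, one sample per column)."""
    d, n = Z.shape
    gram = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    # slogdet is numerically safer than log(det(...)).
    _, logdet = np.linalg.slogdet(gram)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate of the whole set minus the size-weighted rates of each class."""
    _, n = Z.shape
    labels = np.asarray(labels)
    total = coding_rate(Z, eps)
    per_class = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        per_class += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return total - per_class
```

Under this form, features whose classes occupy different subspaces give a larger value than features whose classes overlap, and a dataset with a single class gives zero.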

ConDiSR: Contrastive Disentanglement and Style Regularization for Single Domain Generalization

arXiv ·

This paper introduces a new Single Domain Generalization (SDG) method called ConDiSR for medical image classification, using channel-wise contrastive disentanglement and reconstruction-based style regularization. The method is evaluated on multicenter histopathology image classification, achieving a 1% improvement in average accuracy compared to state-of-the-art SDG baselines. Code is available at https://github.com/BioMedIA-MBZUAI/ConDiSR.

Point correlations for graphics, vision and machine learning

MBZUAI ·

The article discusses the importance of sample correlations in computer graphics, vision, and machine learning, highlighting how tailored randomness can improve the efficiency of existing models. It covers various correlations studied in computer graphics and the tools used to characterize them, including the use of neural networks to develop different correlations. Gurprit Singh from the Max Planck Institute for Informatics will present the topic. Why it matters: Optimizing sampling techniques by understanding and applying correlations can lead to significant advancements and efficiency gains across multiple AI fields.

Dates Fruit Disease Recognition using Machine Learning

arXiv ·

This paper proposes a machine learning method for early detection and classification of date fruit diseases, which are economically important to countries like Saudi Arabia. The method uses a hybrid feature extraction approach combining L*a*b color features, statistical features, and Discrete Wavelet Transform (DWT) texture features. Experiments using a dataset of 871 images achieved the highest average accuracy using Random Forest (RF), Multilayer Perceptron (MLP), Naïve Bayes (NB), and Fuzzy Decision Trees (FDT) classifiers.
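The hybrid feature idea can be sketched as concatenating simple statistical features with wavelet-based texture features. The example below is a minimal illustration under stated assumptions: it uses a one-level Haar wavelet (the summary does not say which wavelet the paper uses), works on a single-channel image, and omits the L*a*b color conversion. The names `haar_dwt2` and `texture_and_stat_features` are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform; returns (approx, row_diff, col_diff, diag)."""
    # Crop to even dimensions so the image tiles into 2x2 blocks.
    img = np.asarray(img, dtype=float)
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2]
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 2    # local average (low-pass band)
    row_diff = (a + b - c - d) / 2  # change between adjacent rows
    col_diff = (a - b + c - d) / 2  # change between adjacent columns
    diag = (a - b - c + d) / 2      # diagonal detail
    return approx, row_diff, col_diff, diag

def texture_and_stat_features(gray):
    """Concatenate [mean, std] with the mean energy of each detail band."""
    gray = np.asarray(gray, dtype=float)
    _, row_diff, col_diff, diag = haar_dwt2(gray)
    stats = [gray.mean(), gray.std()]
    energies = [np.mean(band**2) for band in (row_diff, col_diff, diag)]
    return np.array(stats + energies)
```

The resulting five-element vector could be passed to any standard classifier; in a fuller pipeline the same statistics would be computed per color channel.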