
Results for "texture analysis"

Fine-tuning Text-to-Image Models: Reinforcement Learning and Reward Over-Optimization

MBZUAI

The article discusses research on fine-tuning text-to-image diffusion models, including reward function training, online reinforcement learning (RL) fine-tuning, and addressing reward over-optimization. A Text-Image Alignment Assessment (TIA2) benchmark is introduced to study reward over-optimization. TextNorm, a method for confidence calibration in reward models, is presented to reduce over-optimization risks. Why it matters: Improving the alignment and fidelity of text-to-image models is crucial for generating high-quality content, and addressing over-optimization enhances the reliability of these models in creative applications.

Point correlations for graphics, vision and machine learning

MBZUAI

The article discusses the importance of sample correlations in computer graphics, vision, and machine learning, highlighting how tailored randomness can improve the efficiency of existing models. It covers the correlations studied in computer graphics and the tools used to characterize them, including neural networks for developing different correlations. Gurprit Singh from the Max Planck Institute for Informatics will be presenting on the topic. Why it matters: Understanding and applying sample correlations to optimize sampling techniques can lead to significant efficiency gains across multiple AI fields.

Dates Fruit Disease Recognition using Machine Learning

arXiv · Nov 17

This paper proposes a machine learning method for early detection and classification of date fruit diseases, which are economically important to countries like Saudi Arabia. The method uses a hybrid feature extraction approach combining L*a*b color features, statistical features, and Discrete Wavelet Transform (DWT) texture features. Experiments using a dataset of 871 images achieved the highest average accuracy using Random Forest (RF), Multilayer Perceptron (MLP), Naïve Bayes (NB), and Fuzzy Decision Trees (FDT) classifiers.
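As a rough illustration of the DWT texture features mentioned in the summary above, the sketch below computes one level of a 2D Haar wavelet transform and takes the mean energy of each detail sub-band as a texture descriptor. The summary does not state which wavelet, decomposition level, or statistics the paper uses, so the Haar wavelet, the energy statistic, and the function names `haar_dwt2`, `subband_energy`, and `dwt_texture_features` are all assumptions made for illustration only.

```python
def haar_dwt2(img):
    """One level of a 2D Haar transform on a grayscale image.

    `img` is a list of equal-length rows with even height and width.
    Returns (approx, row_detail, col_detail, diag_detail), each half-size.
    The Haar wavelet is an assumption; the paper's wavelet is not stated.
    """
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            # The 2x2 block of pixels covered by this output coefficient.
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2)  # local average
            r_row.append((a + b - d - e) / 2)  # change between rows
            c_row.append((a - b + d - e) / 2)  # change between columns
            d_row.append((a - b - d + e) / 2)  # diagonal change
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


def subband_energy(band):
    """Mean squared coefficient of a sub-band (an assumed texture statistic)."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)


def dwt_texture_features(img):
    """Hypothetical 3-element texture vector: energy of each detail sub-band."""
    _, row_detail, col_detail, diag_detail = haar_dwt2(img)
    return [
        subband_energy(row_detail),
        subband_energy(col_detail),
        subband_energy(diag_detail),
    ]
```

In a hybrid pipeline of the kind the summary describes, a vector like this would be concatenated with color and statistical features before being passed to a classifier. A uniform image yields zero energy in every detail sub-band, while vertical stripes concentrate energy in the between-columns sub-band.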

Modeling Text as a Living Object

MBZUAI

The InterText project, funded by the European Research Council, aims to advance NLP by developing a framework for modeling fine-grained relationships between texts. This approach enables tracing the origin and evolution of texts and ideas. Iryna Gurevych from the Technical University of Darmstadt presented the intertextual approach to NLP, covering data modeling, representation learning, and practical applications. Why it matters: This research could enable a new generation of AI applications for text work and critical reading, with potential applications in collaborative knowledge construction and document revision assistance.

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

arXiv · Dec 22

The paper introduces the Prism Hypothesis, which posits a correspondence between an encoder's feature spectrum and its functional role, with semantic encoders capturing low-frequency components and pixel encoders retaining high-frequency information. Based on this, the authors propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details using a frequency-band modulator. Experiments on ImageNet and MS-COCO demonstrate that UAE effectively unifies semantic abstraction and pixel-level fidelity, achieving state-of-the-art performance.

Making sense of space and time in video

MBZUAI

MBZUAI researchers led by Syed Talal Wasim presented a new approach to video analysis at ICCV in Paris. The approach builds on still-image processing techniques such as focal modulation to analyze spatial and temporal information in video separately. It aims to improve temporal aggregation while avoiding the computational complexity of transformers. Why it matters: This research advances video understanding in computer vision by offering a more efficient method for temporal modeling, which is crucial for applications such as activity recognition and video surveillance.