GCC AI Research

Search

Results for "Temporal Information"

Modeling Complex Object Changes in Satellite Image Time-Series: Approach based on CSP and Spatiotemporal Graph

arXiv ·

This paper introduces a novel approach for monitoring and analyzing the evolution of complex geographic objects in satellite image time-series. The method uses a spatiotemporal graph and constraint satisfaction problems (CSP) to model and analyze object changes. Experiments on real-world satellite images from Saudi Arabian cities demonstrate the effectiveness of the proposed approach.

Learning Time-Series Representations by Hierarchical Uniformity-Tolerance Latent Balancing

arXiv ·

The paper introduces TimeHUT, a new method for learning time-series representations using hierarchical uniformity-tolerance balancing of contrastive representations. TimeHUT employs a hierarchical setup to learn both instance-wise and temporal information, along with a temperature scheduler to balance uniformity and tolerance. The method was evaluated on UCR, UAE, Yahoo, and KPI datasets, demonstrating superior performance in classification tasks and competitive results in anomaly detection.

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

arXiv ·

FancyVideo, a new video generator, introduces a Cross-frame Textual Guidance Module (CTGM) to enhance text-to-video models. CTGM uses a Temporal Information Injector and Temporal Affinity Refiner to achieve frame-specific textual guidance, improving comprehension of temporal logic. Experiments on the EvalCrafter benchmark demonstrate FancyVideo's state-of-the-art performance in generating dynamic and consistent videos, also supporting image-to-video tasks.

TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation

arXiv ·

Researchers at MBZUAI have introduced TiBiX, a novel approach leveraging temporal information from previous chest X-rays (CXRs) and reports for bidirectional generation of current CXRs and reports. TiBiX addresses two key challenges: generating current images from previous images and reports, and generating current reports from both previous and current images. The study also introduces a curated temporal benchmark dataset derived from the MIMIC-CXR dataset and achieves state-of-the-art results in report generation.

A Novel CNN-LSTM-based Approach to Predict Urban Expansion

arXiv ·

This paper introduces a novel two-step method for predicting urban expansion using time-series satellite imagery. The approach combines semantic image segmentation with a CNN-LSTM model to learn temporal features. Experiments on satellite images from Riyadh, Jeddah, and Dammam in Saudi Arabia demonstrate improved performance compared to existing methods based on Mean Square Error, Root Mean Square Error, Peak Signal to Noise Ratio, Structural Similarity Index, and overall classification accuracy.

VideoMolmo: Spatio-Temporal Grounding Meets Pointing

arXiv ·

Researchers from MBZUAI have introduced VideoMolmo, a large multimodal model for spatio-temporal pointing conditioned on textual descriptions. The model incorporates a temporal module with an attention mechanism and a temporal mask fusion pipeline using SAM2 for improved coherence across video sequences. They also curated a dataset of 72k video-caption pairs and introduced VPoS-Bench, a benchmark for evaluating generalization across real-world scenarios, with code and models publicly available.

Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts

arXiv ·

Researchers introduce TimeTravel, a benchmark dataset for evaluating large multimodal models (LMMs) on historical and cultural artifacts. The benchmark comprises 10,250 expert-verified samples across 266 cultures and 10 historical regions, designed to assess AI in tasks like classification and interpretation of manuscripts, artworks, inscriptions, and archaeological discoveries. The goal is to establish AI as a reliable partner in preserving cultural heritage and assisting researchers.