GCC AI Research

Results for "Transformer"

Continuous Saudi Sign Language Recognition: A Vision Transformer Approach

arXiv

The researchers introduce KAU-CSSL, the first continuous Saudi Sign Language (SSL) dataset focusing on complete sentences. They propose a transformer-based model using ResNet-18 for spatial feature extraction and a Transformer Encoder with Bidirectional LSTM for temporal dependencies. The model achieved 99.02% accuracy in signer-dependent mode and 77.71% in signer-independent mode, advancing communication tools for the SSL community.

Transformers of the handwritten word

MBZUAI

MBZUAI researchers have developed an AI program using vision transformers that can learn a person's handwriting style and generate text in that style. The US Patent and Trademark Office recently granted a patent for this technology, which could aid individuals with writing impairments. The system overcomes limitations of previous GAN-based approaches by processing long-range dependencies in handwriting. Why it matters: This patented AI tool enhances personalized text generation and has potential applications in assistive technology and improving handwriting recognition models.

Transformer Models: from Linguistic Probing to Outlier Weights

MBZUAI

Giovanni Puccetti from ISTI-CNR presented research on linguistic probing of language models such as BERT and RoBERTa. The research investigates how well these models encode linguistic properties and links that ability to outlier parameters. He also presented preliminary work on fine-tuning LLMs for Italian and on detecting synthetically generated news. Why it matters: Understanding the inner workings and linguistic capabilities of LLMs is crucial for improving their reliability and adapting them to diverse languages like Arabic.

TerraFM: A Scalable Foundation Model for Unified Multisensor Earth Observation

arXiv

MBZUAI researchers introduce TerraFM, a scalable self-supervised learning model for Earth observation that uses Sentinel-1 and Sentinel-2 imagery. The model unifies radar and optical inputs through modality-specific patch embeddings and adaptive cross-attention fusion. TerraFM achieves strong generalization on classification and segmentation tasks, outperforming prior models on GEO-Bench and Copernicus-Bench.