GCC AI Research

Results for "SwiftFormer"

Early and Accurate Detection of Tomato Leaf Diseases Using TomFormer

arXiv

Researchers introduce TomFormer, a transformer-based model for accurate and early detection of tomato leaf diseases, with the goal of deployment on the Hello Stretch robot for real-time diagnosis. TomFormer combines a visual transformer and CNN, achieving state-of-the-art results on KUTomaDATA, PlantDoc, and PlantVillage datasets. KUTomaDATA was collected from a greenhouse in Abu Dhabi, UAE.

AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic

arXiv

The paper introduces AraModernBERT, an adaptation of the ModernBERT encoder architecture for Arabic, focusing on transtokenized embedding initialization and long-context modeling up to 8,192 tokens. Transtokenization is shown to be crucial for Arabic language modeling, significantly enhancing masked language modeling performance. The model demonstrates stable and effective long-context modeling, improving intrinsic language modeling performance at extended sequence lengths. Why it matters: This research provides practical insights for adapting modern encoder architectures to Arabic and other languages using Arabic-derived scripts, advancing Arabic NLP.

Transformers of the handwritten word

MBZUAI

MBZUAI researchers have developed an AI program using vision transformers that can learn a person's handwriting style and generate text in that style. The US Patent and Trademark Office recently granted a patent for this technology, which could aid individuals with writing impairments. The system overcomes limitations of previous GAN-based approaches by processing long-range dependencies in handwriting. Why it matters: This patented AI tool enhances personalized text generation and has potential applications in assistive technology and improving handwriting recognition models.

Beyond Attention: Orchid’s Adaptive Convolutions for Next-Level Sequence Modeling

MBZUAI

Orchid is a new neural network architecture that uses adaptive convolutions to achieve quasilinear computational complexity, O(N log N), for sequence modeling. Its convolution kernel adapts dynamically to the input sequence. Evaluations across language modeling and image classification show that Orchid outperforms attention-based architectures like BERT and Vision Transformers, often with smaller model sizes. Why it matters: Orchid extends the feasible sequence length beyond the practical limits of dense attention layers, representing progress toward more efficient and scalable deep learning models.
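
To illustrate where an O(N log N) cost can come from, the sketch below computes a sequence-length circular convolution through the FFT, with a kernel that depends on the input. It is a toy under stated assumptions, not Orchid's actual method: `toy_adaptive_kernel` is a hypothetical stand-in for whatever conditioning Orchid uses, and the sequence length is assumed to be a power of two.

```python
import cmath
import math


def fft(a, invert=False):
    """Recursive radix-2 FFT. len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for i in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * i / n)
        out[i] = even[i] + w * odd[i]
        out[i + n // 2] = even[i] - w * odd[i]
    return out


def circular_conv_fft(x, k):
    """Circular convolution via the convolution theorem: O(N log N)."""
    n = len(x)
    product = [p * q for p, q in zip(fft(x), fft(k))]
    # The unnormalized inverse transform needs a 1/n scale factor.
    return [v.real / n for v in fft(product, invert=True)]


def circular_conv_direct(x, k):
    """Reference O(N^2) circular convolution, used only for checking."""
    n = len(x)
    return [sum(x[j] * k[(i - j) % n] for j in range(n)) for i in range(n)]


def toy_adaptive_kernel(x, base_kernel):
    """Hypothetical input-dependent kernel (an assumption, not Orchid's):
    scale a fixed kernel by a sigmoid gate of the input mean."""
    gate = 1.0 / (1.0 + math.exp(-sum(x) / len(x)))
    return [gate * b for b in base_kernel]


def adaptive_conv(x, base_kernel):
    """Convolve x with a kernel that is itself computed from x."""
    return circular_conv_fft(x, toy_adaptive_kernel(x, base_kernel))
```

Because the kernel spans the whole sequence, every output position can depend on every input position, as with dense attention, while the cost grows as N log N rather than N squared.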

Tomato Maturity Recognition with Convolutional Transformers

arXiv

This paper introduces a convolutional transformer model for classifying tomato maturity, along with a new UAE-sourced dataset, KUTomaData, for training segmentation and classification models. The model combines CNNs and transformers and was evaluated on KUTomaData and two public datasets. It achieved state-of-the-art performance, outperforming existing methods by significant margins in mAP scores across all three datasets.

Self-supervised DNA models and scalable sequence processing with memory augmented transformers

MBZUAI

Dr. Mikhail Burtsev of the London Institute presented research on GENA-LM, a suite of transformer-based DNA language models. The talk addressed the challenge of scaling transformers for genomic sequences, proposing recurrent memory augmentation to handle long input sequences efficiently. This approach improves language modeling performance and holds promise for memory-intensive applications in bioinformatics. Why it matters: This research can significantly advance AI's capabilities in genomics by enabling the processing of much larger DNA sequences, with potential breakthroughs in understanding and treating diseases.