This paper introduces a convolutional transformer model for classifying tomato maturity, along with KUTomaData, a new dataset collected in the UAE for training segmentation and classification models. The model combines CNNs and transformers and was evaluated on KUTomaData and two public datasets. The authors report state-of-the-art performance, outperforming existing methods by significant margins in mAP on all three datasets.
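For readers unfamiliar with the mAP metric cited for the tomato model, the sketch below shows one common ranking-based definition: average precision (AP) per class, averaged over classes. It is an illustration only. The paper may use a different variant (for example, an IoU-based detection mAP), and the function names and class labels here are hypothetical.

```python
def average_precision(scores, labels):
    """AP for one class: mean of the precision values at each rank
    where a true positive appears (predictions sorted by score)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    # A class with no positives contributes 0.0 in this sketch.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """per_class maps a class name to a (scores, labels) pair."""
    aps = [average_precision(scores, labels) for scores, labels in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical maturity classes and made-up predictions, for illustration only.
example = {
    "ripe": ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]),
    "unripe": ([0.9, 0.2], [0, 1]),
}
print(mean_average_precision(example))
```
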
MBZUAI researchers have developed an AI program using vision transformers that can learn a person's handwriting style and generate text in that style. The US Patent and Trademark Office recently granted a patent for this technology, which could aid individuals with writing impairments. The system overcomes limitations of previous GAN-based approaches by capturing long-range dependencies in handwriting. Why it matters: This patented AI tool enhances personalized text generation and has potential applications in assistive technology and in improving handwriting recognition models.
The researchers introduce KAU-CSSL, the first continuous Saudi Sign Language (SSL) dataset focusing on complete sentences. They propose a transformer-based model using ResNet-18 for spatial feature extraction and a Transformer Encoder with Bidirectional LSTM for temporal dependencies. The model achieved 99.02% accuracy in signer-dependent mode and 77.71% in signer-independent mode, advancing communication tools for the SSL community.
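The gap between the signer-dependent and signer-independent results for KAU-CSSL comes from how the data is split. The sketch below illustrates the usual distinction: in a signer-dependent split the same people appear in both training and test sets, while a signer-independent split holds out entire signers. This is a minimal illustration under assumed conventions, not the authors' protocol; the record layout (`signer`, `clip`) and the function names are hypothetical.

```python
import random


def signer_dependent_split(samples, test_fraction=0.2, seed=0):
    """Shuffle all clips together, so the same signers can appear
    in both the training and the test set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def signer_independent_split(samples, held_out_signers):
    """Hold out whole signers, so no test signer is seen in training."""
    train = [s for s in samples if s["signer"] not in held_out_signers]
    test = [s for s in samples if s["signer"] in held_out_signers]
    return train, test


# Made-up records: three signers, four clips each.
samples = [
    {"signer": signer, "clip": f"{signer}_{i}"}
    for signer in ("A", "B", "C")
    for i in range(4)
]

train, test = signer_independent_split(samples, held_out_signers={"C"})
train_signers = {s["signer"] for s in train}
test_signers = {s["signer"] for s in test}
print(train_signers.isdisjoint(test_signers))
```

A model evaluated on the second kind of split has to generalize to people it has never seen, which is why signer-independent accuracy is typically the lower and more demanding figure.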