GCC AI Research

Results for "ResNet-18"

Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis

arXiv ·

This paper introduces a method for automatically designing convolutional neural network (CNN) architectures tailored for diabetic retinopathy (DR) diagnosis from fundus images. The approach uses k-medoid clustering, PCA, and inter/intra-class variations to determine the depth and width of the CNN. Validated on a local Saudi dataset and Kaggle benchmarks, the custom-designed models outperform pre-trained CNNs while using fewer parameters.

Tomato Maturity Recognition with Convolutional Transformers

arXiv ·

This paper introduces a convolutional transformer model for classifying tomato maturity, along with a new UAE-sourced dataset, KUTomaData, for training segmentation and classification models. The model combines CNNs and transformers and was evaluated on KUTomaData and two public datasets. It achieved state-of-the-art performance, outperforming existing methods by significant margins in mAP on all three datasets.

Continuous Saudi Sign Language Recognition: A Vision Transformer Approach

arXiv ·

The researchers introduce KAU-CSSL, the first continuous Saudi Sign Language (SSL) dataset focusing on complete sentences. They propose a transformer-based model using ResNet-18 for spatial feature extraction and a Transformer Encoder with Bidirectional LSTM for temporal dependencies. The model achieved 99.02% accuracy in signer-dependent mode and 77.71% in signer-independent mode, advancing communication tools for the SSL community.
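The gap between the two reported accuracies comes from how the data is split. The sketch below illustrates the general difference between a signer-dependent and a signer-independent split. It is a minimal illustration, not the paper's protocol: the sample layout, the signer IDs, the hold-out rule (`test_every`), and both function names are assumptions made for this example.

```python
from collections import defaultdict


def signer_dependent_split(samples, test_every=5):
    """Hold out every `test_every`-th clip of each signer.

    Every signer appears in both train and test, so the model
    is evaluated on people it has already seen.
    """
    per_signer = defaultdict(list)
    for sample in samples:
        per_signer[sample["signer"]].append(sample)

    train, test = [], []
    for clips in per_signer.values():
        for i, clip in enumerate(clips):
            (test if i % test_every == 0 else train).append(clip)
    return train, test


def signer_independent_split(samples, held_out_signers):
    """Hold out whole signers.

    Test signers never appear in training, so the model is
    evaluated on people it has never seen.
    """
    held_out = set(held_out_signers)
    train = [s for s in samples if s["signer"] not in held_out]
    test = [s for s in samples if s["signer"] in held_out]
    return train, test


# Hypothetical toy data: 3 signers with 10 clips each.
samples = [{"signer": f"S{k}", "clip": i} for k in range(3) for i in range(10)]

dep_train, dep_test = signer_dependent_split(samples)
ind_train, ind_test = signer_independent_split(samples, ["S2"])
```

In the signer-independent case the model cannot rely on signer-specific appearance or motion habits, which is consistent with that mode being the harder of the two.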

MedNNS: Supernet-based Medical Task-Adaptive Neural Network Search

arXiv ·

The paper introduces MedNNS, a neural network search framework designed for medical imaging that addresses the challenges of architecture selection and weight initialization. Using a Supernetwork-based approach that expands the model zoo size by 51x, MedNNS constructs a meta-space that encodes datasets and models according to their performance. The framework incorporates rank loss and Fréchet Inception Distance (FID) loss to capture inter-model and inter-dataset relationships, which improves alignment in the meta-space. It outperforms ImageNet pre-trained deep learning models and state-of-the-art NAS methods.
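To give a sense of what a rank loss does, here is a generic pairwise hinge formulation in plain Python. This is only a sketch of one common form of rank loss; the summary does not state which formulation MedNNS uses, and the function name, the `margin` value, and the toy scores are all assumptions for illustration.

```python
def pairwise_rank_loss(predicted, actual, margin=0.1):
    """Generic pairwise hinge rank loss (not necessarily the MedNNS loss).

    For every pair (i, j) where model i truly performs better than
    model j, penalize the prediction unless predicted[i] exceeds
    predicted[j] by at least `margin`.
    """
    total, pairs = 0.0, 0
    n = len(predicted)
    for i in range(n):
        for j in range(n):
            if actual[i] > actual[j]:
                total += max(0.0, margin - (predicted[i] - predicted[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical true accuracies of three models on one dataset.
actual = [0.8, 0.6, 0.2]

# Predictions that preserve the true ordering incur no loss.
loss_ordered = pairwise_rank_loss([0.9, 0.5, 0.1], actual)

# Predictions that reverse the ordering are penalized.
loss_reversed = pairwise_rank_loss([0.1, 0.5, 0.9], actual)
```

A loss of this kind only cares about the relative order of models, not their absolute scores, which is the property that matters when the goal is to pick the best model for a dataset.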

Hybrid Deep Feature Extraction and ML for Construction and Demolition Debris Classification

arXiv ·

This paper introduces a hybrid deep learning and machine learning pipeline for classifying construction and demolition waste. A dataset of 1,800 images from UAE construction sites was created, and deep features were extracted using a pre-trained Xception network. The combination of Xception features with machine learning classifiers achieved up to 99.5% accuracy, demonstrating state-of-the-art performance for debris identification.