GCC AI Research

Search

Results for "research"

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

arXiv ·

A new survey paper provides a deep dive into post-training methodologies for Large Language Models (LLMs), analyzing their role in refining LLMs beyond pretraining. It addresses key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs, and highlights emerging directions in model alignment, scalable adaptation, and inference-time reasoning. The paper also provides a public repository to continually track developments in this fast-evolving field.

Fact-Checking Complex Claims with Program-Guided Reasoning

arXiv ·

This paper introduces ProgramFC, a fact-checking model that decomposes complex claims into simpler sub-tasks using a library of functions. The model uses LLMs to generate reasoning programs and executes them by delegating sub-tasks, enhancing explainability and data efficiency. Experiments on fact-checking datasets demonstrate ProgramFC's superior performance compared to baseline methods, with publicly available code and data.

SlimPajama-DC: Understanding Data Combinations for LLM Training

arXiv ·

Researchers at MBZUAI release SlimPajama-DC, an empirical analysis of data combinations for pretraining LLMs using the SlimPajama dataset. The study examines the impact of global vs. local deduplication and the proportions of highly-deduplicated multi-source datasets. Results show that increased data diversity after global deduplication is crucial, with the best configuration outperforming models trained on RedPajama.

Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance

arXiv ·

The paper introduces a framework for camel farm monitoring using a combination of automated annotation and fine-tune distillation. The Unified Auto-Annotation framework uses GroundingDINO and SAM to automatically annotate surveillance video data. The Fine-Tune Distillation framework then fine-tunes student models like YOLOv8, transferring knowledge from a larger teacher model, using data from Al-Marmoom Camel Farm in Dubai.

A Missing and Found Recognition System for Hajj and Umrah

arXiv ·

A proposed recognition system aims to identify missing persons, deceased individuals, and lost objects during the Hajj and Umrah pilgrimages in Saudi Arabia. The system intends to leverage facial recognition and object identification to manage the large crowds expected in the coming decade, estimated to reach 20 million pilgrims. It will be integrated into the CrowdSensing system for crowd estimation, management, and safety.

Spot-the-Camel: Computer Vision for Safer Roads

arXiv ·

Researchers in Saudi Arabia are applying computer vision techniques to reduce Camel-Vehicle Collisions (CVCs). They tested object detection models including CenterNet, EfficientDet, Faster R-CNN, SSD, and YOLOv8 on the task, finding YOLOv8 to be the most accurate and efficient. Future work will focus on developing a system to improve road safety in rural areas.

Machine Learning Advances aiding Recognition and Classification of Indian Monuments and Landmarks

arXiv ·

This paper surveys machine learning approaches using monument pictures for analyzing heritage sites in India. It addresses challenges in the tourism sector, such as the unavailability of trained personnel and the lack of accurate information. The research aims to provide insights for building an automated decision system to modernize the tourism experience for visitors in India.

Dates Fruit Disease Recognition using Machine Learning

arXiv ·

This paper proposes a machine learning method for early detection and classification of date fruit diseases, which are economically important to countries like Saudi Arabia. The method uses a hybrid feature extraction approach combining L*a*b color features, statistical features, and Discrete Wavelet Transform (DWT) texture features. Experiments using a dataset of 871 images achieved the highest average accuracy using Random Forest (RF), Multilayer Perceptron (MLP), Naïve Bayes (NB), and Fuzzy Decision Trees (FDT) classifiers.