GCC AI Research

Results for "data science"

Machine Learning Risk Intelligence for Green Hydrogen Investment: Insights for Duqm R3 Auction

arXiv

This paper introduces an AI-driven decision support system for green hydrogen investment in Oman, specifically for the Duqm R3 auction. The system uses publicly available meteorological data to predict maintenance pressure on hydrogen infrastructure, creating a Maintenance Pressure Index (MPI). This tool supports regulatory oversight and operational decision-making by enabling temporal benchmarking against performance claims.

A Feed-Forward Artificial Intelligence Pipeline for Sustainable Desalination under Climate Uncertainties: UAE Insights

arXiv

Researchers developed a two-stage AI pipeline that predicts climate-driven losses in desalination efficiency in the UAE, reporting 98% accuracy. The model first forecasts aerosol optical depth (AOD), then uses that forecast, together with meteorological data, to predict desalination efficiency. The authors also developed a dust-aware control logic to optimize plant operations and an interactive dashboard for decision support.

SlimPajama-DC: Understanding Data Combinations for LLM Training

arXiv

Researchers at MBZUAI release SlimPajama-DC, an empirical analysis of data combinations for pretraining LLMs using the SlimPajama dataset. The study examines the impact of global vs. local deduplication and the proportions of highly-deduplicated multi-source datasets. Results show that increased data diversity after global deduplication is crucial, with the best configuration outperforming models trained on RedPajama.
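
The distinction between local and global deduplication can be illustrated with a minimal sketch. It assumes exact string matching on toy documents and uses hypothetical source names ("web", "code"); it is not the pipeline used in the paper, which this summary does not describe, and real systems typically detect near-duplicates rather than exact copies.

```python
from typing import Dict, List


def local_dedup(sources: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Remove duplicates within each source independently (illustrative only)."""
    result = {}
    for name, docs in sources.items():
        seen = set()  # reset for every source
        kept = []
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                kept.append(doc)
        result[name] = kept
    return result


def global_dedup(sources: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Remove duplicates across all sources, keeping the first occurrence."""
    seen = set()  # shared across every source
    result = {}
    for name, docs in sources.items():
        kept = []
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                kept.append(doc)
        result[name] = kept
    return result


# Hypothetical toy corpus: "b" appears in both sources.
toy_sources = {
    "web": ["a", "b", "a"],
    "code": ["b", "c"],
}

local_result = local_dedup(toy_sources)
global_result = global_dedup(toy_sources)
print(local_result)
print(global_result)
```

In this toy corpus, local deduplication leaves the cross-source copy of "b" in place, while global deduplication removes it, so the combined dataset contains no repeated documents across sources.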

ArabJobs: A Multinational Corpus of Arabic Job Ads

arXiv

The ArabJobs dataset is a new corpus of over 8,500 Arabic job advertisements collected from Egypt, Jordan, Saudi Arabia, and the UAE. The dataset contains over 550,000 words and captures linguistic, regional, and socio-economic variation in the Arab labor market. It is available on GitHub and can be used for fairness-aware Arabic NLP and labor market research.

Proceedings of Symposium on Data Mining Applications 2014

arXiv

The Symposium on Data Mining and Applications (SDMA 2014) was organized by MEGDAM to foster collaboration among data mining and machine learning researchers in Saudi Arabia, the GCC countries, and the Middle East. The symposium covered areas such as statistics, computational intelligence, pattern recognition, databases, big data mining, and visualization. Submissions were accepted based on originality, significance, and quality of contribution.