KAUST researchers introduced MOLE, a framework that uses LLMs to automatically extract metadata from scientific papers. The system processes documents in multiple formats and validates its outputs, and it covers datasets in languages other than Arabic. A new benchmark dataset has been released to evaluate progress in metadata extraction.
Researchers created Masader, the largest public catalog of Arabic NLP datasets, containing 200 datasets annotated with 25 attributes. They developed a metadata annotation strategy applicable to other languages. The paper highlights issues in current Arabic NLP datasets and offers recommendations. Why it matters: This curated dataset directory helps lower the barrier to entry for Arabic NLP research and development.
The Symposium on Data Mining and Applications (SDMA 2014) was organized by MEGDAM to foster collaboration among data mining and machine learning researchers in Saudi Arabia, GCC countries, and the Middle East. The symposium covered areas such as statistics, computational intelligence, pattern recognition, databases, big data mining, and visualization. Acceptance was based on originality, significance, and quality of contribution.
This article previews a talk by Dr. Wei Cai of CUHK-Shenzhen on the history, development, and future trends of the Web3 metaverse. The talk will cover industrial Web3 metaverse cases, recent research outcomes, and the metaverse research spectrum. Dr. Cai's research interests include blockchain, Web 3.0, digital games, and computational art. Why it matters: As metaverse technologies continue to evolve, understanding the Web3 perspective and research directions is important for regional AI and technology development.
Researchers in Saudi Arabia have developed a deep learning framework for automated counting and geolocation of palm trees in aerial images. The system uses a Faster R-CNN model trained on a dataset of 10,000 palm tree instances collected in the Kharj region using DJI drones. Using geotagged metadata and photogrammetry techniques, it achieved a geolocation accuracy of 2.8 m.
This paper introduces BRIQA, a new method for automated assessment of artifact severity in pediatric brain MRI, a task that matters for diagnostic accuracy. BRIQA uses gradient-based loss reweighting and a rotating batching scheme to handle class imbalance across artifact severity levels. Experiments show that BRIQA improves the average macro F1 score from 0.659 to 0.706, with gains especially for Noise, Zipper, Positioning, and Contrast artifacts.
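For readers unfamiliar with the metric, the sketch below shows how a macro F1 score is computed and why it suits imbalanced severity labels: each class contributes equally to the average, so a rare class that is always missed pulls the score down. This is a minimal illustration of the standard definition, not the paper's evaluation code; the function name `macro_f1` and the integer severity labels are assumptions made for the example.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard macro F1).

    Hypothetical helper for illustration; labels can be any hashable values.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Assumed toy labels: severity 0 dominates, severity 1 is rare and missed.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 2]
score = macro_f1(y_true, y_pred)
```

In this toy case five of six predictions are correct, yet the macro F1 is noticeably lower than the accuracy because the missed rare class scores zero and counts as much as the common one.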
A new paper from MBZUAI researchers explores using ChatGPT to combat the spread of fake news. The researchers, including Preslav Nakov and Liangming Pan, demonstrate that ChatGPT can be used to fact-check published information. Their paper, "Fact-Checking Complex Claims with Program-Guided Reasoning," was accepted at ACL 2023. Why it matters: This research highlights the potential of large language models to address the growing challenge of misinformation, with implications for maintaining information integrity in the digital age.