GCC AI Research

Results for "leaderboard"

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

arXiv ·

Researchers from the National Center for AI in Saudi Arabia investigated how sensitive Large Language Model (LLM) leaderboards are to minor benchmark perturbations. They found that small changes, such as the order of answer choices, can shift rankings by up to 8 positions. The study recommends hybrid scoring, warns against over-reliance on simple benchmark evaluations, and provides code for further research.
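As a rough illustration of how such a ranking shift could be measured, the sketch below compares each model's leaderboard position before and after a perturbation. The function names, model names, and scores are all hypothetical and are not taken from the paper or its released code.

```python
def rank_positions(scores):
    """Map each model name to its 1-based rank (higher score ranks first)."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def max_rank_shift(baseline_scores, perturbed_scores):
    """Largest absolute change in rank for any model between two score tables."""
    base = rank_positions(baseline_scores)
    pert = rank_positions(perturbed_scores)
    return max(abs(base[model] - pert[model]) for model in base)


# Hypothetical accuracies with the original and a shuffled answer-choice order.
baseline = {"model_a": 0.71, "model_b": 0.69, "model_c": 0.66, "model_d": 0.60}
shuffled = {"model_a": 0.64, "model_b": 0.70, "model_c": 0.67, "model_d": 0.61}

print(max_rank_shift(baseline, shuffled))  # model_a drops from 1st to 3rd, so 2
```

In this made-up data a score change of a few points is enough to reorder the top three models, which is the kind of instability the summary describes.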

Introducing the Open Arabic LLM Leaderboard: Empowering the Arabic Language Modeling Community

TII ·

The Open Arabic LLM Leaderboard (OALL) has been launched to benchmark Arabic language models, addressing the gap in resources for non-English NLP. It incorporates datasets such as AlGhafa and ACVA, along with translated versions of MMLU and EXAMS from the AceGPT suite. The leaderboard is built around HuggingFace’s LightEval framework and scores tasks using normalized log-likelihood accuracy. Why it matters: By improving how Arabic LLMs are evaluated, this initiative promotes research and development in Arabic NLP for the more than 380 million Arabic speakers.
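To make the scoring idea concrete, here is a minimal sketch of length-normalized log-likelihood accuracy for multiple-choice questions. It assumes normalization by the character count of each answer choice, which is one common convention. Whether OALL or LightEval uses exactly this form is an assumption, and the function names and data are hypothetical.

```python
def pick_choice(loglikelihoods, choices, normalize=True):
    """Return the index of the highest-scoring choice.

    With normalize=True each log-likelihood is divided by the choice's
    character length (an assumed convention), so longer answers are not
    penalized simply for containing more tokens.
    """
    scores = [
        ll / max(len(choice), 1) if normalize else ll
        for ll, choice in zip(loglikelihoods, choices)
    ]
    return max(range(len(choices)), key=lambda i: scores[i])


def normalized_accuracy(examples):
    """Fraction of examples where the normalized pick matches the gold index."""
    correct = sum(
        pick_choice(ex["loglikelihoods"], ex["choices"]) == ex["gold"]
        for ex in examples
    )
    return correct / len(examples)


# Hypothetical model outputs: summed log-likelihood of each answer choice.
examples = [
    {"choices": ["no", "yes, definitely"], "loglikelihoods": [-4.0, -8.0], "gold": 1},
    {"choices": ["cat", "dog"], "loglikelihoods": [-3.0, -6.0], "gold": 1},
]

print(normalized_accuracy(examples))
```

In the first example the raw log-likelihood favors the short answer "no", while the normalized score favors the longer gold answer, which is the effect normalization is meant to have.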

Get in the innovation game

KAUST ·

KAUST held an Innovation & Economic Development Open House event on October 4 and 5. The event showcased industry partners in the KAUST Innovation Cluster, including Dow Chemical, SABIC, Saudi Aramco, and startups like FalconViz and NOMADD. Student groups like the Entrepreneurship Business & Innovation Group (eBIG) also participated, highlighting efforts to foster innovation within the KAUST community. Why it matters: This event demonstrates KAUST's ongoing commitment to fostering entrepreneurship and translating research into real-world applications, aligning with Saudi Arabia's broader economic diversification goals.

KAUST Ph.D. student wins Society for Industrial and Applied Mathematics award

KAUST ·

KAUST Ph.D. student Chiheb Ben Hammouda won the best poster award at the Society for Industrial and Applied Mathematics Conference on Financial Mathematics & Engineering (FM19) for his work on option pricing under the rough Bergomi model. The winning poster, titled "Hierarchical adaptive sparse grids and quasi-Monte Carlo for option pricing under the rough Bergomi model," details research carried out under the supervision of KAUST Professor Raul Tempone. The research group designed new efficient numerical methods for pricing derivatives under the rough Bergomi model by combining smoothing techniques. Why it matters: This award highlights KAUST's growing expertise in financial mathematics and its contribution to solving complex problems in the field using advanced numerical methods.

CRC Team Places 6th in Global Cyber Security Competition

TII ·

A team from the Cryptography Research Center (CRC) secured 6th place out of 210 teams in the 'Donjon CTF 2021: Capture the Fortress' cybersecurity competition. The competition featured jeopardy-style challenges covering cryptography, reverse engineering, and hardware security. The CRC team participated to raise its visibility and to assess its capabilities, particularly in hardware security. Why it matters: The CRC team's strong performance highlights the growing cybersecurity expertise in the UAE and the country's appeal to talent in this field.

Hybrid Deep Feature Extraction and ML for Construction and Demolition Debris Classification

arXiv ·

This paper introduces a hybrid deep learning and machine learning pipeline for classifying construction and demolition waste. A dataset of 1,800 images from UAE construction sites was created, and deep features were extracted using a pre-trained Xception network. The combination of Xception features with machine learning classifiers achieved up to 99.5% accuracy, demonstrating state-of-the-art performance for debris identification.