GCC AI Research

Results for "leaderboard"

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

arXiv ·

Researchers from the National Center for AI in Saudi Arabia investigated how sensitive Large Language Model (LLM) leaderboards are to minor benchmark perturbations. They found that small changes, such as the order of answer choices, can shift model rankings by up to 8 positions. The authors recommend hybrid scoring, warn against over-reliance on simple benchmark evaluations, and provide code for further research.
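
The rank-shift measurement described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the function names and the accuracy values are hypothetical, and the sketch simply compares each model's leaderboard position before and after a perturbation.

```python
def rank_positions(scores):
    """Map each model name to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def max_rank_shift(original, perturbed):
    """Largest absolute change in rank for any model between two runs."""
    before = rank_positions(original)
    after = rank_positions(perturbed)
    return max(abs(before[m] - after[m]) for m in before)


# Hypothetical accuracies, invented for illustration only.
original = {"model_a": 0.71, "model_b": 0.69, "model_c": 0.66, "model_d": 0.64}
shuffled_choices = {"model_a": 0.65, "model_b": 0.70, "model_c": 0.67, "model_d": 0.66}

shift = max_rank_shift(original, shuffled_choices)
print(shift)  # model_a drops from 1st to 4th, so the largest shift is 3
```

Reporting the maximum shift rather than the average highlights the worst-affected model, which is the kind of figure the summary quotes.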

Introducing the Open Arabic LLM Leaderboard: Empowering the Arabic Language Modeling Community

TII ·

The Open Arabic LLM Leaderboard (OALL) has been launched to benchmark Arabic language models, addressing the gap in resources for non-English NLP. It incorporates datasets like AlGhafa, ACVA, and translated versions of MMLU and EXAMS from the AceGPT suite. The leaderboard is built around Hugging Face’s LightEval framework and scores tasks by normalized log-likelihood accuracy. Why it matters: This initiative promotes research and development in Arabic NLP, serving over 380 million Arabic speakers by improving how Arabic LLMs are evaluated and developed.
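
As a rough illustration of normalized log-likelihood accuracy, the sketch below picks the answer choice with the highest log-likelihood per character and compares it with the gold label. Dividing by character length is one common normalization and is an assumption here; the exact normalization OALL applies is not stated in the summary. The function names and example values are hypothetical.

```python
def predict(choices, loglikelihoods, normalize=True):
    """Return the index of the best-scoring choice.

    With normalize=True each log-likelihood is divided by the choice's
    character length (an assumed normalization), so longer answers are
    not penalized merely for containing more tokens.
    """
    scores = [
        ll / len(choice) if normalize else ll
        for choice, ll in zip(choices, loglikelihoods)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(examples, normalize=True):
    """Fraction of examples whose predicted choice matches the gold index."""
    correct = sum(
        predict(ex["choices"], ex["loglikelihoods"], normalize) == ex["gold"]
        for ex in examples
    )
    return correct / len(examples)


# Hypothetical model outputs, invented for illustration only.
examples = [
    {"choices": ["yes", "it depends on context"],
     "loglikelihoods": [-4.0, -9.0], "gold": 1},
    {"choices": ["Riyadh", "Jeddah"],
     "loglikelihoods": [-2.0, -5.0], "gold": 0},
]

raw_acc = accuracy(examples, normalize=False)
norm_acc = accuracy(examples, normalize=True)
```

In the first example the longer gold answer has a lower raw log-likelihood but a higher per-character score, so only the normalized variant selects it.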

DomiRank: DERC’s Marcus Engsig Unveils Novel Centrality Metric to Establish System Integrity

TII ·

Marcus Engsig at DERC has developed DomiRank, a new centrality metric that quantifies the dominance of nodes within networks. DomiRank integrates local and global topological information to determine the importance of each node for network stability. The research demonstrates that nodes with high DomiRank values mark vulnerable regions of a network that depend heavily on dominant nodes. Why it matters: This metric can help identify critical infrastructure components and vulnerabilities in complex systems, enhancing resilience against targeted attacks.

CRC Team Places 6th in Global Cyber Security Competition

TII ·

A team from the Cryptography Research Center (CRC) secured 6th place out of 210 teams in the 'Donjon CTF 2021: Capture the Fortress' cybersecurity competition. The competition featured jeopardy-style challenges covering cryptography, reverse engineering, and hardware security. The CRC team participated to improve visibility and assess team capabilities, particularly in hardware security. Why it matters: The CRC team's strong performance highlights the growing cybersecurity expertise in the UAE and its attractiveness for talent in this field.

Leaders—be the impact!

KAUST ·

Fahad Alsherehey, VP at SABIC, spoke at KAUST's Winter Enrichment Program (WEP) about authentic leadership. He cited SABIC's founding as an example of how leadership can turn challenges into opportunities. Alsherehey emphasized the difference between leadership and management, advocating for listening to one's team. Why it matters: The talk highlights the importance of leadership and vision in driving technological and economic development in Saudi Arabia.

TOCKIFY TEST

KAUST ·

The provided content mentions KAUST (King Abdullah University of Science and Technology) and its association with King Abdullah bin Abdulaziz Al Saud. It also includes a copyright notice. Why it matters: This is a routine update reflecting KAUST's branding and legal information.

UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu

arXiv ·

The UrduFake@FIRE2021 shared task focused on fake news detection in the Urdu language, framed as a binary classification problem. Of the 34 teams that registered, 18 submitted results and 11 provided technical reports, showcasing diverse approaches. The top-performing system used the stochastic gradient descent (SGD) algorithm and achieved an F-score of 0.679.
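
For readers unfamiliar with the F-score reported above, the sketch below computes F1 (the harmonic mean of precision and recall) for one positive class in a binary task. Whether the shared task used this variant or an averaged one is not stated in the summary, so treat the choice as an assumption; the labels are hypothetical.

```python
def f1_score(gold, predicted, positive="fake"):
    """F1 for the given positive class: 2 * P * R / (P + R)."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, invented for illustration only.
gold = ["fake", "fake", "real", "real", "fake"]
predicted = ["fake", "real", "real", "fake", "fake"]

score = f1_score(gold, predicted)  # precision = recall = 2/3
```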

Finalists for WEP Poster Competition announced

KAUST ·

KAUST has announced the finalists for its Winter Enrichment Program (WEP) poster competition. The finalists, consisting of graduates, postdoctoral students, and international undergraduates, submitted research posters. The winner will be announced on January 21, 2015, during the WEP award ceremony. Why it matters: Such events promote research excellence and collaboration within KAUST and the broader academic community, fostering innovation and knowledge sharing.