MBZUAI PhD graduate William de Vazelhes researches hard-thresholding algorithms that help AI work from smaller datasets. His work focuses on optimization algorithms that keep only the most important components of a solution and discard the rest, which makes models easier to analyze and is useful for saving energy and deploying AI on low-memory devices. He demonstrated that his approach can obtain results similar to those of convex algorithms in many common settings. Why it matters: This research could broaden AI accessibility by reducing computational costs, and it has potential applications in sectors like finance, particularly portfolio management under budgetary constraints.
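
To make the idea concrete, here is a minimal sketch of the hard-thresholding operator itself: it keeps the k largest-magnitude entries of a vector and zeroes the rest. The function name `hard_threshold` and the toy portfolio weights are illustrative assumptions, not code from the research described above.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the others to 0.0."""
    if k <= 0:
        return [0.0] * len(x)
    # Indices sorted by decreasing absolute value; ties keep their original order.
    by_magnitude = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(by_magnitude[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


# Hypothetical example: restrict a four-asset weight vector to two positions.
weights = [0.5, -3.0, 1.2, 0.1]
sparse_weights = hard_threshold(weights, 2)
print(sparse_weights)
```

In a sparse optimization loop this operator is typically applied after each gradient step, so every iterate respects the budget of at most k non-zero entries.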
MBZUAI researchers presented a new strategy for handling complex optimization problems in machine learning at ICLR 2024. The study, a collaboration with ISAM, combines zeroth-order methods, which rely on function evaluations rather than gradients, with hard-thresholding to address specific settings in machine learning. The approach aims to improve convergence so that algorithms reach high-quality solutions efficiently. Why it matters: Improving optimization techniques is crucial for advancing machine learning models used in various applications, potentially accelerating development and enhancing performance.
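
The combination can be sketched as follows, under stated assumptions: the gradient is approximated from function values only, and each step is followed by hard-thresholding. For reproducibility this sketch swaps in a deterministic coordinate-wise finite-difference estimator rather than a randomized one, so it is not the estimator from the ICLR 2024 paper. All names (`fd_gradient`, `zeroth_order_iht`, `loss`, `target`) and parameter values are hypothetical.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the others to 0.0."""
    by_magnitude = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(by_magnitude[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def fd_gradient(f, x, mu=1e-6):
    """Forward-difference gradient estimate that only calls f (no derivatives)."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += mu
        grad.append((f(shifted) - fx) / mu)
    return grad


def zeroth_order_iht(f, dim, k, step=0.25, iters=50):
    """Gradient-free descent step followed by projection onto k-sparse vectors."""
    x = [0.0] * dim
    for _ in range(iters):
        g = fd_gradient(f, x)
        x = hard_threshold([xi - step * gi for xi, gi in zip(x, g)], k)
    return x


# Toy black-box objective whose minimizer has two non-zero entries (assumed).
target = [3.0, 0.0, 0.0, -2.0, 0.0]


def loss(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


solution = zeroth_order_iht(loss, dim=5, k=2)
print(solution)
```

On this toy quadratic the iterates settle on the two non-zero coordinates of `target`; harder objectives would generally need a tuned step size and a sparsity budget larger than the true support.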
This talk discusses the asymptotic study of large asymmetric spiked tensor models. It explores connections between these models and equivalent random matrices constructed through contractions of the original tensor. Mohamed El Amine Seddik, currently a senior researcher at TII in Abu Dhabi, presented the work. Why it matters: The research provides theoretical foundations relevant to machine learning algorithms that leverage low-rank tensor structures, potentially impacting AI research and applications in the region.
This paper addresses exploration in reinforcement learning (RL) in unknown environments with sparse rewards, focusing on maximum entropy exploration. It introduces a game-theoretic algorithm for visitation entropy maximization with improved sample complexity of O(H^3S^2A/ε^2). For trajectory entropy, the paper presents an algorithm with O(poly(S, A, H)/ε) complexity, showing the statistical advantage of regularized MDPs for exploration. Why it matters: The research offers new techniques to reduce the sample complexity of RL, potentially enhancing the efficiency of AI agents in complex environments.
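
As a rough illustration of the quantity being maximized, the sketch below computes the Shannon entropy of an empirical state-visitation distribution from sampled trajectories. It is a simplification that pools all time steps into one distribution and ignores actions; the function name `visitation_entropy` and the integer-encoded states are assumptions, not the paper's algorithm.

```python
import math
from collections import Counter


def visitation_entropy(trajectories):
    """Shannon entropy (in nats) of the empirical state-visitation distribution."""
    counts = Counter(state for traj in trajectories for state in traj)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# A policy that spreads its visits over four states scores higher than one
# that stays in a single state.
spread_out = visitation_entropy([[0, 1], [2, 3]])
stuck = visitation_entropy([[0, 0, 0]])
print(spread_out, stuck)
```

An exploration policy with higher visitation entropy covers the state space more evenly, which is the behavior wanted when rewards are sparse.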
This talk explores modern machine learning through high-dimensional statistics, using random matrix theory to analyze learning models. The speaker, Denny Wu from the University of Toronto and the Vector Institute, presents two examples: hyperparameter selection in overparameterized models and gradient-based representation learning in neural networks. The analysis reveals insights such as the possibility of a negative optimal ridge penalty and the advantages of feature learning over random features. Why it matters: This research provides a deeper theoretical understanding of deep learning phenomena, with potential implications for optimizing training and improving model performance in the region.
Researchers developed a data-driven toolkit for short-term traffic forecasting using high-resolution traffic data from urban road sensors. The method models forecasting as a matrix completion problem, mapping inputs to a higher-dimensional space using kernels and adaptive boosting. Validated using real-world data from Abu Dhabi, UAE, the method outperforms state-of-the-art algorithms.
This paper introduces neural Bayes estimators for censored peaks-over-threshold models, enhancing computational efficiency in spatial extremal dependence modeling. The method uses data augmentation to encode censoring information in the neural network input, challenging traditional likelihood-based approaches. The estimators were applied to assess extreme particulate matter concentrations over Saudi Arabia, demonstrating efficacy in high-dimensional models. Why it matters: The research offers a computationally efficient alternative for environmental modeling and risk assessment in the region.
This article discusses a talk by Gábor Lugosi on "network archaeology," specifically the problems of root finding and broadcasting in large networks. The talk addresses discovering the past of dynamically growing networks when only a present-day snapshot is observed. Lugosi's research interests include machine learning theory, nonparametric statistics, and random structures. Why it matters: Understanding the evolution and origins of networks is crucial for various applications, including analyzing social networks, biological systems, and the spread of information.
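
One estimator known from the root-finding literature on randomly growing trees ranks each vertex by the size of the largest component left when that vertex is removed, and returns the best-ranked vertices as root candidates. The sketch below illustrates that idea on a tree snapshot; it is not necessarily the method presented in the talk, and the function name `root_candidates` and the edge-list input format are assumptions.

```python
def root_candidates(edges, n, k):
    """Return the k vertices of a tree whose removal leaves the smallest largest component."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative depth-first traversal from vertex 0 to record parents and visit order.
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                parent[w] = u
                stack.append(w)

    # Subtree sizes, accumulated from the leaves upward.
    size = [1] * n
    for u in reversed(order):
        if parent[u] != -1:
            size[parent[u]] += size[u]

    # Score: largest component after deleting v (child subtrees or the parent side).
    score = []
    for v in range(n):
        largest = n - size[v]
        for w in adj[v]:
            if parent[w] == v:
                largest = max(largest, size[w])
        score.append(largest)

    return sorted(range(n), key=lambda v: (score[v], v))[:k]


# On the path 0-1-2-3-4 the middle vertex is the most central candidate.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(root_candidates(path_edges, n=5, k=3))
```

Returning a set of k candidates instead of a single vertex reflects the usual framing of the problem: the goal is a small set that contains the true root with high probability.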