This talk explores modern machine learning through high-dimensional statistics, using random matrix theory to analyze learning models. The speaker, Denny Wu from the University of Toronto and the Vector Institute, presents two examples: hyperparameter selection in overparameterized models and gradient-based representation learning in neural networks. The analysis reveals insights such as the possibility of a negative optimal ridge penalty and the advantages of feature learning over random features. Why it matters: This research provides a deeper theoretical understanding of deep learning phenomena, with potential implications for optimizing training and improving model performance in the region.
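To make the idea of a negative ridge penalty concrete, here is a minimal sketch that is not taken from the talk. It assumes the ridge estimator is defined by its closed form in the overparameterized case (more features than samples), which stays well defined for any penalty greater than minus the smallest eigenvalue of `X @ X.T`. The function name `ridge_closed_form` and the argument `lam` are hypothetical, chosen for illustration only.

```python
import numpy as np


def ridge_closed_form(X, y, lam):
    """Ridge estimator in its closed (dual) form: X^T (X X^T + lam * I)^{-1} y.

    Assumption for this sketch: the estimator is *defined* by this formula,
    so lam may be negative as long as X X^T + lam * I stays positive definite.
    """
    n = X.shape[0]
    gram = X @ X.T
    # eigvalsh returns eigenvalues in ascending order; index 0 is the smallest.
    min_eig = np.linalg.eigvalsh(gram)[0]
    if lam <= -min_eig:
        raise ValueError("lam must exceed minus the smallest eigenvalue of X X^T")
    return X.T @ np.linalg.solve(gram + lam * np.eye(n), y)
```

Setting `lam = 0` recovers the minimum-norm interpolator, and a small negative `lam` still returns a finite solution. Whether a negative value is actually optimal depends on the data model analyzed in the talk, which this sketch does not reproduce.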
An associate professor of Statistics at the University of Toronto gave a talk on how ensemble learning stabilizes and improves the generalization performance of an individual interpolator. The talk focused on bagged linear interpolators and introduced the multiplier-bootstrap-based bagged least-squares estimator. The multiplier bootstrap encompasses the classical bootstrap with replacement as a special case, along with a Bernoulli bootstrap variant. Why it matters: Although the talk was given at MBZUAI, its subject, ensemble learning, is a core technique for improving AI model performance and is of general interest to the AI research community.
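The relationship between the bootstrap variants can be sketched in a few lines. The code below is an illustration under stated assumptions, not the speaker's implementation: each bag draws nonnegative multiplier weights for the samples, fits a weighted minimum-norm least-squares solution, and the bag estimates are averaged. Multinomial weights are assumed to reproduce sampling with replacement, and 0/1 weights to give the Bernoulli variant. The names `multiplier_weights`, `bagged_least_squares`, `scheme`, and `p` are hypothetical.

```python
import numpy as np


def multiplier_weights(n, scheme, rng, p=0.5):
    """Draw one vector of bootstrap multipliers for n samples."""
    if scheme == "classical":
        # Counts from drawing n samples with replacement: the classical bootstrap.
        return rng.multinomial(n, np.full(n, 1.0 / n)).astype(float)
    if scheme == "bernoulli":
        # Each sample is kept independently with probability p.
        return rng.binomial(1, p, size=n).astype(float)
    raise ValueError(f"unknown scheme: {scheme}")


def bagged_least_squares(X, y, n_bags=50, scheme="bernoulli", p=0.5, seed=0):
    """Average weighted minimum-norm least-squares fits over n_bags bags."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_bags):
        s = np.sqrt(multiplier_weights(n, scheme, rng, p))
        # pinv gives the minimum-norm solution of the weighted problem,
        # equal to (X^T W X)^+ X^T W y with W = diag(weights).
        beta += np.linalg.pinv(X * s[:, None]) @ (y * s)
    return beta / n_bags
```

With noise-free data and more samples than features, every bag that retains enough distinct rows recovers the true coefficients, so the average does too. The regime of interest in the talk is the interpolating one, where individual bags differ and averaging them stabilizes the estimate.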
Soufiane Hayou of the National University of Singapore presented a talk at MBZUAI on principled scaling of neural networks. The talk covered leveraging mathematical results to efficiently scale neural networks. He obtained his PhD in statistics in 2021 from Oxford. Why it matters: Understanding neural network scaling is crucial for developing more efficient and powerful AI models in the region.
This talk discusses the asymptotic study of large asymmetric spiked tensor models. It explores connections between these models and equivalent random matrices constructed through contractions of the original tensor. Mohamed El Amine Seddik, currently a senior researcher at TII in Abu Dhabi, presented the work. Why it matters: The research provides theoretical foundations relevant to machine learning algorithms that leverage low-rank tensor structures, potentially impacting AI research and applications in the region.
This article discusses approximating a high-dimensional distribution using Gaussian variational inference by minimizing Kullback-Leibler divergence. It builds upon previous research and approximates the minimizer using a Gaussian distribution with specific mean and variance. The study details approximation accuracy and applicability using efficient dimension, relevant for analyzing sampling schemes in optimization. Why it matters: This theoretical research can inform the development of more efficient and accurate AI algorithms, particularly in areas dealing with high-dimensional data such as machine learning and data analysis.