Results for "heavy-tailed distributions"

SGD from the Lens of Markov process: An Algorithmic Stability Perspective

MBZUAI

A Marie Curie Fellow from Inria and UIUC presented research on stochastic gradient descent (SGD) through the lens of Markov processes, exploring the relationships between heavy-tailed distributions, generalization error, and algorithmic stability. The research challenges existing theories about the monotonic relationship between heavy tails and generalization error. It introduces a unified approach for proving Wasserstein stability bounds in stochastic optimization, applicable to convex and non-convex losses. Why it matters: The work provides novel insights into the theoretical underpinnings of stochastic optimization, relevant to researchers at MBZUAI and other institutions in the region working on machine learning algorithms.

Neural Bayes estimators for censored inference with peaks-over-threshold models

arXiv · Jun 27

This paper introduces neural Bayes estimators for censored peaks-over-threshold models, enhancing computational efficiency in spatial extremal dependence modeling. The method uses data augmentation to encode censoring information in the neural network input, challenging traditional likelihood-based approaches. The estimators were applied to assess extreme particulate matter concentrations over Saudi Arabia, demonstrating efficacy in high-dimensional models. Why it matters: The research offers a computationally efficient alternative for environmental modeling and risk assessment in the region.

Gaussian Variational Inference in high dimension

MBZUAI

This article discusses approximating a high-dimensional distribution using Gaussian variational inference by minimizing Kullback-Leibler divergence. It builds upon previous research and approximates the minimizer using a Gaussian distribution with specific mean and variance. The study details approximation accuracy and applicability using efficient dimension, relevant for analyzing sampling schemes in optimization. Why it matters: This theoretical research can inform the development of more efficient and accurate AI algorithms, particularly in areas dealing with high-dimensional data such as machine learning and data analysis.

Adapting to Distribution Shifts: Recent Advances in Importance Weighting Methods

MBZUAI

This article discusses distribution shifts in machine learning and the use of importance weighting methods to address them. Masashi Sugiyama from the University of Tokyo and RIKEN AIP presented recent advances in importance-based distribution shift adaptation methods. The talk covered joint importance-predictor estimation, dynamic importance weighting, and multistep class prior shift adaptation. Why it matters: Understanding and mitigating distribution shifts is crucial for deploying robust and reliable AI models in real-world scenarios within the GCC region and beyond.

Self-supervised DNA models and scalable sequence processing with memory augmented transformers

MBZUAI

Dr. Mikhail Burtsev of the London Institute presented research on GENA-LM, a suite of transformer-based DNA language models. The talk addressed the challenge of scaling transformers for genomic sequences, proposing recurrent memory augmentation to handle long input sequences efficiently. This approach improves language modeling performance and holds promise for memory-intensive applications in bioinformatics. Why it matters: This research can significantly advance AI's capabilities in genomics by enabling the processing of much larger DNA sequences, with potential breakthroughs in understanding and treating diseases.

Scalable Community Detection in Massive Networks Using Aggregated Relational Data

MBZUAI

A new mini-batch strategy using aggregated relational data is proposed to fit the mixed membership stochastic blockmodel (MMSB) to large networks. The method uses nodal information and stochastic gradients of bipartite graphs for scalable inference. The approach was applied to a citation network with over two million nodes and 25 million edges, capturing explainable structure. Why it matters: This research enables more efficient community detection in massive networks, which is crucial for analyzing complex relationships in various domains. The article has no clear connection to the Middle East.

Bruteforce computing is the next “winter of AI”

MBZUAI

Prof. Mérouane Debbah of the Technology Innovation Institute (TII) warns that current AI development relies on unsustainable, energy-intensive "bruteforce computing." He argues that the field needs more energy-efficient algorithms instead of simply scaling up GPUs. Debbah suggests neuromorphic computing as a potential solution, drawing inspiration from the human brain's energy efficiency. Why it matters: This critique highlights a crucial sustainability challenge for AI in the GCC and globally, as the region invests heavily in compute-intensive AI models.

Large language models (LLMs) such as ChatGPT and Gemini led the first wave of the artificial intellig.. - 매일경제

The National · Mar 23

The article discusses the rise of large language models like ChatGPT and Gemini. It highlights their role in driving the first wave of AI development. Why it matters: While lacking specifics, the article suggests ongoing interest in the impact and future of LLMs, a key area of AI research and development.