Results for "sparse activation"

Understanding the mixture of the expert layer in Deep Learning

MBZUAI

A Mixture of Experts (MoE) layer is a sparsely activated deep learning layer. It uses a router network to direct each token to one of the experts. Yuanzhi Li, an assistant professor at CMU and affiliated faculty at MBZUAI, researches deep learning theory and NLP. Why it matters: This highlights MBZUAI's engagement with cutting-edge deep learning research, specifically in efficient model design.
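As a rough illustration of the routing described in this summary, the sketch below sends each token to the single expert with the highest router score (top-1 routing). It is not taken from the talk: the function name `moe_top1`, the linear router, the linear experts, and scaling the output by the gate probability are all assumptions made for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_top1(token, router_weights, experts):
    """Route one token to its highest-scoring expert (hypothetical helper).

    router_weights: one row of weights per expert (assumed linear router).
    experts: one weight matrix per expert (assumed linear experts).
    Returns the chosen expert index and the gated expert output.
    """
    gate = softmax(matvec(router_weights, token))
    chosen = max(range(len(gate)), key=gate.__getitem__)
    # Only the chosen expert runs; the others are skipped entirely,
    # which is what makes the layer sparsely activated.
    expert_out = matvec(experts[chosen], token)
    return chosen, [gate[chosen] * y for y in expert_out]


random.seed(0)
dim, num_experts = 4, 3
router_weights = [[random.gauss(0, 1) for _ in range(dim)]
                  for _ in range(num_experts)]
experts = [[[random.gauss(0, 1) for _ in range(dim)] for _ in range(dim)]
           for _ in range(num_experts)]
tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(5)]

# Each token is assigned to exactly one expert.
assignments = [moe_top1(t, router_weights, experts)[0] for t in tokens]
```

Because only one expert's weights are used per token, the compute per token stays roughly constant as more experts are added, under the assumptions above.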

YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

arXiv · Jan 13

The paper introduces Yet another Policy Optimization (YaPO), a reference-free method for learning sparse steering vectors in the latent space of a Sparse Autoencoder (SAE) to steer LLMs. By optimizing sparse codes, YaPO produces disentangled, interpretable, and efficient steering directions. Experiments show YaPO converges faster, achieves stronger performance, exhibits improved training stability and preserves general knowledge compared to dense steering baselines.

Training Deep Neural Networks in Tiny Subspaces

MBZUAI

Xiaolin Huang from Shanghai Jiao Tong University presented a talk at MBZUAI on training deep neural networks in tiny subspaces. The talk covered the low-dimension hypothesis in neural networks and methods to find subspaces for efficient training. It suggests that training in smaller subspaces can improve training efficiency, generalization, and robustness. Why it matters: Investigating efficient training methods is crucial for resource-constrained environments and can enable broader access to advanced AI.

Emulating the energy efficiency of the brain

MBZUAI

MBZUAI researchers are developing spiking neural networks (SNNs) to emulate the energy efficiency of the human brain. Traditional deep learning models like those powering ChatGPT consume significant energy, with a single query reportedly using 3.96 watts. SNNs aim to mimic biological neurons more closely to reduce energy consumption, since the human brain uses only a fraction of the energy these models require. Why it matters: This research could lead to more sustainable and energy-efficient AI technologies, addressing a major challenge in deploying large-scale AI systems.

Can AI Learn Like Us? Unveiling the Secrets of Spiking Neural Networks

MBZUAI

MBZUAI Ph.D. graduate Hilal Mohammad Hilal AlQuabeh researched methods to improve the efficiency of machine learning algorithms, specifically focusing on pairwise learning and multi-instance learning. Pairwise learning teaches AI to make decisions by comparing options in pairs, useful for ranking and anomaly detection. Multi-instance learning involves learning from sets of data points, applicable in areas like drug discovery. Why it matters: Optimizing AI for low-resource environments expands its accessibility and applicability in critical sectors like healthcare and remote area operations.
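To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch of a pairwise hinge loss for ranking. It is a generic textbook-style formulation, not the method from the research summarized above; the function name `pairwise_hinge_loss` and the `margin` parameter are assumptions for illustration.

```python
def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Average hinge loss over all pairs where item i should outrank item j.

    scores: model scores, one per item.
    labels: relevance labels; a higher label means the item should rank higher.
    """
    total, pairs = 0.0, 0
    for score_i, label_i in zip(scores, labels):
        for score_j, label_j in zip(scores, labels):
            if label_i > label_j:
                # Penalize the pair unless item i beats item j by the margin.
                total += max(0.0, margin - (score_i - score_j))
                pairs += 1
    return total / pairs if pairs else 0.0


# The relevant item already leads by more than the margin: no loss.
well_ranked = pairwise_hinge_loss([2.0, 0.0], [1, 0])
# The relevant item scores lower than the irrelevant one: positive loss.
mis_ranked = pairwise_hinge_loss([0.0, 2.0], [1, 0])
```

The loss depends only on score differences within pairs, so it rewards correct ordering rather than any particular absolute score.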