Researchers at National Taiwan University are developing low-complexity neural network technologies using quantization to reduce model size while maintaining accuracy. Their work includes binary-weighted CNNs and transformers, along with a neural architecture search scheme (TPC-NAS) applied to image recognition, object detection, and NLP tasks. They have also built a PE-based CNN/transformer hardware accelerator in Xilinx FPGA SoC with a PyTorch-based software framework. Why it matters: This research provides practical methods for deploying efficient deep learning models on resource-constrained hardware, potentially enabling broader adoption of AI in embedded systems and edge devices.
The paper introduces Sparse-Quantized Representation (SpQR), a new compression format and quantization technique for large language models (LLMs). SpQR identifies outlier weights and stores them in higher precision while compressing the remaining weights to 3-4 bits. The method achieves less than 1% accuracy loss in perplexity for LLaMA and Falcon LLMs and enables a 33B parameter LLM to run on a single 24GB consumer GPU. Why it matters: This enables near-lossless compression of LLMs, making powerful models accessible on resource-constrained devices and accelerating inference without significant accuracy degradation.
MBZUAI Assistant Professors Bin Gu and Huan Xiong are advancing spiking neural networks (SNNs) to improve computational power and energy efficiency. They will present their latest research on SNNs at the 38th Annual AAAI Conference on Artificial Intelligence in Vancouver. SNNs process information in discrete events, mimicking biological neurons and offering improved energy efficiency compared to traditional neural networks. Why it matters: This research could enable running advanced AI applications like GPTs on mobile devices, unlocking their full potential due to the energy efficiency of SNNs.
MBZUAI researchers are developing spiking neural networks (SNNs) to emulate the energy efficiency of the human brain. Traditional deep learning models like those powering ChatGPT consume significant energy, with a single query using 3.96 watts. SNNs aim to mimic biological neurons more closely to reduce energy consumption, as the human brain uses only a fraction of the energy compared to these models. Why it matters: This research could lead to more sustainable and energy-efficient AI technologies, addressing a major challenge in deploying large-scale AI systems.
MBZUAI Ph.D. graduate Hilal Mohammad Hilal AlQuabeh researched methods to improve the efficiency of machine learning algorithms, specifically focusing on pairwise learning and multi-instance learning. Pairwise learning teaches AI to make decisions by comparing options in pairs, useful for ranking and anomaly detection. Multi-instance learning involves learning from sets of data points, applicable in areas like drug discovery. Why it matters: Optimizing AI for low-resource environments expands its accessibility and applicability in critical sectors like healthcare and remote area operations.
Xiaolin Huang from Shanghai Jiao Tong University presented a talk at MBZUAI on training deep neural networks in tiny subspaces. The talk covered the low-dimension hypothesis in neural networks and methods to find subspaces for efficient training. It suggests that training in smaller subspaces can improve training efficiency, generalization, and robustness. Why it matters: Investigating efficient training methods is crucial for resource-constrained environments and can enable broader access to advanced AI.
Soufiane Hayou of the National University of Singapore presented a talk at MBZUAI on principled scaling of neural networks. The talk covered leveraging mathematical results to efficiently scale neural networks. He obtained his PhD in statistics in 2021 from Oxford. Why it matters: Understanding neural network scaling is crucial for developing more efficient and powerful AI models in the region.
A recent talk at MBZUAI discussed "Green Learning" and Operational Neural Networks (ONNs) as efficient alternatives to CNNs. ONNs use "nodal" and "pool" operators and "generative neurons" to expand neuron learning capacity. Moncef Gabbouj from Tampere University presented Self-Organized ONNs (Self-ONNs) and their signal processing applications. Why it matters: Exploring more efficient AI models is crucial for sustainable development of AI in the region, as it addresses computational resource constraints and promotes broader accessibility.