Results for "neural architecture search"

MedNNS: Supernet-based Medical Task-Adaptive Neural Network Search

arXiv · Apr 22

The paper introduces MedNNS, a neural network search framework for medical imaging that addresses both architecture selection and weight initialization. MedNNS uses a supernetwork-based approach to construct a meta-space that encodes datasets and models according to their performance, expanding the model zoo by 51x. The framework adds a rank loss and a Fréchet Inception Distance (FID) loss to capture inter-model and inter-dataset relationships, which improves alignment in the meta-space. It outperforms ImageNet-pretrained deep learning models and state-of-the-art NAS methods.

Uncertainty Modeling of Emerging Device-based Computing-in-Memory Neural Accelerators with Application to Neural Architecture Search

arXiv · Jul 6

This paper analyzes the impact of device uncertainties on deep neural networks (DNNs) in emerging device-based Computing-in-memory (CiM) systems. The authors propose UAE, an uncertainty-aware Neural Architecture Search scheme, to identify DNN models robust to these uncertainties. The goal is to mitigate accuracy drops when deploying trained models on real-world platforms.

Low-Complexity NN Technology: Model and Precision Search, Acceleration Circuit, and Applications

MBZUAI

Researchers at National Taiwan University are developing low-complexity neural network technologies that use quantization to reduce model size while maintaining accuracy. Their work includes binary-weighted CNNs and transformers, along with a neural architecture search scheme (TPC-NAS) applied to image recognition, object detection, and NLP tasks. They have also built a PE-based CNN/transformer hardware accelerator on a Xilinx FPGA SoC, with a PyTorch-based software framework. Why it matters: This research provides practical methods for deploying efficient deep learning models on resource-constrained hardware, potentially enabling broader adoption of AI in embedded systems and edge devices.

Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

arXiv · Jun 30

This paper introduces Diffusion-BBO, a new online black-box optimization (BBO) framework that uses a conditional diffusion model as an inverse surrogate model. The framework employs an Uncertainty-aware Exploration (UaE) acquisition function to propose scores in the objective space for conditional sampling. The approach is shown theoretically to achieve a near-optimal solution and empirically outperforms existing online BBO baselines across 6 scientific discovery tasks.

Beyond Attention: Orchid’s Adaptive Convolutions for Next-Level Sequence Modeling

MBZUAI

Orchid is a new neural network architecture that uses adaptive convolutions to achieve quasilinear computational complexity, O(N log N), for sequence modeling. Orchid adapts its convolution kernel dynamically based on the input sequence. Evaluations across language modeling and image classification show that Orchid outperforms attention-based architectures like BERT and Vision Transformers, often with smaller model sizes. Why it matters: Orchid extends the feasible sequence length beyond the practical limits of dense attention layers, representing progress toward more efficient and scalable deep learning models.
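The O(N log N) figure is what one would expect from evaluating a long convolution through the FFT rather than directly. The sketch below illustrates only that general idea, not Orchid's actual layer: `toy_adaptive_kernel` is a hypothetical stand-in for an input-dependent kernel, and all function names are assumptions made for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_conv_fft(signal, kernel):
    """Circular convolution in O(N log N) via the convolution theorem."""
    n = len(signal)
    spectrum = [a * b for a, b in zip(fft(signal), fft(kernel))]
    # The inverse transform above is unnormalized, so divide by n here.
    return [v.real / n for v in fft(spectrum, inverse=True)]


def circular_conv_direct(signal, kernel):
    """Reference O(N^2) circular convolution, used only to check the result."""
    n = len(signal)
    return [
        sum(signal[j] * kernel[(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def toy_adaptive_kernel(signal):
    """Hypothetical input-dependent kernel (an assumption, not Orchid's design):
    a geometrically decaying filter scaled by the mean absolute input value."""
    n = len(signal)
    scale = sum(abs(v) for v in signal) / n
    return [scale * 0.5 ** i for i in range(n)]


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5]
kernel = toy_adaptive_kernel(signal)  # the kernel depends on the input
fast = circular_conv_fft(signal, kernel)
slow = circular_conv_direct(signal, kernel)
```

Both paths compute the same circular convolution; the FFT path replaces the quadratic double loop with three transforms and one elementwise product.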

Reliability Exploration of Neural Network Accelerator

MBZUAI

This article discusses the reliability of Deep Neural Networks (DNNs) and their hardware platforms, especially regarding soft errors caused by cosmic rays. It highlights that while DNNs are robust against bit flips, errors can still lead to miscalculations in AI accelerators. The talk, led by Prof. Masanori Hashimoto from Kyoto University, will cover identifying vulnerabilities in neural networks and reliability exploration of AI accelerators for edge computing. Why it matters: As DNNs are deployed in safety-critical applications in the region, ensuring the reliability of AI hardware is crucial for safe and trustworthy operation.

Deep Ensembles Work, But Are They Necessary?

MBZUAI

A recent study questions whether deep ensembles are necessary, given that their accuracy gains can be matched by larger single models. The study demonstrates that ensemble diversity does not meaningfully improve uncertainty quantification on out-of-distribution data. It also reveals that the out-of-distribution performance of ensembles is strongly determined by their in-distribution performance. Why it matters: The findings suggest that larger, single neural networks can replicate the benefits of deep ensembles, potentially simplifying model deployment and reducing computational costs in the region.

How MedNNS picks the right AI model for each type of hospital scan

MBZUAI

MBZUAI researchers are introducing MedNNS, a system to be presented at MICCAI 2025, that recommends the best AI architecture and initialization for medical imaging tasks. MedNNS addresses the challenge of inefficient trial-and-error in building medical imaging AI by reframing model selection as a retrieval problem. The system employs a Once-For-All ResNet-like model and a learned meta-space of 720k model-dataset pairs, using dataset embeddings to predict optimal model performance. Why it matters: By automating model selection, MedNNS promises to significantly reduce the time and resources required to develop effective AI solutions for healthcare, particularly in medical imaging.
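Treating model selection as retrieval can be pictured as a ranking over a shared embedding space. The sketch below is a toy illustration under stated assumptions, not the MedNNS implementation: the three-dimensional embeddings, the subnet names, and the use of a dot product as the predicted-performance score are all hypothetical.

```python
def predicted_score(dataset_emb, model_emb):
    """Assumed scoring rule: the dot product of the two embeddings stands in
    for predicted performance of the model on the dataset."""
    return sum(d * m for d, m in zip(dataset_emb, model_emb))


def retrieve_models(dataset_emb, model_zoo, top_k=2):
    """Return the names of the top_k models ranked by predicted score."""
    ranked = sorted(
        model_zoo.items(),
        key=lambda item: predicted_score(dataset_emb, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical model embeddings; a real meta-space would be learned.
model_zoo = {
    "subnet_small": [0.9, 0.1, 0.0],
    "subnet_medium": [0.4, 0.8, 0.1],
    "subnet_large": [0.1, 0.3, 0.9],
}

# Hypothetical embedding of a new, unseen dataset.
new_dataset = [0.2, 0.9, 0.1]

best = retrieve_models(new_dataset, model_zoo)
```

The point of the retrieval framing is that a new dataset only needs to be embedded once; candidate models are then ranked by lookup rather than by training each one.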