Software-Directed Hardware Reliability for ML Systems

MBZUAI · Notable

Research Hardware ML Ethics Infrastructure

Summary

Abdulrahman Mahmoud, a postdoctoral fellow at Harvard University, discusses software-directed tools and techniques for processor design and reliability enhancement in ML systems. He emphasizes the need for a nuanced approach to numerical data formats supported by robust hardware. He advocates for integrating reliability as a foundational element in the design process. Why it matters: This research addresses the critical challenge of hardware reliability in AI processors, particularly relevant as the field moves towards hardware-software co-design for sustained growth.

Keywords

hardware · reliability · ML systems · processor design · software-directed

Related

Reliability Exploration of Neural Network Accelerator

MBZUAI

This article previews a talk on the reliability of deep neural networks (DNNs) and their hardware platforms, with a focus on soft errors caused by cosmic rays. Although DNNs are robust against bit flips, such errors can still lead to miscalculations in AI accelerators. The talk, led by Prof. Masanori Hashimoto of Kyoto University, will cover identifying vulnerabilities in neural networks and exploring the reliability of AI accelerators for edge computing. Why it matters: As DNNs are deployed in safety-critical applications in the region, ensuring the reliability of AI hardware is crucial for safe and trustworthy operation.
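To make the bit-flip point concrete, here is a minimal Python sketch of what a single flipped bit does to one stored value. It is an illustration only and is not taken from the talk: the helper name `flip_bit`, the choice of IEEE 754 float32 storage, and the example weight of 0.5 are all assumptions.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as IEEE 754 float32, with one bit inverted.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    `flip_bit` is a hypothetical helper written for this illustration.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the chosen bit and reinterpret the result as float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped

weight = 0.5                 # assumed example weight
low = flip_bit(weight, 0)    # flip in the lowest mantissa bit
high = flip_bit(weight, 30)  # flip in the highest exponent bit

print(f"original:        {weight}")
print(f"bit 0 flipped:   {low}")
print(f"bit 30 flipped:  {high}")
```

In this sketch the low mantissa flip changes the value by less than one part in a million, while the exponent flip turns it into an astronomically large number. That asymmetry is one simple way to see how a network can absorb most flips yet still miscalculate when a flip lands in a sensitive bit.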

Uncertainty Modeling of Emerging Device-based Computing-in-Memory Neural Accelerators with Application to Neural Architecture Search

arXiv · Jul 6

This paper analyzes the impact of device uncertainties on deep neural networks (DNNs) in emerging device-based computing-in-memory (CiM) systems. The authors propose UAE, an uncertainty-aware Neural Architecture Search scheme, to identify DNN models that are robust to these uncertainties. The goal is to mitigate accuracy drops when deploying trained models on real-world platforms.
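As a rough illustration of why device uncertainty matters, the Python sketch below perturbs the weights of a single dot product and measures how far the result drifts. This is not the paper's UAE method or its uncertainty model: the additive Gaussian noise, the function names `noisy_dot` and `mean_abs_error`, and the example values are assumptions chosen only to show the effect.

```python
import random

def noisy_dot(weights, inputs, sigma, rng):
    """Dot product with each weight perturbed by additive Gaussian noise.

    The N(0, sigma^2) noise is an assumed stand-in for device variation.
    """
    return sum((w + rng.gauss(0.0, sigma)) * x for w, x in zip(weights, inputs))

def mean_abs_error(weights, inputs, sigma, trials=1000, seed=0):
    """Monte Carlo estimate of the mean absolute deviation from the ideal output."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    ideal = sum(w * x for w, x in zip(weights, inputs))
    total = sum(abs(noisy_dot(weights, inputs, sigma, rng) - ideal)
                for _ in range(trials))
    return total / trials

weights = [0.2, -0.5, 0.8, 0.1]  # assumed example weights
inputs = [1.0, 2.0, -1.0, 0.5]   # assumed example activations

for sigma in (0.0, 0.01, 0.1):
    print(f"sigma={sigma}: mean |error| = {mean_abs_error(weights, inputs, sigma):.4f}")
```

The output error grows with the noise level, which is the kind of accuracy drop an uncertainty-aware architecture search would try to keep small.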

Hardware Security through the Lens of Dr ML

MBZUAI

NYU Abu Dhabi hosted a talk by Prof. Debdeep Mukhopadhyay on the intersection of machine learning and hardware security. The talk covered using ML/DL for side-channel attacks, leakage assessment in crypto-devices, and threats to hardware security primitives. Prof. Mukhopadhyay is a visiting professor at NYU Abu Dhabi and Institute Chair Professor at IIT Kharagpur. Why it matters: The talk highlights the growing importance of hardware security in modern systems and the role of machine learning in both attacking and defending hardware vulnerabilities.

Optimizing AI Systems through Cross-Layer Design: A Data-Centric Approach

MBZUAI

A Duke University professor presented a data-centric approach to optimizing AI systems by addressing the memory capacity and bandwidth bottleneck. The presentation covered collaborative optimization across algorithms, systems, architecture, and circuit layers. It also explored compute-in-memory as a solution for integrating computation and memory. Why it matters: Optimizing AI systems through a data-centric approach can improve efficiency and performance, critical for advancing AI applications in the region.
