GCC AI Research

Results for "hyperparameter optimization"

Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

arXiv ·

This paper introduces Diffusion-BBO, a new online black-box optimization (BBO) framework that uses a conditional diffusion model as an inverse surrogate model. The framework employs an Uncertainty-aware Exploration (UaE) acquisition function to propose scores in the objective space for conditional sampling. The approach is shown theoretically to achieve a near-optimal solution, and it empirically outperforms existing online BBO baselines across six scientific discovery tasks.

MedNNS: Supernet-based Medical Task-Adaptive Neural Network Search

arXiv ·

The paper introduces MedNNS, a neural network search framework designed for medical imaging that addresses challenges in architecture selection and weight initialization. MedNNS uses a Supernetwork-based approach to construct a meta-space that encodes datasets and models according to their performance, expanding the model zoo size by 51x. The framework incorporates a rank loss and a Fréchet Inception Distance (FID) loss to capture inter-model and inter-dataset relationships, which improves alignment in the meta-space and outperforms both ImageNet-pretrained deep learning models and state-of-the-art NAS methods.

Bayesian Optimization-based Tire Parameter and Uncertainty Estimation for Real-World Data

arXiv ·

This paper introduces a Bayesian optimization method for estimating tire parameters and their uncertainty, addressing a gap in existing literature. The methodology uses Stochastic Variational Inference to estimate parameters and uncertainties, and it is validated against a Nelder-Mead algorithm. The approach is applied to real-world data from the Abu Dhabi Autonomous Racing League, revealing uncertainties in identifying curvature and shape parameters due to insufficient excitation. Why it matters: The research provides a practical tool for assessing tire model parameters in real-world conditions, with implications for autonomous racing and vehicle dynamics modeling in the GCC region.

A new strategy for complex optimization problems in machine learning presented at ICLR

MBZUAI ·

MBZUAI researchers presented a new strategy for handling complex optimization problems in machine learning at ICLR 2024. The study, a collaboration with ISAM, combines zeroth-order methods with hard-thresholding to address specific settings in machine learning. This approach aims to improve convergence, ensuring algorithms reach quality solutions efficiently. Why it matters: Improving optimization techniques is crucial for advancing machine learning models used in various applications, potentially accelerating development and enhancing performance.
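As a rough illustration of what combining zeroth-order methods with hard-thresholding can look like, the sketch below estimates a gradient from function values only and then keeps the k largest-magnitude coordinates after each step. It is a generic textbook-style sketch, not the algorithm from the ICLR paper: the function names, the Gaussian-direction estimator, the toy objective, and every parameter value (`mu`, `num_dirs`, `step`, `iters`) are assumptions chosen for illustration.

```python
import random


def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and set the rest to zero."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def zo_gradient(f, x, rng, mu=1e-4, num_dirs=20):
    """Estimate the gradient of f at x using function evaluations only.

    Averages forward finite differences along random Gaussian directions
    (an assumed estimator, picked for simplicity).
    """
    d = len(x)
    fx = f(x)
    grad = [0.0] * d
    for _ in range(num_dirs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_pert = [xi + mu * ui for xi, ui in zip(x, u)]
        slope = (f(x_pert) - fx) / mu  # directional derivative estimate
        for i in range(d):
            grad[i] += slope * u[i] / num_dirs
    return grad


def zo_hard_thresholding(f, x0, k, step=0.05, iters=300, seed=0):
    """Gradient-free descent step followed by projection onto k-sparse vectors."""
    rng = random.Random(seed)
    x = hard_threshold(x0, k)
    for _ in range(iters):
        g = zo_gradient(f, x, rng)
        x = hard_threshold([xi - step * gi for xi, gi in zip(x, g)], k)
    return x


# Hypothetical toy objective: squared distance to a 2-sparse target vector.
target = [3.0, 0.0, 0.0, -2.0, 0.0]


def loss(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


x_hat = zo_hard_thresholding(loss, [0.0] * 5, k=2)
print(x_hat)
```

The projection step is what enforces sparsity: however noisy the gradient estimate is, the iterate never has more than `k` nonzero entries.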

YaPO: Learnable Sparse Activation Steering Vectors for Domain Adaptation

arXiv ·

The paper introduces Yet another Policy Optimization (YaPO), a reference-free method for learning sparse steering vectors in the latent space of a Sparse Autoencoder (SAE) to steer LLMs. By optimizing sparse codes, YaPO produces disentangled, interpretable, and efficient steering directions. Experiments show YaPO converges faster, achieves stronger performance, exhibits improved training stability and preserves general knowledge compared to dense steering baselines.

Better Optimization Algorithms for Machine Learning

MBZUAI ·

Francesco Orabona of Boston University, who holds a PhD from the University of Genova, researches online learning, optimization, and statistical learning theory. He previously worked at Yahoo Labs and the Toyota Technological Institute at Chicago. MBZUAI hosted a panel discussion, the topic of which is not specified. Why it matters: Optimization algorithms are crucial for advancing machine learning and AI, and researchers like Orabona contribute to this field.

Accelerating neural network optimization: The power of second-order methods

MBZUAI ·

MBZUAI researchers presented a new second-order method for optimizing neural networks at NeurIPS 2024. The method addresses optimization problems posed as variational inequalities, which are common in machine learning. The researchers showed that, for monotone variational inequalities with inexact second-order derivatives, no first- or second-order method can be theoretically faster, and they supported this result with experiments. Why it matters: This research has the potential to reduce the computational cost of training large and complex neural networks, which could accelerate AI development in the region.
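For readers unfamiliar with the term, the standard textbook definition of a monotone variational inequality is sketched below. The symbols $F$, $\mathcal{Z}$, $z$, and $f$ are notation chosen here for illustration and are not taken from the summarized paper.

```latex
% Variational inequality: given an operator F and a convex set Z,
% find a point z^* in Z such that
\[
  \langle F(z^*),\, z - z^* \rangle \;\ge\; 0
  \qquad \text{for all } z \in \mathcal{Z}.
\]
% The operator F is monotone if
\[
  \langle F(z) - F(z'),\, z - z' \rangle \;\ge\; 0
  \qquad \text{for all } z, z' \in \mathcal{Z}.
\]
% Example: a convex-concave min-max problem  min_x max_y f(x, y)
% fits this form with z = (x, y) and
\[
  F(z) \;=\; \bigl( \nabla_x f(x, y),\; -\nabla_y f(x, y) \bigr).
\]
```

Plain convex minimization is the special case $F = \nabla f$, which is one reason this formulation covers many machine learning problems.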

Performance Prediction via Bayesian Matrix Factorisation for Multilingual Natural Language Processing Tasks

MBZUAI ·

A new Bayesian matrix factorization approach is explored for performance prediction in multilingual NLP, aiming to reduce the experimental burden of evaluating many language combinations. The approach outperforms state-of-the-art methods on NLP benchmarks such as machine translation and cross-lingual entity linking. It also avoids hyperparameter tuning and provides uncertainty estimates for its predictions. Why it matters: Accurate performance prediction methods accelerate multilingual NLP research by reducing computational costs and improving experimental efficiency, which is especially valuable for Arabic NLP tasks.