The paper introduces the Unscented Autoencoder (UAE), a deep generative model based on the Variational Autoencoder (VAE) framework. The UAE uses the Unscented Transform (UT) to obtain a more informative posterior representation than the reparameterization trick used in standard VAEs. It also replaces the Kullback-Leibler (KL) divergence with the Wasserstein distribution metric and achieves competitive Fréchet Inception Distance (FID) scores.
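To make the contrast with random sampling concrete, here is a minimal sketch of the textbook unscented transform for a Gaussian posterior. It is an illustration under assumptions, not the paper's implementation: the function name `sigma_points`, the scaling parameter `kappa`, and the weight scheme are the standard formulation and may differ from what the UAE actually uses.

```python
import numpy as np

def sigma_points(mean, cov, kappa=0.0):
    """Return the 2n+1 sigma points and weights of the standard unscented transform.

    Hypothetical helper for illustration; the UAE paper may use another variant.
    """
    n = mean.shape[0]
    # Cholesky factor L with L @ L.T == (n + kappa) * cov.
    L = np.linalg.cholesky((n + kappa) * cov)
    # Rows of L.T are the columns of L; each is added to / subtracted from the mean.
    points = np.vstack([mean, mean + L.T, mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return points, weights

mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
points, weights = sigma_points(mean, cov, kappa=1.0)

# The weighted sigma points reproduce the mean and covariance exactly,
# whereas a single reparameterized sample mean + L @ eps only does so in expectation.
recovered_mean = weights @ points
centered = points - recovered_mean
recovered_cov = (weights[:, None] * centered).T @ centered
```

Unlike one random draw, the deterministic sigma points carry the first two moments of the posterior through the decoder, which is the sense in which the representation is more informative.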
This paper introduces a mutually-regularized dual collaborative variational auto-encoder (MD-CVAE) for recommendation systems, addressing the limitations of user-oriented auto-encoders (UAEs) in handling sparse ratings and new items. MD-CVAE integrates item content and user ratings within a variational framework, regularizing UAE weights with item content to avoid non-optimal convergence. A symmetric inference strategy eliminates the need for retraining when introducing new items, enhancing efficiency in dynamic recommendation scenarios. Why it matters: The MD-CVAE approach offers a practical solution for improving recommendation accuracy and efficiency, especially in scenarios with data sparsity and frequent item updates, relevant to e-commerce and content platforms in the Middle East.
This article studies the approximation of a high-dimensional distribution by a Gaussian one through Gaussian variational inference, that is, by minimizing the Kullback-Leibler divergence. Building on previous research, it shows that the minimizer is well approximated by a Gaussian distribution with a specific mean and variance. The study describes the approximation accuracy and the range of applicability in terms of the efficient dimension, results that are relevant for analyzing sampling schemes in optimization. Why it matters: This theoretical research can inform the development of more efficient and accurate AI algorithms, particularly in areas dealing with high-dimensional data such as machine learning and data analysis.
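The objective behind Gaussian variational inference can be written out explicitly. The notation below is an assumption introduced for illustration (the summary gives none): the target density is p(x) proportional to exp(-f(x)) on R^d, and the approximating family is all Gaussians N(mu, Sigma).

```latex
% Assumed notation, not taken from the source: p(x) \propto e^{-f(x)} on \mathbb{R}^d.
\[
(\mu^\ast, \Sigma^\ast)
  = \operatorname*{arg\,min}_{\mu,\; \Sigma \succ 0}
    \mathrm{KL}\bigl(\mathcal{N}(\mu,\Sigma) \,\|\, p\bigr),
\]
\[
\mathrm{KL}\bigl(\mathcal{N}(\mu,\Sigma) \,\|\, p\bigr)
  = \mathbb{E}_{X \sim \mathcal{N}(\mu,\Sigma)}\bigl[f(X)\bigr]
  - \tfrac{1}{2}\log\det\bigl(2\pi e\,\Sigma\bigr)
  + \log Z,
\qquad
Z = \int_{\mathbb{R}^d} e^{-f(x)}\,dx .
\]
```

The middle term is the entropy of the Gaussian, and log Z does not depend on (mu, Sigma), so the minimization trades off the expected potential against the entropy.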
This paper introduces a self-supervised learning method for point cloud analysis using an upsampling autoencoder (UAE). The model uses subsampling and an encoder-decoder architecture to reconstruct the original point cloud, learning both semantic and geometric information. Experiments show the UAE outperforms existing methods in shape classification, part segmentation, and point cloud upsampling tasks.
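The subsample-then-reconstruct pretext task can be sketched as follows. This is a hedged illustration only: the helper names, uniform random subsampling, and the Chamfer distance as the reconstruction measure are assumptions, since the summary does not specify the sampling scheme or the loss.

```python
import numpy as np

def random_subsample(points, ratio, rng):
    """Keep a random fraction of the points (assumed sampling scheme)."""
    k = max(1, int(points.shape[0] * ratio))
    idx = rng.choice(points.shape[0], size=k, replace=False)
    return points[idx]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbor distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(0)
cloud = rng.normal(size=(256, 3))            # stand-in for a dense point cloud
sparse = random_subsample(cloud, 0.25, rng)  # input the encoder would see
# An upsampling decoder would be trained so that its output, compared with
# `cloud`, drives a reconstruction measure such as this one toward zero.
gap = chamfer_distance(sparse, cloud)
```

Here `gap` measures how far the sparse input is from the dense target before any learning, which is the discrepancy an encoder-decoder would have to close.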
The paper introduces the Prism Hypothesis, which posits a correspondence between an encoder's feature spectrum and its functional role, with semantic encoders capturing low-frequency components and pixel encoders retaining high-frequency information. Based on this, the authors propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details using a frequency-band modulator. Experiments on ImageNet and MS-COCO demonstrate that UAE effectively unifies semantic abstraction and pixel-level fidelity, achieving state-of-the-art performance.
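The low-frequency versus high-frequency distinction can be illustrated with a simple Fourier-domain split of an image. This sketch is not the paper's frequency-band modulator; the function name, the radial mask, and the `cutoff` parameter are assumptions chosen only to show what the two bands contain.

```python
import numpy as np

def split_frequency_bands(image, cutoff):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    `cutoff` is a radius in cycles per pixel (hypothetical parameter).
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    # Radial mask: True for frequencies at or below the cutoff.
    low_mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    spectrum = np.fft.fft2(image)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high

rng = np.random.default_rng(0)
image = rng.random((32, 32))
low, high = split_frequency_bands(image, cutoff=0.1)
# The two bands are complementary: low + high reconstructs the image.
reconstruction = low + high
```

In the terms of the hypothesis, `low` holds the coarse structure that semantic encoders are said to capture, while `high` holds the fine detail that pixel encoders retain.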
This paper introduces Diffusion-BBO, a new online black-box optimization (BBO) framework that uses a conditional diffusion model as an inverse surrogate model. The framework employs an Uncertainty-aware Exploration (UaE) acquisition function to propose scores in the objective space for conditional sampling. The approach is shown theoretically to achieve a near-optimal solution and empirically outperforms existing online BBO baselines across 6 scientific discovery tasks.
The paper introduces UAE-3D, a multi-modal VAE for 3D molecule generation that compresses molecules into a unified latent space, maintaining near-zero reconstruction error. This approach simplifies latent diffusion modeling by eliminating the need to handle multi-modality and equivariance separately. Experiments on GEOM-Drugs and QM9 datasets show UAE-3D establishes new benchmarks in de novo and conditional 3D molecule generation, with significant improvements in efficiency and quality.