A researcher at the University of Oxford presented new findings on 3D neural reconstruction. The talk introduced a dataset comprising real-world video captures with perfect 3D models. A novel joint optimization method refines camera poses during the reconstruction process. Why it matters: High-quality 3D reconstruction has broad applicability to robotics and computer vision applications in the region.
Egor Zakharov from ETH Zurich AIT lab will present research on creating controllable and detailed 3D head avatars using data from consumer-grade devices. The presentation will cover high-fidelity image-based facial reconstruction/animation and video-based reconstruction of detailed structures like hairstyles. He will showcase integrating human-centric assets into virtual environments for real-time telepresence and entertainment. Why it matters: This research contributes to advancements in digital human modeling and telepresence, with applications in communication and gaming within the region.
The paper introduces the Prism Hypothesis, which posits a correspondence between an encoder's feature spectrum and its functional role, with semantic encoders capturing low-frequency components and pixel encoders retaining high-frequency information. Based on this, the authors propose Unified Autoencoding (UAE), a model that harmonizes semantic structure and pixel details using a frequency-band modulator. Experiments on ImageNet and MS-COCO demonstrate that UAE effectively unifies semantic abstraction and pixel-level fidelity, achieving state-of-the-art performance.
Pascal Fua from EPFL presented an approach to implementing convolutional neural nets that output complex 3D surface meshes. The method overcomes limitations in converting implicit representations to explicit surface representations. Applications include single view reconstruction, physically-driven shape optimization, and bio-medical image segmentation. Why it matters: This research advances geometric deep learning by enabling end-to-end trainable models for 3D surface mesh generation, with potential impact on various applications in computer vision and biomedical imaging in the region.
Ekaterina Radionova from Smarter AI (formerly Samsung AI Center) presented an approach to generating lifelike real-time avatars. The work focuses on generating high-quality video with authentic facial features to support online generation. Radionova's master's degree is from Skoltech on Data Science program and Bachelor degree at Moscow Institute of Physics and Technology on Applied Math. Why it matters: Achieving realistic real-time avatars is critical for applications in online communication, entertainment, and virtual reality within the region.