Holger Pirk from Imperial College London is developing a novel approach to data management system composition called BOSS. The system represents data and code homoiconically (in the same form) and has each component partially evaluate queries, drawing inspiration from compiler-construction research. BOSS achieves a fully composable design that effectively combines different data models, hardware platforms, and processing engines, enabling features like GPU acceleration and generative data cleaning with minimal overhead. Why it matters: This research on composable database systems can broaden the applicability of data management techniques in the Gulf Cooperation Council (GCC) region, enabling more flexible and efficient data processing for various applications.
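To make the idea of homoiconic queries and partial evaluation concrete, here is a minimal sketch. It is not BOSS's API: the tuple encoding, the `partial_eval` function, and the operator names are all hypothetical, chosen only to show how one component can reduce the parts of a query it understands and hand the rest on unchanged.

```python
def partial_eval(expr, handlers):
    """Reduce the sub-expressions this component understands; return the rest unchanged."""
    if not isinstance(expr, tuple):
        return expr  # atom: a literal value or a symbol name
    head, *args = expr
    args = [partial_eval(arg, handlers) for arg in args]
    handler = handlers.get(head)
    # Reduce only if the operator is known here and every argument is a plain value
    # (nested tuples are unevaluated expressions, strings are unresolved symbols).
    if handler is not None and not any(isinstance(a, (tuple, str)) for a in args):
        return handler(*args)
    return (head, *args)


# Two hypothetical components, each knowing a different set of operators.
arith = {"Plus": lambda *xs: sum(xs), "Times": lambda a, b: a * b}
compare = {"Greater": lambda a, b: a > b}

# The query is ordinary data (nested tuples) and can also be evaluated as code.
query = ("Greater", ("Times", ("Plus", 1, 2), 4), 10)
after_arith = partial_eval(query, arith)       # ("Greater", 12, 10)
result = partial_eval(after_arith, compare)    # True
```

Because the partially evaluated query is still a query in the same representation, components can be chained in any order without agreeing on anything beyond that shared form.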
This paper introduces a unified deep autoregressive model (UAE) for cardinality estimation that learns joint data distributions from both data and query workloads. It uses differentiable progressive sampling with the Gumbel-Softmax trick to incorporate supervised query information into the deep autoregressive model. Experiments show UAE achieves better accuracy and efficiency compared to state-of-the-art methods.
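For readers unfamiliar with the Gumbel-Softmax trick, the sketch below shows the sampling step only, in plain Python. It is an illustration under simplifying assumptions, not the paper's implementation: the function name and the temperature default are hypothetical, and a real model would compute this inside an automatic-differentiation framework so that gradients can flow through the sample.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw a relaxed (soft) one-hot sample from the categorical distribution given by logits."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) for U ~ Uniform(0, 1); clamp U to avoid log(0).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Lower temperatures push the sample toward a hard one-hot vector.
sample = gumbel_softmax([math.log(0.2), math.log(0.3), math.log(0.5)], temperature=0.5)
```

The largest entry of the sample falls on each category with the same probability as the underlying categorical distribution, while the soft output stays a smooth function of the logits, which is what makes the sampling step usable in gradient-based training.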
The paper introduces Duet, a hybrid neural relation understanding method for cardinality estimation. Duet addresses limitations of existing learned methods, such as high costs and scalability issues, by incorporating predicate information into an autoregressive model. Experiments demonstrate Duet's efficiency, accuracy, and scalability, with Duet on CPU even outperforming GPU-based methods.
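The autoregressive idea behind estimators of this kind is the chain rule: the probability of a full row is the product of each column's probability given the columns before it. The toy sketch below uses exact prefix counts in place of a neural network, so it is not Duet or any learned model; the function names and the sample rows are hypothetical.

```python
from collections import Counter


def fit_conditionals(rows):
    """Count every column prefix so that P(x_i | x_<i) = count(prefix + x_i) / count(prefix)."""
    counts = Counter()
    for row in rows:
        for i in range(len(row) + 1):
            counts[tuple(row[:i])] += 1
    return counts


def estimate_cardinality(counts, values):
    """Estimate how many rows match equality predicates on all columns, via the chain rule."""
    total = counts[()]  # the empty prefix counts every row
    prob = 1.0
    for i in range(len(values)):
        prefix = tuple(values[:i])
        if counts[prefix] == 0:
            return 0.0  # no row has this prefix, so nothing can match
        prob *= counts[tuple(values[:i + 1])] / counts[prefix]
    return total * prob


# Toy table with two columns (hypothetical data).
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 1)]
counts = fit_conditionals(rows)
estimate = estimate_cardinality(counts, ("a", 1))  # 4 * (3/4) * (2/3), about 2 rows
```

A learned estimator replaces the count table with a model that predicts each conditional distribution, which is what lets it generalize to large tables where storing every prefix is infeasible.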
MBZUAI faculty Eric Xing and Qirong Ho are developing AI operating systems (AI OS) for efficient AI development, similar to a mobile OS. They co-founded the AI startup Petuum and lead the CASL community, which focuses on composable, automatic, and scalable learning. CASL provides a unified toolkit for distributed training and compositional model construction, with contributions from MBZUAI, CMU, Berkeley, and Stanford. Why it matters: The development of AI OS aims to optimize AI applications by efficiently connecting software and hardware, fostering innovation and broader adoption of AI solutions across industries in the region.
A Duke University professor presented a data-centric approach to optimizing AI systems by addressing the memory capacity and bandwidth bottleneck. The presentation covered collaborative optimization across algorithms, systems, architecture, and circuit layers. It also explored compute-in-memory as a solution for integrating computation and memory. Why it matters: Optimizing AI systems through a data-centric approach can improve efficiency and performance, critical for advancing AI applications in the region.
A presentation discusses using programmable network devices to reduce communication bottlenecks in distributed deep learning. It explores in-network aggregation and data processing to lower memory needs and increase bandwidth usage. The talk also covers gradient compression and the potential role of programmable NICs. Why it matters: Optimizing distributed deep learning infrastructure is critical for scaling AI model training in resource-constrained environments.
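The two ideas in this talk can be sketched together in a few lines. This is an illustration only: top-k sparsification is assumed here as one common form of gradient compression (the summary does not say which scheme the talk covers), and the aggregation function merely simulates in software what a programmable switch or NIC would do on the wire. All names are hypothetical.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries as {index: value}; drop the rest."""
    largest = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)[:k]
    return {i: gradient[i] for i in largest}


def aggregate(sparse_gradients, size):
    """Sum sparse worker gradients element-wise, as an aggregating device would."""
    total = [0.0] * size
    for sparse in sparse_gradients:
        for index, value in sparse.items():
            total[index] += value
    return total


# Two workers each send only their 2 largest-magnitude entries (hypothetical values).
worker_gradients = [[0.1, -2.0, 0.3, 4.0], [1.5, 0.2, -3.0, 0.1]]
compressed = [top_k_sparsify(g, k=2) for g in worker_gradients]
summed = aggregate(compressed, size=4)  # [1.5, -2.0, -3.0, 4.0]
```

Each worker transmits half of its entries, and the workers downstream receive one already-summed vector instead of one vector per peer, which is where the bandwidth saving comes from.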
EURECOM researchers developed data-driven verification methods that use structured datasets to assess statistical and property claims. For statistical claims, the approach translates the text of a claim into SQL queries over relational databases. For property claims, it uses knowledge graphs to verify claims and generate explanations. Why it matters: The methods aim to support fact-checkers by efficiently labeling claims with interpretable explanations, potentially combating misinformation in the region and beyond.
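The SQL route for statistical claims can be illustrated with a small sketch using Python's built-in `sqlite3` module. Everything specific here is an assumption made for illustration: the `emissions` table, its rows, the hand-written query standing in for the automatic text-to-SQL step, and the 1% tolerance are all hypothetical and not taken from the EURECOM work.

```python
import sqlite3


def verify_claim(conn, sql, claimed_value, tolerance=0.01):
    """Run the SQL a claim was translated to and compare the result with the claimed figure."""
    (actual,) = conn.execute(sql).fetchone()
    supported = abs(actual - claimed_value) <= tolerance * abs(claimed_value)
    return supported, actual


# Toy relational data (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emissions (country TEXT, year INTEGER, co2_mt REAL)")
conn.executemany(
    "INSERT INTO emissions VALUES (?, ?, ?)",
    [("A", 2019, 100.0), ("A", 2020, 90.0), ("B", 2019, 50.0), ("B", 2020, 55.0)],
)

# Claim: "Total emissions in 2020 were 145 Mt." The query is written by hand here;
# the described system would derive it from the claim text.
sql = "SELECT SUM(co2_mt) FROM emissions WHERE year = 2020"
supported, actual = verify_claim(conn, sql, claimed_value=145.0)
```

Returning the computed value alongside the verdict gives a fact-checker something to inspect: the query and its result together serve as the explanation for the label.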
Qirong Ho, co-founder and CTO of Petuum Inc., will be contributing to the "ML Systems for Many" initiative. Petuum is recognized for creating standardized building blocks for AI assembly. Ho holds a Ph.D. from Carnegie Mellon University and is part of the CASL open-source consortium. Why it matters: Ho's involvement showcases ongoing efforts to democratize AI development and deployment, making it more accessible and sustainable, although the initiative itself is not described in further detail.