Skip to content
GCC AI Research

Search

Results for "aggregated relational data"

Scalable Community Detection in Massive Networks Using Aggregated Relational Data

MBZUAI ·

A new mini-batch strategy using aggregated relational data is proposed to fit the mixed membership stochastic blockmodel (MMSB) to large networks. The method uses nodal information and stochastic gradients of bipartite graphs for scalable inference. The approach was applied to a citation network with over two million nodes and 25 million edges, capturing explainable structure. Why it matters: This research enables more efficient community detection in massive networks, which is crucial for analyzing complex relationships in various domains, but this article has no clear connection to the Middle East.

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

arXiv ·

This paper introduces a unified deep autoregressive model (UAE) for cardinality estimation that learns joint data distributions from both data and query workloads. It uses differentiable progressive sampling with the Gumbel-Softmax trick to incorporate supervised query information into the deep autoregressive model. Experiments show UAE achieves better accuracy and efficiency compared to state-of-the-art methods.

Duet: efficient and scalable hybriD neUral rElation undersTanding

arXiv ·

The paper introduces Duet, a hybrid neural relation understanding method for cardinality estimation. Duet addresses limitations of existing learned methods, such as high costs and scalability issues, by incorporating predicate information into an autoregressive model. Experiments demonstrate Duet's efficiency, accuracy, and scalability, even outperforming GPU-based methods on CPU.

CTRL: Closed-Loop Data Transcription via Rate Reduction

MBZUAI ·

A talk introduces a computational framework for learning a compact structured representation for real-world datasets, that is both discriminative and generative. It proposes to learn a closed-loop transcription between the distribution of a high-dimensional multi-class dataset and an arrangement of multiple independent subspaces, known as a linear discriminative representation (LDR). The optimality of the closed-loop transcription can be characterized in closed-form by an information-theoretic measure known as the rate reduction. Why it matters: The framework unifies concepts and benefits of auto-encoding and GAN and generalizes them to the settings of learning a both discriminative and generative representation for multi-class visual data.

Explainable Fact Checking for Statistical and Property Claims

MBZUAI ·

EURECOM researchers developed data-driven verification methods using structured datasets to assess statistical and property claims. The approach translates text claims into SQL queries on relational databases for statistical claims. For property claims, they use knowledge graphs to verify claims and generate explanations. Why it matters: The methods aim to support fact-checkers by efficiently labeling claims with interpretable explanations, potentially combating misinformation in the region and beyond.

Optimizing AI Systems through Cross-Layer Design: A Data-Centric Approach

MBZUAI ·

A Duke University professor presented a data-centric approach to optimizing AI systems by addressing the memory capacity and bandwidth bottleneck. The presentation covered collaborative optimization across algorithms, systems, architecture, and circuit layers. It also explored compute-in-memory as a solution for integrating computation and memory. Why it matters: Optimizing AI systems through a data-centric approach can improve efficiency and performance, critical for advancing AI applications in the region.