GCC AI Research

Results for "data aggregation"

Scalable Community Detection in Massive Networks Using Aggregated Relational Data

MBZUAI ·

A new mini-batch strategy using aggregated relational data is proposed to fit the mixed membership stochastic blockmodel (MMSB) to large networks. The method uses nodal information and stochastic gradients of bipartite graphs for scalable inference. The approach was applied to a citation network with over two million nodes and 25 million edges, capturing explainable structure. Why it matters: This research enables more efficient community detection in massive networks, which is crucial for analyzing complex relationships in various domains, although the article itself has no clear connection to the Middle East.

Managing and Analyzing Big Traffic Data — An Uncertain Time Series Approach

MBZUAI ·

This article discusses applying an uncertain time series (UTS) approach to manage and analyze big traffic data for high-resolution vehicular transportation services. The study addresses challenges such as data sparseness, decision-making among multiple UTSs, and forecasting with spatio-temporal correlations. Jilin Hui, previously a Research Associate at the Inception Institute of Artificial Intelligence (UAE), is applying this approach to problems of increased congestion, greenhouse gas emissions, and reduced air quality in urban environments. Why it matters: The application of AI techniques to traffic management could significantly improve urban mobility and environmental sustainability in the GCC region and beyond.

Student Focus: Gaurav Agarwal

KAUST ·

Gaurav Agarwal, a statistics Ph.D. student in the Environmental Statistics Group at KAUST, is researching statistical methods with environmental applications, such as understanding salt tolerance in plants. He is developing a user-friendly web application to make these methods accessible to those with limited statistical backgrounds. Agarwal also focuses on data visualization and outlier detection techniques for quality control of radiosonde wind data. Why it matters: This research contributes to environmental science by providing accessible statistical tools and methods for analyzing complex environmental data, potentially aiding in addressing challenges like plant resilience and climate monitoring.

Fact checking with ChatGPT

MBZUAI ·

A new paper from MBZUAI researchers explores using ChatGPT to combat the spread of fake news. The researchers, including Preslav Nakov and Liangming Pan, demonstrate that ChatGPT can be used to fact-check published information. Their paper, "Fact-Checking Complex Claims with Program-Guided Reasoning," was accepted at ACL 2023. Why it matters: This research highlights the potential of large language models to address the growing challenge of misinformation, with implications for maintaining information integrity in the digital age.

The complexities of identifying causality in the real world: A new study presented at ICML

MBZUAI ·

MBZUAI researchers presented a study at ICML 2024 examining how data aggregation distorts causal discovery. The study argues that current methods are misled because real-world interactions happen at a micro level while observations are aggregated. Using the example of ice cream sales and temperature, they highlight how aggregation introduces "instantaneous causality" where time-lags exist. Why it matters: The research identifies a fundamental limitation in current causal discovery methods, potentially impacting disciplines relying on accurate causal inference from observational data.
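
The aggregation effect described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method from the ICML study: the one-step lag, the noise levels, the window length of 24, and the names `simulate` and `pearson` are all hypothetical choices made for illustration. At the micro level, "temperature" at one step drives "sales" at the next step. After summing both series over fixed windows, the dependence shows up mostly within the same window, so the time lag appears to vanish.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(n_steps=24000, window=24, seed=0):
    """Compare same-time and lagged correlations before and after aggregation.

    All parameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)

    # Micro level: temperature at step t - 1 drives sales at step t (one-step lag).
    temp = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    sales = [rng.gauss(0.0, 0.5)]  # first step has no predecessor, noise only
    sales += [temp[t - 1] + rng.gauss(0.0, 0.5) for t in range(1, n_steps)]

    # Observation level: sum each series over non-overlapping windows.
    agg_temp = [sum(temp[i:i + window]) for i in range(0, n_steps, window)]
    agg_sales = [sum(sales[i:i + window]) for i in range(0, n_steps, window)]

    return {
        "micro_same_step": pearson(temp, sales),
        "micro_lagged": pearson(temp[:-1], sales[1:]),
        "agg_same_window": pearson(agg_temp, agg_sales),
        "agg_lagged": pearson(agg_temp[:-1], agg_sales[1:]),
    }


results = simulate()
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the micro-level series are nearly uncorrelated at the same step and strongly correlated at lag one, while the aggregated series show the opposite pattern. A method that sees only the aggregated data would therefore tend to read the relationship as instantaneous.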

Building Planetary-Scale Collaborative Intelligence

MBZUAI ·

Sai Praneeth Karimireddy from UC Berkeley presented a talk on building planetary-scale collaborative intelligence, highlighting the challenges of using distributed data in machine learning due to data silos and ethical-legal restrictions. He proposed collaborative systems like federated learning as a solution to bring together distributed data while respecting privacy. The talk addressed the need for efficiency, reliability, and management of divergent goals in these systems, suggesting the use of tools from optimization, statistics, and economics. Why it matters: Collaborative AI systems can unlock valuable distributed data in the region, especially in sensitive sectors like healthcare, while ensuring privacy and addressing ethical concerns.

Bring Your Own Kernel! Constructing High-Performance Data Management Systems from Components

MBZUAI ·

Holger Pirk from Imperial College London is developing a novel approach to data management system composition called BOSS. The system uses a homoiconic representation of data and code and partial evaluation of queries by components, drawing inspiration from compiler-construction research. BOSS achieves a fully composable design that effectively combines different data models, hardware platforms, and processing engines, enabling features like GPU acceleration and generative data cleaning with minimal overhead. Why it matters: This research on composable database systems can broaden the applicability of data management techniques in the GCC region, enabling more flexible and efficient data processing for various applications.

Short-Term Traffic Forecasting Using High-Resolution Traffic Data

arXiv ·

Researchers developed a data-driven toolkit for short-term traffic forecasting using high-resolution traffic data from urban road sensors. The method models forecasting as a matrix completion problem, mapping inputs to a higher-dimensional space using kernels and adaptive boosting. Validated using real-world data from Abu Dhabi, UAE, the method outperforms state-of-the-art algorithms.