GCC AI Research

Results for "data scarcity"

Addressing NLP problems in low resource settings

MBZUAI ·

Thamar Solorio from the University of Houston will discuss machine learning approaches for spontaneous human language processing. The talk will cover adapting multilingual transformers to code-switching data and using data augmentation for domain adaptation in sequence labeling tasks. Solorio will also provide an overview of other research projects at the RiTUAL lab, focusing on the scarcity of labeled data. Why it matters: This presentation addresses key challenges in Arabic NLP related to data scarcity, which is a persistent obstacle in developing effective AI applications for the region.

Many-cell sequencing: machine learning principles and methods for moving beyond single cells to population-scale analysis

MBZUAI ·

A talk discusses the challenges of single-cell data analysis, such as feature sparsity and the effects of rare cells. AI/ML strategies are uniquely positioned to model this data. ImYoo, a startup founded in 2021, is applying single-cell model architectures for unsupervised discovery of patient groupings and predicting sample-level phenotypical data in autoimmune disease. Why it matters: This highlights the growing application of AI/ML in analyzing single-cell data for population-scale human health studies, an area ripe for innovation and improvement in the Middle East's growing biotech sector.

The role of data-driven models in quantifying uncertainty

KAUST ·

KAUST Professor Raul Tempone, an expert in Uncertainty Quantification (UQ), has been appointed as an Alexander von Humboldt Professor at RWTH Aachen University in Germany. This professorship will enable him to further his research on mathematics for uncertainty quantification with new collaborators. Tempone believes the KAUST Strategic Research Initiative for Uncertainty Quantification (SRI-UQ) contributed to this award. Why it matters: This appointment enhances KAUST's visibility and facilitates cross-fertilization between European and KAUST research groups, benefiting both institutions and attracting talent.

Rare and revealing: A new method for uncovering hidden patterns in data

MBZUAI ·

MBZUAI researchers have developed a new kernel-based method to identify dependence patterns in data, especially in small regions exhibiting 'rare dependence' where relationships between variables differ. The method uses sample importance reweighting, assigning more importance to regions with rare dependence. Tested on synthetic and real-world data, the algorithm successfully identified relations between variables even with rare dependence, outperforming traditional methods like HSIC. Why it matters: This advancement can improve data analysis in fields like public health, economics, genomics, and AI, enabling more accurate insights from complex observational data.
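For readers unfamiliar with the HSIC baseline mentioned above, the following is a minimal sketch of a (biased) HSIC estimate with Gaussian kernels, written with the Python standard library only. The optional `weights` argument is an illustrative assumption showing one way per-sample importance weights could enter such a statistic; it is not the MBZUAI method, whose details are not given in this summary. The function names, bandwidth, and toy data are likewise hypothetical.

```python
import math
import random


def gaussian_kernel(xs, bandwidth=1.0):
    """Gram matrix of a Gaussian (RBF) kernel for 1-D samples."""
    return [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in xs]
            for a in xs]


def center(K, w):
    """Center a Gram matrix with respect to sample weights w (summing to 1)."""
    n = len(K)
    col_mean = [sum(w[k] * K[k][j] for k in range(n)) for j in range(n)]
    row_mean = [sum(K[i][l] * w[l] for l in range(n)) for i in range(n)]
    total = sum(w[k] * row_mean[k] for k in range(n))
    return [[K[i][j] - col_mean[j] - row_mean[i] + total for j in range(n)]
            for i in range(n)]


def hsic(xs, ys, weights=None, bandwidth=1.0):
    """Biased HSIC estimate; uniform weights give the standard estimator."""
    n = len(xs)
    w = weights if weights is not None else [1.0 / n] * n
    Kc = center(gaussian_kernel(xs, bandwidth), w)
    Lc = center(gaussian_kernel(ys, bandwidth), w)
    return sum(w[i] * w[j] * Kc[i][j] * Lc[i][j]
               for i in range(n) for j in range(n))


# Toy data (assumed for illustration): y_dep depends nonlinearly on x,
# y_ind is drawn independently of x.
rng = random.Random(0)
xs = [rng.uniform(-2.0, 2.0) for _ in range(120)]
y_dep = [x * x + rng.gauss(0.0, 0.1) for x in xs]
y_ind = [rng.uniform(0.0, 4.0) for _ in xs]

h_dep = hsic(xs, y_dep)
h_ind = hsic(xs, y_ind)
print(h_dep, h_ind)
```

On this toy data the dependent pair should score noticeably higher than the independent pair. Passing non-uniform `weights` would emphasize chosen samples, which is the general idea behind importance reweighting, though how the weights are chosen is the substance of the actual method and is not reproduced here.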

Overcoming the curse of dimensionality

MBZUAI ·

MBZUAI Professor Fakhri Karray and co-authors from the University of Waterloo have published "Elements of Dimensionality Reduction and Manifold Learning," a textbook on methods for extracting useful components from large datasets. The book addresses the challenge of the "curse of dimensionality," where growth in the number of features (dimensions) in a dataset complicates its use in machine learning. Karray developed the material from a popular course he taught at Waterloo. Why it matters: The textbook provides a unified resource for students and researchers in machine learning and AI, addressing a foundational challenge in processing high-dimensional data, relevant to diverse applications in the region.
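One common way to illustrate the curse of dimensionality is distance concentration: as the number of dimensions grows, the nearest and farthest points from a query become almost equally far away. The sketch below is not taken from the textbook; the function name, sample size, and dimensions are assumptions chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(max distance - min distance) / min distance from a random query
    point to n_points uniform random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, [rng.random() for _ in range(dim)])
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)


low_dim_contrast = relative_contrast(2)
high_dim_contrast = relative_contrast(500)
print(low_dim_contrast, high_dim_contrast)
```

The contrast in 2 dimensions should be far larger than in 500 dimensions, which is one reason nearest-neighbor style methods degrade on raw high-dimensional data and why dimensionality reduction is often applied first.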

Documenting the 'dodos' of tomorrow

KAUST ·

Dr. Gustav Paulay from the Florida Museum of Natural History spoke at KAUST in 2018 about the surprisingly low level of knowledge about marine biodiversity. He noted that only a fraction of the millions of marine species are currently known and described. Paulay highlighted the effectiveness of large-scale biodiversity surveys and the use of technology like mass sampling and DNA analysis to speed up species identification. Why it matters: Understanding and documenting marine biodiversity is crucial for conservation efforts and for leveraging the potential of marine resources in the Red Sea region and beyond.

Neural Bayes estimators for censored inference with peaks-over-threshold models

arXiv ·

This paper introduces neural Bayes estimators for censored peaks-over-threshold models, enhancing computational efficiency in spatial extremal dependence modeling. The method uses data augmentation to encode censoring information in the neural network input, challenging traditional likelihood-based approaches. The estimators were applied to assess extreme particulate matter concentrations over Saudi Arabia, demonstrating efficacy in high-dimensional models. Why it matters: The research offers a computationally efficient alternative for environmental modeling and risk assessment in the region.
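As a rough illustration of encoding censoring information in a network input, the sketch below replaces observations at or below a threshold with a constant and appends a binary indicator channel marking which entries were censored. This is an assumed encoding for illustration only; the function name, the fill value, and the two-channel layout are hypothetical and may differ from what the paper implements.

```python
def encode_censoring(values, threshold, fill_value=0.0):
    """Return two channels: the data with censored entries replaced by
    fill_value, and a 0/1 indicator marking the censored entries."""
    masked = [v if v > threshold else fill_value for v in values]
    indicator = [1.0 if v <= threshold else 0.0 for v in values]
    return [masked, indicator]


# Hypothetical observations at three sites, censored below a threshold of 1.0.
channels = encode_censoring([0.2, 3.1, 1.5], threshold=1.0)
print(channels)
```

A network that receives both channels can, in principle, learn to treat the placeholder values differently from genuine observations, without a censored likelihood being evaluated at inference time.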

Are there really plenty more fish in the sea?

KAUST ·

KAUST researchers are developing an AI tool to classify fish species based on economic value and population growth rate, aiming to aid sustainable fisheries management in Saudi Arabia. The tool will help identify species at risk of decline, supporting marine conservation and food security goals outlined in Saudi Vision 2030. Saudi Arabia aims to increase self-sufficiency in seafood production amid declining Red Sea fish populations. Why it matters: This initiative could significantly improve fisheries management and conservation efforts in the Red Sea, informing policy decisions and supporting sustainable food production in line with national objectives.