MBZUAI Visiting Professor Haiyan Huang is working on bridging biology and AI by incorporating domain knowledge into modeling frameworks. She combines statistical principles, AI tools, and domain expertise to develop scientifically informed and statistically grounded methods. Her work addresses the challenge of extracting meaningful signals from complex biological data. Why it matters: This interdisciplinary approach can lead to more accurate and useful AI models for biological research and healthcare applications in the region.
Researchers at the Rosalind Franklin Institute are using generative AI, including GANs, to augment limited biological datasets, specifically mirtron data from mirtronDB. The synthetic data mimics real-world samples, which allows machine learning models to be trained more comprehensively and leads to improved mirtron identification tools. They also plan to apply Large Language Models (LLMs) to predict unknown patterns in sequence and structural biology problems. Why it matters: This research explores AI techniques to tackle data scarcity in biological research, potentially accelerating discoveries in noncoding RNA and transposable elements.
MBZUAI Professor Kun Zhang is working on applying AI to understand cause-and-effect relationships in biology, with the goal of accelerating scientific discovery and improving human health. He aims to develop foundation models for biology that can process diverse data types and provide insights into the causes and treatments of health problems. These models could help scientists develop new medicines and preventative measures for diseases. Why it matters: This research has the potential to significantly advance the field of medicine by enabling a deeper understanding of the complex biological processes that underlie disease.
Professor Eran Segal presented The Human Phenotype Project, a longitudinal cohort study with over 10,000 participants. The project aims to identify molecular markers and develop prediction models for disease using deep profiling techniques including medical history, lifestyle, blood tests, and microbiome analysis. The study provides insights into drivers of obesity, diabetes, and heart disease, identifying novel markers at the microbiome, metabolite, and immune system level. Why it matters: Such large-scale phenotyping initiatives could inform personalized medicine approaches relevant to the Middle East's specific health challenges.
KAUST alumna Sara Althubaiti (M.S. '18) is now a computer science Ph.D. student in the Bio-Ontology Research Group, focusing on using AI to prioritize cancer mutations and predict new disease treatments. Her work involves understanding the relationship between drug side effects and disease phenotypes. Althubaiti aims to continue in academia after her Ph.D., contributing to research in Saudi universities. Why it matters: This highlights KAUST's role in fostering local talent and contributing to advancements in AI-driven healthcare research within the Kingdom.
The AI4Bio Workshop at MBZUAI explored the intersection of AI and biology, focusing on AI-driven virtual organisms and foundation models. Eric Xing presented his vision of using AI to simulate biological activities, offering a safer alternative to physical experiments. Researchers such as Le Song and Jean-Philippe Vert are developing foundation models for biological systems to enhance drug discovery and bioengineering. Why it matters: This signals the growing importance of AI in advancing biological research and healthcare innovation within the UAE and globally.
The paper introduces Guided Deep List, a tool for automating the generation of epidemiological line lists from open-source reports. The tool uses distributed vector representations and dependency parsing to extract tabular data on disease outbreaks. It was evaluated on MERS outbreak data from Saudi Arabia, where it showed improved accuracy over baseline methods and enabled epidemiological inferences.