MBZUAI Assistant Professor Samuel Horváth is researching federated learning to address the tension between data privacy and the predictive power of machine learning models. Federated learning trains models on decentralized data, keeping sensitive information on devices. Horváth's research focuses on designing algorithms that can efficiently train on distributed data while respecting user privacy. Why it matters: This work is crucial for advancing AI in sensitive domains like healthcare, where privacy regulations limit centralized data collection.
Researchers from MBZUAI introduce Forget-MI, a machine unlearning method tailored for multimodal medical data, enhancing privacy by removing specific patient data from AI models. Forget-MI utilizes loss functions and perturbation techniques to unlearn both unimodal and joint data representations. The method demonstrates superior performance in reducing Membership Inference Attacks and improving data removal compared to existing techniques, while preserving overall model performance and enabling data forgetting.
Xi Chen from NYU Stern gave a talk at MBZUAI on digital privacy in personalized pricing using differential privacy. The talk also covered research in Web3 and decentralized finance, including delta hedging liquidity positions on Uniswap V3. Chen highlighted open problems in decentralized finance during the presentation. Why it matters: The talk suggests MBZUAI's interest in exploring the intersection of AI, privacy, and blockchain technologies, reflecting growing trends in data protection and decentralized systems.
MBZUAI researchers Darya Taratynova and Shahad Hardan developed Forget-MI, a method for making clinical AI models "unlearn" specific patient data without retraining the entire model. Forget-MI addresses the challenge of removing patient data from AI models trained on multimodal records (like chest X-rays and reports) due to regulations like GDPR and HIPAA. The method unlearns both unimodal (image or text) and joint (image-text) associations while retaining overall accuracy using a late-fusion multimodal classifier. Why it matters: This research provides a practical solution to a critical privacy concern in healthcare AI, enabling compliance with data protection regulations and fostering trust in AI-driven medical applications.
A new dataset called the Saudi Privacy Policy Dataset is introduced, which contains Arabic privacy policies from various sectors in Saudi Arabia. The dataset is annotated based on the 10 principles of the Personal Data Protection Law (PDPL) and includes 1,000 websites, 4,638 lines of text, and 775,370 tokens. The dataset aims to facilitate research and development in privacy policy analysis, NLP, and machine learning applications related to data protection.
MBZUAI researchers developed FeSViBS, a new federated split learning technique for vision transformers that addresses data scarcity and privacy concerns in healthcare image classification. The method combines federated learning and split learning to train models collaboratively without sharing sensitive patient data directly. It overcomes limitations of traditional centralized training and vulnerabilities in federated learning. Why it matters: This approach enables the development of AI-powered healthcare applications while adhering to stringent data privacy regulations, unlocking the potential of machine learning in medical imaging.