Results for "ASR"

QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus

arXiv · Jun 24

The Qatar Computing Research Institute (QCRI) has released QASR, a 2,000-hour transcribed Arabic speech corpus collected from Aljazeera news broadcasts. The dataset features multi-dialect speech sampled at 16 kHz, aligned with lightly supervised transcriptions and segmented along linguistically motivated boundaries. QCRI also released a 130M-word text dataset to support language model training. Why it matters: QASR enables new research in Arabic speech recognition, dialect identification, punctuation restoration, and other NLP tasks for spoken data.

Processing language like a human

MBZUAI

MBZUAI's Hanan Al Darmaki is working to improve automatic speech recognition (ASR) for low-resource languages, where labeled data is scarce. She notes that Arabic presents unique challenges because of its dialectal variation and the lack of written resources corresponding to spoken dialects. Al Darmaki's research focuses on unsupervised speech recognition to address this gap. Why it matters: Overcoming these challenges can make virtual assistants more effective across diverse languages and enable more inclusive AI applications in the Arabic-speaking world.

Machine learning improves Arabic speech transcription capabilities - MIT Technology Review

Qatar Foundation · Nov 24

MIT Technology Review reports on advancements in machine learning techniques that are significantly improving Arabic speech transcription capabilities. These developments aim to enhance the accuracy and robustness of Automatic Speech Recognition (ASR) systems for the complexities of the Arabic language, including its various dialects. The improvements are designed to overcome previous challenges in processing diverse phonetic patterns and linguistic nuances. Why it matters: This progress is vital for the development of more effective voice-enabled technologies, accessibility tools, and AI applications specifically tailored for Arabic-speaking populations in the Middle East and beyond.