GCC AI Research

NatiQ: An End-to-end Text-to-Speech System for Arabic

arXiv · Significant research

Summary

Qatar Computing Research Institute (QCRI) has developed NatiQ, an end-to-end text-to-speech (TTS) system for Arabic built on encoder-decoder architectures. The system uses Tacotron-based and Transformer models to generate mel-spectrograms, which vocoders such as WaveRNN, WaveGlow, and Parallel WaveGAN then synthesize into waveforms. Trained on in-house speech data featuring a neutral male voice (Hamza) and an expressive female voice (Amina), NatiQ achieves Mean Opinion Scores (MOS) of 4.21 and 4.40, respectively. Why it matters: This research advances Arabic language technology by providing high-quality speech synthesis that can make digital content more accessible and usable for Arabic speakers.
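For readers unfamiliar with the metric, a Mean Opinion Score is typically the arithmetic mean of listener ratings on a 1 to 5 scale. The sketch below shows that calculation with a normal-approximation 95% confidence interval. The function name `mean_opinion_score` and the ratings are hypothetical and made up for illustration; they are not data from the NatiQ evaluation, and the paper's exact rating protocol may differ.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation interval; undefined for a single rating, so use 0.0.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Illustrative ratings only (assumption, not results from the paper).
ratings = [4, 5, 4, 4, 5, 3, 4, 5]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean helps when comparing two voices, since small differences in MOS can fall within listener noise.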

Keywords

Text-to-speech · Arabic · TTS · QCRI · NatiQ


Related

SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs

arXiv

The Qatar Computing Research Institute (QCRI) has released SpokenNativQA, a multilingual spoken question-answering dataset for evaluating LLMs in conversational settings. The dataset contains 33,000 naturally spoken questions and answers across multiple languages, including low-resource and dialect-rich languages. It aims to address the limitations of text-based QA datasets by incorporating speech variability, accents, and linguistic diversity. Why it matters: This benchmark enables more robust evaluation of LLMs in speech-based interactions, particularly for Arabic dialects and other low-resource languages.