Javaabu releases 16.5-hour Dhivehi speech dataset to support AI research

Malé, Maldives, July 23, 2023: Javaabu has released a new single-speaker Dhivehi speech dataset designed to support researchers and developers working on speech technologies for the Dhivehi language. The dataset contains approximately 16.5 hours of narrated audio contributed by voice artist Muhammad Shaafiu.

The team says the dataset is intended to accelerate work on Dhivehi speech-to-text (STT) and automatic speech recognition (ASR) systems, which remain underdeveloped compared with those available for more widely spoken languages. By making high-quality Dhivehi audio resources publicly available, Javaabu aims to help advance local AI research and strengthen the ecosystem around Dhivehi digital tools.

The dataset is publicly accessible through Hugging Face, one of the world’s leading repositories for machine learning resources.