Open Speech and Language Resources

High-quality TTS data for Javanese.

Identifier: SLR41

Summary: Multi-speaker TTS data for Javanese (jv-ID)

Category: Speech

License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0)

Downloads (use a mirror closer to you):
[967M]   (Javanese data from female speakers)   Mirrors: [China]
[923M]   (Javanese data from female speakers)   Mirrors: [China]
LICENSE [20K]   (License information)   Mirrors: [China]

About this resource:

This data set contains high-quality transcribed audio data for Javanese. It consists of wave files and a TSV file, line_index.tsv. Each line of line_index.tsv gives a filename and the transcription of the audio in that file. Each filename is prefixed with a speaker identification number.
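
The layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the official distribution. It assumes that line_index.tsv is tab-separated with the filename in the first column and the transcription in the last, and that the speaker identifier is the part of the filename before its final underscore. The function names and the sample rows are hypothetical, made up for illustration only; check these assumptions against the actual file before relying on them.

```python
import csv
import io
from collections import defaultdict


def read_line_index(handle):
    """Yield (filename, transcription) pairs from a line_index.tsv-style stream.

    Assumption: tab-separated, filename first, transcription last.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        yield row[0].strip(), row[-1].strip()


def speaker_prefix(filename):
    """Return the assumed speaker prefix of a filename.

    Assumption: the speaker ID is everything before the last underscore.
    If there is no underscore, the whole filename is returned.
    """
    prefix, sep, _ = filename.rpartition("_")
    return prefix if sep else filename


def group_by_speaker(pairs):
    """Group (filename, transcription) pairs by their speaker prefix."""
    groups = defaultdict(list)
    for filename, text in pairs:
        groups[speaker_prefix(filename)].append((filename, text))
    return dict(groups)


# Made-up sample rows (not taken from the data set).
sample = (
    "spk01_0001\tfirst placeholder sentence\n"
    "spk01_0002\tsecond placeholder sentence\n"
    "spk02_0001\tthird placeholder sentence\n"
)
by_speaker = group_by_speaker(read_line_index(io.StringIO(sample)))
for speaker, utterances in sorted(by_speaker.items()):
    print(speaker, len(utterances))
```

To read the real index, pass an open file handle instead of the in-memory sample, for example `open("line_index.tsv", encoding="utf-8", newline="")`.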

The data set has been manually quality checked, but there might still be errors.

This data set was collected by Google in collaboration with Gadjah Mada University in Indonesia.

See the LICENSE file for license information.

Copyright 2016, 2017, 2018 Google LLC

If you use this data in publications, please cite it as follows:

    title = {{A Step-by-Step Process for Building TTS Voices Using Open Source Data and Framework for Bangla, Javanese, Khmer, Nepali, Sinhala, and Sundanese}},
    author = {Keshan Sodimana and Knot Pipatsrisawat and Linne Ha and Martin Jansche and Oddur Kjartansson and Pasindu De Silva and Supheakmungkol Sarin},
    booktitle = {Proc. The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU)},
    year  = {2018},
    address = {Gurugram, India},
    month = aug,
    pages = {66--70},
    URL   = {}