Open Speech and Language Resources

Phone: 425 247 4129
(Daniel Povey)


Identifier: SLR6

Summary: English and Czech data, mirrored from the Vystadial project

Category: Speech

License: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)

data_voip_cs.tgz [1.5G]   ( Czech speech and transcripts )
data_voip_en.tgz [2.7G]   ( English speech and transcripts )
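
Once an archive has been downloaded, it can be unpacked with any standard tar tool. The following is a minimal Python sketch, assuming the `.tgz` files listed above are ordinary gzip-compressed tar archives saved in the current directory. The function name `extract_archive` and the destination directory name are hypothetical, and the internal layout of the archives is not described on this page, so the sketch makes no assumptions about it.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive; return paths of extracted files."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, mode="r:gz") as tar:
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            # Skip any member that would be written outside dest_dir.
            if dest != target and dest not in target.parents:
                continue
            # This sketch extracts only regular files and directories.
            if not (member.isfile() or member.isdir()):
                continue
            tar.extract(member, path=dest)
            if member.isfile():
                extracted.append(target)
    return extracted


# Example call (assumes data_voip_en.tgz is already in the working directory;
# "vystadial_en" is an arbitrary output directory name):
#   files = extract_archive("data_voip_en.tgz", "vystadial_en")
#   print(len(files), "files extracted")
```

The path check is a precaution that is reasonable for any archive fetched over the network; it is not a statement about the contents of these particular files.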

About this resource:

This resource contains transcribed telephone conversation data in English and Czech.

The data collection process and the development of these training scripts were partly funded by the Ministry of Education, Youth and Sports of the Czech Republic under grant agreement LK11221 and by core research funding of Charles University in Prague.

You can cite the data using the following BibTeX entry:

  title={{Free English and Czech telephone speech corpus shared under the CC-BY-SA 3.0 license}},
  author={Korvas, Mat\v{e}j and Pl\'{a}tek, Ond\v{r}ej and Du\v{s}ek, Ond\v{r}ej and \v{Z}ilka, Luk\'{a}\v{s} and Jur\v{c}\'{i}\v{c}ek, Filip},
  booktitle={Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)},
  pages={To Appear},

External URLs: (Czech data) (English data)