Open Speech and Language Resources



Contact
dpovey@gmail.com
Phone: 425 247 4129
(Daniel Povey)

TED-LIUM

Identifier: SLR7

Summary: English speech recognition training corpus built from TED talks, created by the Laboratoire d’Informatique de l’Université du Maine (LIUM) and mirrored here

Category: Speech

License: Creative Commons BY-NC-ND 3.0 (attribution/non-commercial/no-derivatives).

Download: TEDLIUM_release1.tar.gz [21G] (the first release)

About this resource:

The TED-LIUM corpus (mirrored here) consists of English-language TED talks with transcriptions. The audio is sampled at 16 kHz, and the corpus contains about 118 hours of speech.
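
Because the archive is large (21G), it can be useful to inspect it before unpacking. The following is a minimal sketch, not part of the original resource: it assumes the archive has already been downloaded locally, and the function name summarize_archive is hypothetical. It counts the regular files in a gzip-compressed tar archive by file extension without extracting anything. Which extensions appear depends on the archive's actual contents, which this page does not describe.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(archive_path):
    """Count regular files in a gzip-compressed tar archive by file extension.

    Hypothetical helper for illustration; nothing is extracted to disk.
    """
    counts = Counter()
    # "r:gz" opens a gzip-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:gz") as archive:
        # Iterating the archive reads member headers one at a time.
        for member in archive:
            if member.isfile():
                # Files without an extension are grouped under "(none)".
                suffix = PurePosixPath(member.name).suffix or "(none)"
                counts[suffix] += 1
    return counts
```

Calling summarize_archive("TEDLIUM_release1.tar.gz") (the path is an assumption about where the download was saved) returns a Counter mapping each extension to a file count. Scanning a gzip-compressed archive of this size still requires decompressing the whole stream, so it may take a while.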

The original page requests that you cite the following paper if you make use of this corpus:

A. Rousseau, P. Deléglise, and Y. Estève, "TED-LIUM: an automatic speech recognition dedicated corpus",
in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), May 2012.

External URL: http://www-lium.univ-lemans.fr/en/content/ted-lium-corpus (original source)