This is the TED-LIUM corpus release 3, licensed under Creative Commons BY-NC-ND 3.0.

All talks and text are property of TED Conferences LLC.

This new TED-LIUM release was made through a collaboration between the Ubiqus company and the LIUM (University of Le Mans, France).


Two corpus distributions:

More details are given in this paper:

François Hernandez, Vincent Nguyen, Sahar Ghannay, Natalia Tomashenko, and Yannick Estève, “TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation”, submitted to the 20th International Conference on Speech and Computer (SPECOM 2018), September 2018, Leipzig, Germany

A preprint version is available on arXiv (and in the doc/ directory).

Source page of this corpus: here.

SPH format info:
Channels: 1
Sample Rate: 16000
Precision: 16-bit
Bit Rate: 256k
Sample Encoding: 16-bit Signed Integer PCM
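
The values above are consistent with each other: 16000 samples per second x 16 bits x 1 channel = 256000 bits per second. As a rough illustration, the Python sketch below parses a NIST SPHERE style ASCII header and recomputes that bit rate. The header used here is a synthetic example built in memory, not one copied from the corpus; the field names (channel_count, sample_rate, sample_n_bytes, sample_coding) follow the usual SPHERE conventions, and real corpus files may carry additional fields. The function names are hypothetical helpers, not part of any corpus tooling.

```python
def parse_sphere_header(raw: bytes) -> dict:
    """Parse the ASCII header block of a NIST SPHERE (.sph) file.

    Assumes the usual layout: a "NIST_1A" line, a header-size line,
    then "name -type value" lines terminated by "end_head".
    """
    lines = raw.decode("ascii", errors="replace").split("\n")
    if lines[0].strip() != "NIST_1A":
        raise ValueError("not a NIST SPHERE header")

    fields = {"header_size": int(lines[1].strip())}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3 or line.startswith(";"):
            continue  # skip blank, comment, or malformed lines
        name, type_tag, value = parts
        if type_tag == "-i":        # integer field
            fields[name] = int(value)
        elif type_tag == "-r":      # real-valued field
            fields[name] = float(value)
        else:                       # string field, e.g. "-s3 pcm"
            fields[name] = value
    return fields


def bit_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate * bits_per_sample * channels


# Synthetic header mirroring the values listed in this README (an assumption,
# not bytes taken from an actual corpus file).
example_header = (
    b"NIST_1A\n"
    b"   1024\n"
    b"channel_count -i 1\n"
    b"sample_rate -i 16000\n"
    b"sample_n_bytes -i 2\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
)

info = parse_sphere_header(example_header)
rate = bit_rate(info["sample_rate"], 8 * info["sample_n_bytes"], info["channel_count"])
print(info)
print(rate)  # 256000
```

To try this on a real file, one could read the first header_size bytes (commonly 1024) of an .sph file in binary mode and pass them to parse_sphere_header.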