This dataset contains 10,083 utterances in French, Maninka, Pular, and Susu, recorded on a variety of devices by 49 speakers (16 female and 33 male) aged 5 to 76.

Please see our paper for more details on this dataset. Additional resources can be found in the following git repository: https://github.com/mdoumbouya/nicolingua

To cite our work, use the following BibTeX entry:

@inproceedings{doumbouya2021usingradio,
  title={Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users},
  author={Doumbouya, Moussa and Einstein, Lisa and Piech, Chris},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  year={2021}
}