LibriSpeech-PC: A dataset based on LibriSpeech* with restored punctuation and capitalization.

*V. Panayotov, G. Chen, D. Povey and S. Khudanpur, "LibriSpeech: An ASR corpus based on public domain audio books," 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia, 2015, pp. 5206-5210, doi: 10.1109/ICASSP.2015.7178964.

You can cite the data using the following BibTeX entry:

@article{meister2023librispeechpc,
  title={LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models},
  author={A. Meister and M. Novikov and N. Karpov and E. Bakhturina and V. Lavrukhin and B. Ginsburg},
  journal={arXiv preprint arXiv:2310.02943},
  year={2023},
}