Audio & Speech Datasets

Free datasets for speech recognition, music analysis, sound classification, and audio processing.

Speech Recognition

| Dataset | Hours | Size | Language | License | Link |
|---|---|---|---|---|---|
| LibriSpeech | 1,000 | 60 GB | English | CC-BY-4.0 | openslr.org |
| Common Voice | 19,000+ | 70+ GB | 100+ languages | CC0 | commonvoice.mozilla.org |
| VoxPopuli | 400K+ | 1.8 TB | 23 EU languages | CC0 | HuggingFace |
| GigaSpeech | 10,000 | 700 GB | English | Apache-2.0 | github.com/SpeechColab |
| TED-LIUM | 452 | 35 GB | English | CC-BY-NC-ND | openslr.org |
| FLEURS | 12+ per language | 350 GB | 102 languages | CC-BY-4.0 | HuggingFace |
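
The License column is often the first thing to check when shortlisting a corpus. As a minimal sketch, the snippet below filters the speech recognition rows above down to those whose license tag carries no NonCommercial (NC) or NoDerivatives (ND) clause. The names `SPEECH_DATASETS`, `is_restricted`, and `unrestricted` are hypothetical and only illustrate the idea. The check assumes the short license tags shown in the table, so each dataset's full license text should still be read before use.

```python
# Rows copied from the Speech Recognition table above: (dataset, license tag).
SPEECH_DATASETS = [
    ("LibriSpeech", "CC-BY-4.0"),
    ("Common Voice", "CC0"),
    ("VoxPopuli", "CC0"),
    ("GigaSpeech", "Apache-2.0"),
    ("TED-LIUM", "CC-BY-NC-ND"),
    ("FLEURS", "CC-BY-4.0"),
]


def is_restricted(license_tag: str) -> bool:
    """Return True if the tag contains an NC or ND clause (assumed convention)."""
    parts = license_tag.upper().split("-")
    return "NC" in parts or "ND" in parts


def unrestricted(rows):
    """Keep the names of datasets whose license tag has no NC/ND clause."""
    return [name for name, tag in rows if not is_restricted(tag)]


if __name__ == "__main__":
    for name in unrestricted(SPEECH_DATASETS):
        print(name)
```

Matching on whole hyphen-separated parts rather than substrings avoids false positives if a license name happens to contain the letters "nc" or "nd" inside a longer word.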

Sound Classification

| Dataset | Clips | Classes | Size | License | Link |
|---|---|---|---|---|---|
| AudioSet | 2M+ | 632 | YouTube IDs | CC-BY-4.0 | research.google.com |
| ESC-50 | 2,000 | 50 | 600 MB | CC-BY-NC | github.com/karolpiczak |
| UrbanSound8K | 8,732 | 10 | 6.8 GB | CC-BY-NC | urbansounddataset.weebly.com |
| FSD50K | 51,197 | 200 | 27 GB | CC-BY-4.0 | zenodo.org |

Music

| Dataset | Tracks | Size | License | Link |
|---|---|---|---|---|
| GTZAN | 1,000 | 1.2 GB | Research | marsyas.info |
| MagnaTagATune | 25,863 | 3 GB | Research | mirg.city.ac.uk |
| Free Music Archive | 106,574 | 879 GB | CC variants | github.com/mdeff/fma |
| MusicNet | 330 | 11 GB | CC-BY-4.0 | zenodo.org |

Speaker Recognition

| Dataset | Speakers | Hours | License | Link |
|---|---|---|---|---|
| VoxCeleb1 | 1,251 | 352 | CC-BY-4.0 | robots.ox.ac.uk |
| VoxCeleb2 | 6,112 | 2,442 | CC-BY-4.0 | robots.ox.ac.uk |
| VCTK | 110 | 44 | CC-BY-4.0 | datashare.ed.ac.uk |