Where to get audio for machine learning: a selection of open libraries licensed under Creative Commons

A small digest for those who develop machine learning models.

Below: datasets with speech, music and the noise of industrial machinery.





AudioSet


This dataset is maintained by engineers from the Machine Perception lab at Google. It contains more than two million sound clips, each up to ten seconds long, taken from YouTube videos. The clips are labeled using a set of 632 classes that describe what is heard in the video. Here are just a few examples: music, laughter, snoring, an explosion, the noise of a lawn mower, the murmur of a stream, the barking of a dog.

AudioSet offers three sets: evaluation (test), balanced training and unbalanced training. The first includes 20,383 video segments labeled with 527 sound classes, each of which is represented by at least 59 clips. The balanced set is built the same way as the evaluation set, with one difference: it has 22,176 segments. The unbalanced set contains the remaining segments, roughly two million, with no attempt to balance the classes.
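The "at least 59 clips per class" property can be checked on any labeled subset by counting how many segments carry each label. Below is a minimal sketch; the function names and the toy label lists are hypothetical and only illustrate the counting, they are not part of AudioSet's own tooling.

```python
from collections import Counter


def class_counts(segment_labels):
    """Count how many segments mention each class label."""
    counts = Counter()
    for labels in segment_labels:
        # A segment can carry several labels; count each class once per segment.
        counts.update(set(labels))
    return counts


def underrepresented(counts, minimum):
    """Return the labels that appear in fewer than `minimum` segments."""
    return sorted(label for label, n in counts.items() if n < minimum)


# Toy data (hypothetical): one list of labels per segment.
toy_segments = [
    ["music", "laughter"],
    ["music"],
    ["dog_bark", "music"],
    ["laughter"],
]

counts = class_counts(toy_segments)
rare = underrepresented(counts, minimum=2)
print(counts, rare)
```

Note that this only sees classes that occur at least once; a class with zero segments would have to be detected by comparing against the full label list.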

The data is available for download in two formats: as text CSV files and as audio features extracted from the videos by a convolutional neural network. To download the source videos themselves, you can use the Python tool youtube-dl. The dataset is licensed under CC BY 4.0. Updates are announced in the Google group audioset-users.
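As an illustration of working with the CSV files, here is a minimal parsing sketch. It assumes the segment files use comment lines starting with `#` followed by rows of the form `YTID, start_seconds, end_seconds, "label1,label2"`; check this against the files you actually download. The sample row and video ID are made up.

```python
import csv
import io


def parse_segments(lines):
    """Yield (youtube_id, start_s, end_s, labels) tuples from segment CSV lines.

    Assumed layout: '#' comment lines, then rows with four fields,
    the last one being a quoted, comma-separated list of label IDs.
    """
    data_lines = (
        line for line in lines
        if line.strip() and not line.startswith("#")
    )
    # skipinitialspace lets the reader handle the space after each comma,
    # including the one in front of the quoted label field.
    for ytid, start, end, labels in csv.reader(data_lines, skipinitialspace=True):
        yield ytid, float(start), float(end), labels.split(",")


# Made-up sample in the assumed format.
SAMPLE = io.StringIO(
    "# YTID, start_seconds, end_seconds, positive_labels\n"
    'abc123DEF45, 30.000, 40.000, "/m/09x0r,/t/dd00088"\n'
)

segments = list(parse_segments(SAMPLE))
print(segments)
```

With a real file, pass an open file object instead of `SAMPLE`. The video ID and time offsets are what a downloader such as youtube-dl would need to fetch and trim each clip.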



MIMII Dataset


Engineers at Hitachi released a collection of audio recordings of industrial equipment in operation. The dataset is intended for developing machine learning models that detect malfunctions in industrial machinery. It contains the sounds of valves, pumps, fans and slide rails. More than 26 thousand ten-second samples capture equipment operating normally.

Another 6 thousand files are recordings of machines operating in faulty conditions: without lubrication, with broken blades or with damaged rails.

All recordings are WAV files with a sampling rate of 16 kHz, and their total size exceeds 150 GB. Example recordings can be listened to online. The dataset is licensed under CC BY-SA.
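Before training on WAV data like this, it is worth checking that each file has the sampling rate and duration you expect. The sketch below uses only Python's standard `wave` module. To stay self-contained it generates its own ten-second, 16 kHz mono test tone in memory; that tone, the mono layout and the 16-bit sample width are assumptions for the demo, not properties of the MIMII files.

```python
import io
import math
import struct
import wave


def describe_wav(source):
    """Return sample rate, channel count and duration of a WAV file or file object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
        }


def make_test_tone(seconds=10, rate=16000, freq=440.0):
    """Build an in-memory 16-bit mono WAV with a sine tone (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        samples = (
            int(12000 * math.sin(2 * math.pi * freq * n / rate))
            for n in range(seconds * rate)
        )
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    buf.seek(0)
    return buf


info = describe_wav(make_test_tone())
print(info)
```

For real recordings, pass a file path to `describe_wav` and flag any file whose `sample_rate` or `duration_s` differs from what the dataset description promises.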







LibriSpeech


This dataset includes a thousand hours of English speech sampled at 16 kHz. It is maintained by engineers Vassil Panayotov and Daniel Povey of Johns Hopkins University. The data comes from audiobooks created by the nonprofit LibriVox project, in which volunteers read aloud texts that are in the public domain in the USA, for example from Project Gutenberg.

In addition to the dataset itself, the site offers all the original MP3 recordings (87 GB) and their metadata for download. The dataset is licensed under CC BY 4.0. Acoustic models trained on this dataset are available for evaluation at kaldi-asr.org.



Million Song Dataset


A free collection of audio features and metadata for a million popular tracks. It does not contain the audio recordings themselves, but the audio can be fetched using code provided by the developers. The project was created by LabROSA at Columbia University together with The Echo Nest, with support from the US National Science Foundation, the agency that funds science and technology research in the country. Some of the first data for the dataset was provided by The Echo Nest, a music analytics platform that Spotify acquired in 2014. Last.fm, Musixmatch and SecondHandSongs also contributed.

The full dataset is about 300 GB, but the authors also offer a small test subset of 10 thousand songs, which is 1.8 GB. Each track is described by fields such as artist, genre, release date and mood.





