AUTOMATIC MUSIC GENRE CLASSIFICATION USING MACHINE LEARNING

Authors

  • Nemanja Rašajski Author

DOI:

https://doi.org/10.24867/01BE05Rasajski

Keywords:

automatic music genre classification, machine learning, GTZAN dataset

Abstract

Music genres are conventional categories used to describe music. Today they are most often used to organize growing music collections, making access and recommendation easier. This paper analyzes several methods for automatic music genre classification: convolutional neural networks (CNN), recurrent neural networks (RNN), support vector machines (SVM), random forests (RF), AdaBoost, a voting classifier, and one-versus-rest (OVR). The audio features were mel-frequency cepstral coefficients (MFCC) and spectrograms, the latter used in combination with CNNs. The accuracy achieved (~60%) falls short of human-level accuracy (~70%), but not by much, and it is in line with other similar approaches. The methods presented in this paper can therefore be used for automatic music classification by radio stations or web portals that distribute and recommend music.
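
As a rough illustration of the classical (non-neural) classifiers named in the abstract, the sketch below compares an SVM wrapped in one-versus-rest, a random forest, AdaBoost, and a hard-voting ensemble of the three using scikit-learn. It is not the paper's implementation: the features are synthetic stand-ins for per-track MFCC summary vectors rather than GTZAN data, and the feature count, hyperparameters, and train/test split are all assumptions chosen for illustration. The CNN and RNN models are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              VotingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for MFCC summary vectors (NOT real audio features):
# 1000 "tracks", 26 features (assumed: mean and std of 13 MFCCs), 10 "genres".
X, y = make_classification(
    n_samples=1000, n_features=26, n_informative=15, n_redundant=0,
    n_classes=10, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# SVMs are scale-sensitive, so standardize features before the OVR-wrapped SVC.
svm_ovr = make_pipeline(
    StandardScaler(),
    OneVsRestClassifier(SVC(kernel="rbf", random_state=0)),
)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
ada = AdaBoostClassifier(n_estimators=50, random_state=0)

# Hard voting: each model casts one vote per track, the majority label wins.
voting = VotingClassifier(
    estimators=[("svm_ovr", svm_ovr), ("rf", forest), ("ada", ada)],
    voting="hard",
)

models = {"svm_ovr": svm_ovr, "rf": forest, "ada": ada, "voting": voting}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # plain accuracy on held-out data

for name, acc in scores.items():
    print(f"{name:8s} accuracy = {acc:.3f}")
```

Accuracies on this synthetic data say nothing about the results reported in the paper. With real audio, the feature matrix would instead be built from MFCCs extracted from each track.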

Published

2018-12-19

Issue

Section

Electrotechnical and Computer Engineering