Domen Strupeh (2010) Classification of vocal folk music recordings. EngD thesis.
Abstract
This thesis presents the automatic recognition of groups in singing recordings. Classifying audio recordings, or parts of them, into predefined classes is particularly useful for large collections of recordings that carry a variety of useful research information, because manual annotation of the recordings could then be replaced by automatic annotation. The classification accuracy, however, depends to a great extent on the type of recordings being classified. My research focuses on singing, taking into account the specific characteristics of ethnomusicological collections of recordings. Two classification systems based on acoustic pattern recognition are implemented. The first uses support vector machines (SVM) and the second uses Gaussian mixture models (GMM). In both systems a binary classifier distinguishes either solo singing from multi-voice singing, or singing from singing with accompaniment. Classification is based on Mel-frequency cepstral coefficients (MFCC) and delta-MFCC, which are a compact representation of tone colour and are frequently used in the automatic classification of recordings. Tests showed that the classification accuracy depends on the number of MFCC coefficients used, while the choice of classifier has no significant impact. For solo singing versus multi-voice singing, the implemented system correctly classified 78.6% of instances using the GMM method; for singing versus singing with accompaniment, it correctly classified 89.5% of instances using the SVM method.
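The abstract does not say how the delta-MFCC features are computed. As an illustration only, the following minimal Python sketch uses the commonly used regression formula for delta coefficients, in which each delta is a weighted slope over neighbouring frames. The function names `delta` and `with_deltas`, the `width` parameter, and the edge handling (repeating the first and last frame) are assumptions made for this example and are not taken from the thesis.

```python
def delta(frames, width=2):
    """Delta coefficients for a sequence of MFCC frames.

    frames: list of frames, each a list of coefficients (hypothetical layout).
    width:  number of neighbouring frames on each side (assumed value).
    Uses d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum n^2),
    repeating the first/last frame at the edges (an assumption).
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    deltas = []
    for t in range(n_frames):
        row = []
        for k in range(n_coeffs):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def with_deltas(frames, width=2):
    """Append each frame's delta coefficients to its MFCC coefficients."""
    return [c + d for c, d in zip(frames, delta(frames, width))]
```

Feature vectors built this way (MFCC followed by delta-MFCC) would then be passed to a binary classifier such as an SVM or a pair of GMMs; the number of MFCC coefficients per frame is the parameter the abstract reports as affecting accuracy.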