ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Ensemble recognition in folk song recordings

Miha Krašovec (2011) Ensemble recognition in folk song recordings. EngD thesis.



    A growing number of researchers are exploring automatic recognition of musical instruments in audio recordings, but the solutions presented so far cannot match the human ability to recognize instruments. This is especially true for polyphonic recordings. Algorithms participating in the MIREX competition usually achieve 70 to 75 percent recognition accuracy. In my thesis I present automatic recognition of musical instrument groups (ensembles), a problem very similar to instrument recognition. The problem was simplified by limiting the recordings to Slovene folk music. Audio recordings are first divided into 10-second segments. For each segment, nine audio features are calculated: MFCC, tempo, frequency of note onsets, zero-crossing rate, spectral roll-off, sound brightness, spectral irregularity, spectral centroid and spectral flatness. Features were extracted with MIRToolbox (a MATLAB toolbox), in which all of the most commonly used algorithms are already implemented. The LMT (logistic model tree) machine learning algorithm, as implemented in Weka, is then applied to these features to classify audio segments into five classes (solo accordion, Bela krajina, Prekmurje, Resian music and Resian singing). The method performed well in three tests. In 10-fold cross-validation on the training data, 94% of recordings were correctly classified. For the next test I used recordings that belonged to one of the five classes; classification accuracy was 83%. In the last test, unedited field recordings were used, and 86% of segments were correctly classified. To conclude, I suggest a few possible improvements to the algorithm that could increase its accuracy and robustness.
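    The segmentation and feature-extraction steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis code, which used MIRToolbox in MATLAB; it is a NumPy approximation under stated assumptions. The function names (split_segments, zero_crossing_rate, spectral_centroid, feature_matrix) are hypothetical, only two of the nine features are shown, and the simplified definitions may differ from the MIRToolbox implementations.

```python
import numpy as np

def split_segments(signal, sample_rate, seconds=10.0):
    """Split a mono signal into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than `seconds` is dropped.
    """
    seg_len = int(round(seconds * sample_rate))
    count = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(count)]

def zero_crossing_rate(segment):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(segment)
    return float(np.mean(signs[1:] != signs[:-1]))

def spectral_centroid(segment, sample_rate):
    """Magnitude-weighted mean frequency of the segment's spectrum, in Hz."""
    mags = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sample_rate)
    total = mags.sum()
    return float((freqs * mags).sum() / total) if total > 0 else 0.0

def feature_matrix(signal, sample_rate, seconds=10.0):
    """Return one row per segment: [zero-crossing rate, spectral centroid]."""
    segments = split_segments(signal, sample_rate, seconds)
    return np.array([[zero_crossing_rate(s), spectral_centroid(s, sample_rate)]
                     for s in segments])

# Usage: a 25-second synthetic 440 Hz tone yields two full 10-second segments.
sample_rate = 8000
t = np.arange(25 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
features = feature_matrix(tone, sample_rate)
```

    Each row of the resulting matrix corresponds to one segment and could be passed, together with a class label, to a classifier such as the LMT implementation in Weka that the abstract mentions.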

    Item Type: Thesis (EngD thesis)
    Keywords: musical instrument recognition, feature extraction, machine learning, sound properties
    Number of Pages: 34
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    doc. dr. Matija Marolt | 271 | UNSPECIFIED
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=00008396884)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 1357
    Date Deposited: 18 May 2011 10:49
    Last Modified: 13 Aug 2011 00:39
    URI: http://eprints.fri.uni-lj.si/id/eprint/1357
