The Open University

Enhancing music information retrieval by incorporating image-based local features

Kaliciak, Leszek; Horsburgh, Ben; Song, Dawei; Wiratunga, Nirmalie and Pan, Jeff (2012). Enhancing music information retrieval by incorporating image-based local features. In: Eighth Asia Information Retrieval Societies Conference (AIRS 2012), 17-19 December 2012, Tianjin, China.

Full text available as:
PDF (Accepted Manuscript, 491Kb)


This paper presents a novel approach to Music Information Retrieval. Having represented the music tracks as two-dimensional images, we apply the "bag of visual words" method from visual IR to classify the songs into 19 genres. By switching to the visual domain, we can abstract from musical concepts such as melody, timbre and rhythm. We obtained a classification accuracy of 46% (against a theoretical baseline of about 5% for random classification), which is comparable with existing state-of-the-art approaches. Moreover, the novel features characterize different properties of the signal than standard methods do. Therefore, combining the two should further improve the performance of existing techniques.
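As a rough illustration of the "bag of visual words" idea mentioned in the abstract, the following minimal Python sketch clusters local image patches with K-means and describes each image by a histogram of its nearest cluster centres. It is not the authors' implementation. The random toy images stand in for the two-dimensional track representations, raw patch intensities stand in for the paper's local features, and all function names and parameters (`extract_patches`, `kmeans`, `bovw_histogram`, patch size 4, codebook size 5) are hypothetical choices made for this example.

```python
import random

def extract_patches(image, size):
    """Cut a 2-D image (list of rows) into non-overlapping size x size patches.
    Each patch is flattened into one feature vector (an assumed local feature)."""
    patches = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            patches.append([image[r + i][c + j]
                            for i in range(size) for j in range(size)])
    return patches

def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, cen)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(vectors, k, iterations=10, seed=0):
    """Plain K-means; the resulting centroids act as the visual vocabulary."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(v, centroids)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def bovw_histogram(image, centroids, size):
    """Normalised count of how often each visual word occurs in the image."""
    patches = extract_patches(image, size)
    counts = [0] * len(centroids)
    for p in patches:
        counts[nearest(p, centroids)] += 1
    return [c / len(patches) for c in counts]

def toy_image(seed, rows=16, cols=16):
    """Random stand-in for a 2-D image of a music track (assumption)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(cols)] for _ in range(rows)]

images = [toy_image(s) for s in range(4)]
all_patches = [p for img in images for p in extract_patches(img, 4)]
codebook = kmeans(all_patches, k=5)
hist = bovw_histogram(images[0], codebook, 4)
```

Each histogram such as `hist` could then be passed to any standard classifier to predict a genre label. The choice of classifier is left open here because the abstract does not name one.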

Item Type: Conference Item
Copyright Holders: 2012 Springer
Keywords: local features; co-occurrence matrix; colour moments; K-means algorithm; Fourier transform
Academic Unit/Department: Mathematics, Computing and Technology
Mathematics, Computing and Technology > Computing & Communications
Item ID: 34651
Depositing User: Dawei Song
Date Deposited: 16 Oct 2012 12:27
Last Modified: 21 Jan 2016 23:20