Okay, so this is a *very* long overdue post about my project. I have an hour before lectures, and I’m running a couple of computer jobs that will take a while. (edit: about half an hour)
As some of you will know, my 4th year (MEng) project is all about music visualisation. The aim is to create a system that takes MP3 files and turns them into thumbnail images. Songs which sound similar should also *look* similar, so that the thumbnails can act as a visual memory aid for DJs.
Right at the moment, I have a “baseline” system, which produces images like this. Looking at the images from the baseline system, there don’t seem to be many similar-looking images (if you see any other than The Fox and Christopher Columbus, post a comment below).
So what’s going wrong?
There are a lot of configurable parameters in the system, so it might just be that it needs tuning. If you want to compare the performance of the system with a few parameter changes, try exploring the matrix found here. It might also be that I’m trying to pack too much information into each (very small) image. Currently I’m trying to squeeze 20 independent (scalar) pieces of information into each 20×20 image. What I need to try next is cutting down to the 3 to 10 pieces of information which are actually relevant, and making 50×50 images. I think I will also need to gather different pieces of information to include (initially extracted by hand, and then automatically extracted).
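To make the "too much information per image" worry concrete, here is a rough Python sketch. It is not the layout my baseline system actually uses: the function name `features_to_thumbnail`, the one-horizontal-band-per-feature layout and the assumption that every feature is already scaled to the range 0 to 1 are all made up for illustration.

```python
def features_to_thumbnail(features, size=50):
    """Return a size x size grid of grey levels (0-255).

    Hypothetical layout: each feature gets one horizontal band,
    and all bands are as close to equal height as possible.
    Features are assumed to be pre-scaled to [0, 1].
    """
    if not features:
        raise ValueError("need at least one feature")
    n = len(features)
    rows = []
    for y in range(size):
        # Map the row index onto a feature index.
        idx = min(y * n // size, n - 1)
        # Clamp to [0, 1] in case a feature is slightly out of range.
        value = min(max(features[idx], 0.0), 1.0)
        grey = int(value * 255 + 0.5)  # round to nearest grey level
        rows.append([grey] * size)
    return rows


def pixels_per_feature(n_features, size):
    """Average number of pixels available to each feature."""
    return size * size / n_features


# 20 features in a 20x20 image: only 20 pixels each.
print(pixels_per_feature(20, 20))
# 5 features in a 50x50 image: 500 pixels each.
print(pixels_per_feature(5, 50))

thumbnail = features_to_thumbnail([0.0, 1.0, 0.5, 0.25, 0.75], size=50)
```

With this layout, 20 features in a 20×20 image means each band is a single row of pixels, whereas 5 features in a 50×50 image gives each one a band 10 rows high, which should be much easier to pick out by eye.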
Also, the human eye is not very good at comparing brightness. I will try adding fake colour to the images, and see whether a different colour map performs better.
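As a sketch of what "adding fake colour" could look like, the snippet below maps a grey level to an RGB colour by interpolating linearly between a list of colour stops. The function name `apply_colour_map` and the blue-green-red stops are assumptions chosen for illustration, not the colour map I will necessarily end up using.

```python
def apply_colour_map(grey, stops):
    """Map a grey level (0-255) to an (r, g, b) tuple.

    `stops` is a list of at least two (r, g, b) colours spaced
    evenly along the grey scale; values in between are linearly
    interpolated. This is an illustrative colour map only.
    """
    if len(stops) < 2:
        raise ValueError("need at least two colour stops")
    g = min(max(grey, 0), 255)               # clamp to the valid range
    position = g / 255 * (len(stops) - 1)    # position along the stops
    lower = min(int(position), len(stops) - 2)
    frac = position - lower                  # 0.0 at lower stop, 1.0 at upper
    a, b = stops[lower], stops[lower + 1]
    return tuple(int(a[c] + (b[c] - a[c]) * frac + 0.5) for c in range(3))


# Hypothetical colour map: dark values blue, mid values green, bright values red.
BLUE_GREEN_RED = [(0, 0, 255), (0, 255, 0), (255, 0, 0)]

# Recolour one row of a greyscale thumbnail.
grey_row = [0, 64, 128, 191, 255]
colour_row = [apply_colour_map(g, BLUE_GREEN_RED) for g in grey_row]
```

Swapping in a different list of stops is then enough to compare colour maps against each other on the same set of thumbnails.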
I’ll post something else this afternoon/evening. Have to run to lecture now.