Friday, October 07, 2005

Spatial hierarchical set clustering, anyone ?

Yes, that was the best way to describe the problem I have at hand while designing/implementing the new thumbnail browser widget for imgSeek2. I have a collection of images, each belonging to one or more sets, and I want to implement an algorithm to find the best spatial representation of this on a grid.

I have conventional data structures telling me which images belong to each set and which sets an image belongs to. In other words (I mean, pictures), I have this:



and want to draw this:


So, which approach would you recommend ? Keep in mind that drawing thumbnails using a grid like on the 2nd picture is just as hard as drawing them like Venn diagrams on the 1st picture.

It's really hard to google for such an algorithm so I'll have to come up with something on my own. My first thought is to iterate all sets and generate a tree (not a binary one since a set may have more than 2 subsets) as balanced as possible using a combination of the number of images on the set and number of images shared with other sets as a weight to each node.

I found an interesting usenet post and it seems that R-Tree is the name of the spatial data structure I'm looking for.

Any other ideas ?

2 comments:

Rob Lake said...

Have you looked at Kohonen's Self Organizing Maps? You would need a measure of the similarity between each pair of images.

Rob Lake

Rob Lake said...

Further to my prior comment:
Karhunen-Loeve Transforms can produce an efficient set of 2-D basis functions for representing a set of images. The difference between a pair of images might then be calculated as the sum of he differences between the low-order K-L basis functions for each of the images.

Rob Lake