Find “similar things” in Ruby

For several Ruby on Rails projects I had to come up with “similar” results. These are usually the results (videos, products, places, hobbies, etc.) that have the greatest number of tags in common.

Say you are tagging car pictures on a website and have the following:

image1.jpg -> ["honda", "s2000", "convertible", "black"]
image2.jpg -> ["honda", "civic", "blue"]
image3.jpg -> ["lexus", "is300", "blue"]
image4.jpg -> ["s2000", "honda", "convertible", "silver"]
image5.jpg -> ["toyota", "starlet", "black"]

Seeing this, you’d know that image1.jpg and image4.jpg are similar pictures. Or rather, “more similar” than, say, image1.jpg and image3.jpg. For this, I wrote the snippet of code below. It goes in the model file and can be called as “object.similar”. It returns an array of similar “things”, sorted from most similar to least similar (hence the results.reverse at the end).
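
To make “more similar” concrete, here is a minimal sketch in plain Ruby (no Rails) that counts the tags two pictures share using array intersection. The TAGS constant and the shared_tags helper are made up for illustration and are not part of the model code.

```ruby
# Hypothetical in-memory copy of the tagging data above.
TAGS = {
  "image1.jpg" => ["honda", "s2000", "convertible", "black"],
  "image2.jpg" => ["honda", "civic", "blue"],
  "image3.jpg" => ["lexus", "is300", "blue"],
  "image4.jpg" => ["s2000", "honda", "convertible", "silver"],
  "image5.jpg" => ["toyota", "starlet", "black"]
}

# Number of tags two images have in common (Array#& is set intersection).
def shared_tags(a, b)
  (TAGS.fetch(a) & TAGS.fetch(b)).size
end

puts shared_tags("image1.jpg", "image4.jpg") # prints 3
puts shared_tags("image1.jpg", "image3.jpg") # prints 0
```

The more tags two pictures share, the more “similar” they count as here; nothing smarter than that is going on.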

For example:

  img = Image.find(params[:id])
  @similar_images = img.similar.first(10)

This will give you the 10 images “most similar” to img. Well, it gives you the files with the most tags in common.

def similar
  tags = self.tags
  results = []
  tags.each do |tag|
    # assumes Tag has an `images` association; or tag.things, tag.products, ...
    results = results + tag.images
  end

  # make array into hash: item => number of shared tags
  h = {}
  results.each do |r|
    h[r] = h[r].to_i + 1
  end

  # sort on values (ascending)
  tmp = h.sort {|a,b| a[1]<=>b[1]}
  results = []
  tmp.each do |t|
    results << t[0]
  end
  results.reverse # return all items, products, ... most similar first
end
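
If you want to try the counting outside of Rails, the same idea can be sketched with a plain hash. Everything here is an assumption for illustration: IMAGES_BY_TAG stands in for the tag association, similar_to is a made-up name, and unlike the method above this version leaves the image itself out of the results and breaks ties by file name.

```ruby
# Hypothetical stand-in for the tags table: tag name => images carrying it.
IMAGES_BY_TAG = {
  "honda"       => ["image1.jpg", "image2.jpg", "image4.jpg"],
  "s2000"       => ["image1.jpg", "image4.jpg"],
  "convertible" => ["image1.jpg", "image4.jpg"],
  "black"       => ["image1.jpg", "image5.jpg"],
  "blue"        => ["image2.jpg", "image3.jpg"]
}

# Rank every image that shares a tag with `image`, most shared tags first.
def similar_to(image, tags)
  counts = Hash.new(0) # missing keys start at 0
  tags.each do |tag|
    IMAGES_BY_TAG.fetch(tag, []).each { |other| counts[other] += 1 }
  end
  counts.delete(image) # assumption: skip the image itself
  # sort by descending count, then by name so ties come out in a fixed order
  counts.sort_by { |name, count| [-count, name] }.map(&:first)
end

p similar_to("image1.jpg", ["honda", "s2000", "convertible", "black"])
# prints ["image4.jpg", "image2.jpg", "image5.jpg"]
```

Sorting on the negated count avoids the separate reverse step, at the cost of being a little less obvious to read.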

This was written for a new project that is coming up, and it will also be used for better "similarity matching" elsewhere, though in that case we also had to sort on distance (for vicinity).




