Our final group presentation was on movie clustering. The group compared the results of clustering the list of movies from IMDB by genre and then by keywords. The clustering was very memory intensive and took quite a while to complete. Here are some of the things the group found:
First, when the group clustered the movies by genre, there were some interesting results. The clusters were much broader and more general. Some of them made sense, while others had no apparent correlation whatsoever. Here are three pictures of what we found after some genre clustering:
This is a snippet of the "Scarface" genre clustering:

This is a snippet of the war movie genre clustering:

This is a snippet of the "Last Action Hero" genre clustering:

Next, the group clustered the same list of movies based on the keywords associated with them rather than their genre. This took much longer but produced much more accurate results. Here are some of the results:
Here are the results of the "Scarface" keyword clustering:

Now, the results of the war movie keyword clustering:

And finally, the results of the "Shanghai Noon" keyword clustering:

As you can see from the pictures, the keyword clustering was much more specific and more accurate, while the genre clustering was very general. One conclusion is that clustering presents an age-old engineering trade-off: more effort for better results, or less effort for okay results.
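To make the genre versus keyword comparison concrete, here is a minimal sketch in Python. It is not the group's actual code: the post does not say which algorithm or tool was used, so this assumes Jaccard similarity over tag sets and a simple single-link grouping with a threshold. The movie names, genres, and keywords are made-up placeholders, not real IMDB data, and the helper names `jaccard` and `cluster` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: shared tags divided by all tags."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(features, threshold):
    """Group titles whose tag sets are similar enough (single-link).

    features: dict mapping a title to its set of tags (genres or keywords).
    Two titles end up in the same cluster if a chain of pairs with
    similarity >= threshold connects them.
    """
    parent = {title: title for title in features}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every pair is compared once, so the work grows quadratically
    # with the number of movies.
    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for title in features:
        groups.setdefault(find(title), []).append(title)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy data: three movies share the same broad genres,
# but only two of them share specific keywords.
genres = {
    "movie_a": {"action", "crime"},
    "movie_b": {"action", "crime"},
    "movie_c": {"action", "crime"},
    "movie_d": {"war", "drama"},
}
keywords = {
    "movie_a": {"drug-lord", "miami", "cocaine"},
    "movie_b": {"drug-lord", "cocaine", "cartel"},
    "movie_c": {"heist", "bank", "getaway"},
    "movie_d": {"trench", "soldier", "battle"},
}

genre_clusters = cluster(genres, threshold=0.5)
keyword_clusters = cluster(keywords, threshold=0.5)

print(genre_clusters)
# [['movie_a', 'movie_b', 'movie_c'], ['movie_d']]
print(keyword_clusters)
# [['movie_a', 'movie_b'], ['movie_c'], ['movie_d']]
```

In this toy example the genre tags lump three movies into one broad cluster, while the keywords split off the one that only shares a genre label. Keyword sets are also larger and more varied than genre sets, and each pair of movies still has to be compared, which is one plausible reason the keyword run would cost more time and memory on a full movie list.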