The objective of this paper is to study the existing methods for unsupervised object recognition and image categorization and propose a model that can learn directly from the output of image search engines, e.g. Google Images, bypassing the need to manually collect large quantities of training data. This model can then be used to refine the quality of the image search, or to search through other sources of images. This integrated scheme has been implemented and optimized to be used in The Semantic Robot Vision Challenge as a new test-bed for research in the areas of image understanding and knowledge retrieval in large unstructured image databases.