The purpose of this paper is threefold. First, we study the evolution of the web based on data available from an earlier snapshot of the web and compare the results with those predicted in [2]. Secondly, we establish whether the WT10G dataset, a popular benchmark for the development and evaluation of internet based applications is appropriate for the tasks. Finally, is there a need for a collection of a new dataset for such purposes. The findings are that the appropriateness of using the popular WT10G dataset in recent Internetbased experiments is questionable and that there is a need for a new collection of dataset for development and evaluation purposes of algorithms related to Internet search engine developments. Categories and Subject Descriptors: H.5.4 Information Interfaces and Presentation: Hypertext/Hypermedia. H.4.m Information Systems: Miscellaneous. General Terms: Experimentation, Verification.