We introduce a new method to improve web site text content by identifying the most relevant free text in the web pages. To understand the variations in web page text, we c...
Finding a set of web pages relevant to a user’s information goal is difficult due to the enormous size of the Internet. Search engines can find a set of pages that mat...
We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number...
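To make the link percentage feature concrete, here is a minimal sketch of one way such a value could be computed. The abstract above is cut off before it names the normalizer, so the denominator used here (all non-whitespace text characters outside `script` and `style` elements) is an assumption, not the authors' definition. The names `LinkTextCounter` and `link_percentage` are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class LinkTextCounter(HTMLParser):
    """Counts text characters inside and outside <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0   # > 0 while inside at least one <a>
        self.skip_depth = 0     # > 0 while inside <script> or <style>
        self.link_chars = 0
        self.total_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1
        elif tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth:
            return  # ignore code and stylesheet text
        # Assumption: whitespace is not counted as a text character.
        count = len("".join(data.split()))
        self.total_chars += count
        if self.anchor_depth:
            self.link_chars += count


def link_percentage(html):
    """Percentage of text characters that fall inside links.

    Assumed normalizer: total text characters on the page.
    """
    parser = LinkTextCounter()
    parser.feed(html)
    parser.close()
    if parser.total_chars == 0:
        return 0.0
    return 100.0 * parser.link_chars / parser.total_chars
```

Under this assumed definition, a navigation or index page made up mostly of anchors would score near 100, while an article page with a few inline links would score much lower. That separation is the kind of signal a classifier could threshold or feed into a model as one feature.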
Web page clustering is a technology that puts semantically related web pages into groups and is useful for categorizing, organizing, and refining search results. When clustering ...
Today's Web sites are intricate but not intelligent; while Web navigation is dynamic and idiosyncratic, all too often Web sites are fossils cast in HTML. In response, this pa...