DL
2000
Springer

Scalable browsing for large collections: a case study

Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site. Phrases are extracted from the full text using a novel combination of rudimentary syntactic processing and sequential grammar induction techniques. The interface is simple, robust and easy to use. To convey a feeling for the quality of the phrases that are generated automatically, a thesaurus used by the organization responsible for the Web site is studied and its degree of overlap with the phrases in the hierarchy is analyzed. Our ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing: the latter provides an authoritative domain vocabulary and the former augments coverage in areas the thesaurus does not reach.
Added 02 Aug 2010
Updated 02 Aug 2010
Type Conference
Year 2000
Where DL
Authors Gordon W. Paynter, Ian H. Witten, Sally Jo Cunningham, George Buchanan