Probabilistic term variant generator for biomedical terms

This paper presents an algorithm to generate possible variants for biomedical terms. The algorithm assigns each variant a generation probability representing its plausibility, which is potentially useful for query and dictionary expansion. The probabilistic rules for generating variants are learned automatically from raw text using an existing abbreviation extraction technique. Our method therefore requires no linguistic knowledge or labor-intensive natural language resources. We conducted an experiment using 83,142 MEDLINE abstracts for rule induction and 18,930 abstracts for testing. The results indicate that our method will significantly increase the number of retrieved documents for long biomedical terms.
Categories and Subject Descriptors: I.2.7 [Computing Methodologies]: Natural Language Processing - Language Generation; H.3.1 [Information Systems]: Content Analysis and Indexing - Thesauruses
General Terms: Algorithms
Keywords: spelling variation, query expansion, dictionary expansion
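The idea of generating variants with attached probabilities can be illustrated with a minimal sketch. It is not the authors' implementation: the rule table, its probabilities, the function name `generate_variants`, and the choice to score a variant by its single most probable derivation are all assumptions made for illustration. In the paper the rules are learned from text, whereas here they are written by hand.

```python
import heapq

# Hypothetical rewrite rules: (substring, replacement, probability).
# These values are invented for illustration; they are not learned rules.
RULES = [
    ("-", " ", 0.3),
    (" ", "", 0.1),
    ("type 1", "type I", 0.2),
]


def generate_variants(term, rules=RULES, min_prob=0.01, max_variants=10):
    """Return (variant, probability) pairs, most probable first.

    A variant's score is the product of the probabilities of the rules
    used to derive it, taking the best derivation when several exist.
    The unchanged term is given probability 1.0 (an assumption).
    """
    best = {term: 1.0}
    heap = [(-1.0, term)]  # max-heap via negated probabilities
    while heap:
        neg_p, current = heapq.heappop(heap)
        p = -neg_p
        if p < best.get(current, 0.0):
            continue  # stale entry; a better derivation was found already
        for old, new, rule_p in rules:
            # Apply the rule at each position where it matches.
            start = current.find(old)
            while start != -1:
                variant = current[:start] + new + current[start + len(old):]
                q = p * rule_p
                if q >= min_prob and q > best.get(variant, 0.0):
                    best[variant] = q
                    heapq.heappush(heap, (-q, variant))
                start = current.find(old, start + 1)
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:max_variants]


if __name__ == "__main__":
    for variant, prob in generate_variants("NF-kappa B"):
        print(f"{prob:.3f}  {variant}")
```

For query or dictionary expansion, the ranked list could be cut off at a probability threshold (`min_prob` here) so that only plausible variants are added.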
Yoshimasa Tsuruoka, Jun-ichi Tsujii
Added 05 Jul 2010
Updated 05 Jul 2010
Type Conference
Year 2003