IJDAR
2010

Locating and parsing bibliographic references in HTML medical articles

The set of references that typically appears toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if references do not constitute such a field, they can be useful as a preprocessing step in the automated extraction of other bibliographic data from articles, as well as in computer-assisted indexing of articles. Automation in data extraction and indexing to minimize human labor is key to the affordable creation and maintenance of large bibliographic databases. Extracting the components of references, such as author names, article title, journal name, publication date, and other entities, is therefore a valuable and sometimes necessary task. This paper describes a two-step process that uses statistical machine learning algorithms first to locate the references in HTML medical articles and then to parse them. Reference locating identifies the reference section in an article and then decomposes it into individual referenc...
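
To make the two-step structure described in the abstract concrete, the following is a minimal sketch of a locate-then-parse pipeline. It is not the authors' method: the paper uses statistical machine learning, whereas this sketch substitutes simple hand-written rules. The function names (`locate_references`, `parse_reference`), the "References" heading heuristic, and the single "Authors. Title. Journal. Year" citation pattern are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser


class _TextNodes(HTMLParser):
    """Collect the non-empty text nodes of an HTML document, in order."""

    def __init__(self):
        super().__init__()
        self.nodes = []

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize internal whitespace
        if text:
            self.nodes.append(text)


def locate_references(html):
    """Step 1 (rule-based stand-in): find the reference section and
    return its entries as a list of strings."""
    collector = _TextNodes()
    collector.feed(html)
    collector.close()
    nodes = collector.nodes
    # Assumed heuristic: the section begins at a heading whose text is
    # "References" or "Bibliography" and runs to the end of the page.
    for i, text in enumerate(nodes):
        if text.lower() in ("references", "bibliography"):
            return nodes[i + 1:]
    return []


# Assumed pattern covering one simple style only:
# "Authors. Title. Journal. Year..."
_REF = re.compile(
    r"^(?P<authors>[^.]+)\.\s+"
    r"(?P<title>[^.]+)\.\s+"
    r"(?P<journal>[^.]+)\.\s+"
    r"(?P<year>\d{4})"
)


def parse_reference(ref):
    """Step 2 (rule-based stand-in): split one reference string into
    fields, or return None if it does not fit the assumed pattern."""
    match = _REF.match(ref)
    return match.groupdict() if match else None
```

Calling `locate_references` on an article's HTML and then `parse_reference` on each returned string yields one dictionary per reference, with `authors`, `title`, `journal`, and `year` keys. Fixed rules like these fail as soon as the heading wording or citation style varies, which is the kind of variation a learned model is meant to handle.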
Jie Zou, Daniel X. Le, George R. Thoma
Added 27 Jan 2011
Updated 27 Jan 2011
Type Journal
Year 2010
Where IJDAR