IJMSO
2008

Categorisation of web documents using extraction ontologies

Download www.deg.byu.edu

Abstract: Automatically recognising which HTML documents on the Web contain items of interest for a user is non-trivial. As a step toward solving this problem, we propose an approach based on information-extraction ontologies. Given HTML documents, tables, and forms, our document recognition system extracts expected ontological vocabulary (keywords and keyword phrases) and expected ontological instance data (particular values for ontological concepts). We then use machine-learned rules over this extracted information to determine whether an HTML document contains items of interest. Experimental results show that our ontological approach to categorisation works well, having achieved F-measures above 90% for all applications we tried.
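
The pipeline the abstract describes (extract expected vocabulary and instance values, then apply a rule over the extracted information) can be illustrated with a minimal sketch. This is not the authors' system: the car-advertisement keyword list, the regular expressions, and the names `KEYWORDS`, `VALUE_PATTERNS`, `extract_features`, and `is_of_interest` are all hypothetical, and the hand-written threshold rule merely stands in for the machine-learned rules the abstract mentions. The `f_measure` helper computes the standard harmonic mean of precision and recall.

```python
import re
from dataclasses import dataclass

# Hypothetical mini "extraction ontology" for car advertisements.
# Keyword phrases stand in for expected ontological vocabulary;
# regular expressions stand in for expected instance data.
KEYWORDS = ["for sale", "mileage", "asking price"]
VALUE_PATTERNS = {
    "Price": re.compile(r"\$\s?\d{1,3}(?:,\d{3})*"),
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
}


@dataclass
class Features:
    keyword_hits: int      # total keyword-phrase occurrences
    value_hits: int        # total instance-value matches
    concepts_matched: int  # number of concepts with at least one value


def strip_tags(html: str) -> str:
    """Crude tag removal; a real system would parse tables and forms."""
    return re.sub(r"<[^>]+>", " ", html)


def extract_features(html: str) -> Features:
    text = strip_tags(html).lower()
    keyword_hits = sum(text.count(k) for k in KEYWORDS)
    counts = {name: len(p.findall(text)) for name, p in VALUE_PATTERNS.items()}
    matched = sum(1 for c in counts.values() if c > 0)
    return Features(keyword_hits, sum(counts.values()), matched)


def is_of_interest(f: Features) -> bool:
    # Assumption: a hand-written threshold rule standing in for a
    # machine-learned rule over the extracted information.
    return f.keyword_hits >= 1 and f.concepts_matched >= 2


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


ad = "<html><body><h1>For sale</h1><p>2004 Honda Civic, asking price $4,500</p></body></html>"
other = "<p>Weather today: sunny with light wind.</p>"
print(is_of_interest(extract_features(ad)))
print(is_of_interest(extract_features(other)))
```

In this sketch a document is accepted only when it contains both expected vocabulary and values for at least two concepts, which mirrors the abstract's use of the two kinds of evidence together; the thresholds themselves are arbitrary.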

Li Xu, David W. Embley

HTML Document | IJMSO 2008 | Ontological | Ontological Instance Data

Related Content

» Categorisation by Context

» Classification of Web Documents Using Concept Extraction from Ontologies

» Learning domain ontologies for Web service descriptions: an experiment in bioinformatics

» Extracting and Modeling the Semantic Information Content of Web Documents to Support Seman...

» Extracting Instances of Relations from Web Documents Using Redundancy

» Incremental Ontology-Based Extraction and Alignment in Semistructured Documents

» Boosting Biomedical Entity Extraction by Using Syntactic Patterns for Semantic Relation Di...

» From Software APIs to Web Service Ontologies: A Semi-Automatic Extraction Method

» GlossOnt: A Concept-focused Ontology Building Tool

Post Info

Added	12 Dec 2010
Updated	12 Dec 2010
Type	Journal
Year	2008
Where	IJMSO
Authors	Li Xu, David W. Embley
