With the proliferation of the Internet, especially the World Wide Web, numerous information resources have been constructed. The characteristics of information resources on the Internet...
Kangchan Lee, Jae Hong Min, Kishik Park, Kyuchul L...
We present a robust method for gathering relational facts from the Web, based on matching generalized patterns which are automatically learned from seed facts for relations of int...
Ndapandula Nakashole, Martin Theobald, Gerhard Wei...
Standard algorithms for template-based information extraction (IE) require predefined template schemas, and often labeled data, to learn to extract their slot fillers (e.g., an ...
As XML has become an emerging standard for information exchange on the World Wide Web, it has gained attention in database communities to extract information from XML seen as a dat...
Nathalia Devina Widjaya, David Taniar, J. Wenny Ra...
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link-based ranking algorithms such ...