Sciweavers

2677 search results - page 205 / 536
» Extracting Structured Data from Web Pages
CIDR
2009
15 years 4 months ago
The Case for a Structured Approach to Managing Unstructured Data
The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. And yet we are risking let...
AnHai Doan, Jeffrey F. Naughton, Akanksha Baid, Xi...
WWW
2010
ACM
15 years 3 months ago
Talking about data: sharing richly structured information through blogs and wikis
Abstract. Several projects have brought rich data semantics to collaborative wikis, but blogging platforms remain primarily limited to text. As blogs comprise a significant portion...
Edward Benson, Adam Marcus 0002, Fabian Howahl, Da...
WSDM
2010
ACM
15 years 10 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
ADVIS
2006
Springer
15 years 9 months ago
Structural and Event Based Multimodal Video Data Modeling
Investments in multimedia technology enable us to store many more reflections of the real world in the digital world as videos. By recording videos about real-world entities, we carry...
Hakan Öztarak, Adnan Yazici
EMNLP
2008
15 years 5 months ago
Mining and Modeling Relations between Formal and Informal Chinese Phrases from Web Corpora
We present a novel method for discovering and modeling the relationship between informal Chinese expressions (including colloquialisms and instant-messaging slang) and their forma...
Zhifei Li, David Yarowsky