When scanning documents with a large number of pages, such as books, it is often feasible to provide a minimal number of training samples to personalize the system to compensate fo...
This paper presents a quantitative performance analysis of two different approaches to the lemmatization of Czech text data. The first one is based on manually prepared diction...
The increasing amount and complexity of data in toxicity prediction calls for new approaches based on hybrid intelligent methods for mining the data. This focus is required even mo...
Emilio Benfenati, Paolo Mazzatorta, Daniel Neagu, ...
Although various Logical Story Unit (LSU) segmentation methods based on visual content have been presented in the literature, a common ground for comparison is missing. We pr...
This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of apply...