LREC
2010

STeP-1: A Set of Fundamental Tools for Persian Text Processing

Many NLP applications need fundamental tools to convert input text into an appropriate form or format and to extract the primary linguistic knowledge of words and sentences. These tools segment text into sentences, words and phrases, check and correct spelling, perform lexical and morphological analysis, POS tagging and so on. Persian is among the languages with complex preprocessing tasks. Differing writing prescriptions, spacing between or within words, character encodings and spellings are some of the difficulties and challenges in converting various texts into a standard one. The lack of fundamental text processing tools such as a morphological analyser (especially for derivational morphology) and a POS tagger is another problem in Persian text processing. This paper introduces a set of fundamental tools for Persian text processing in the STeP-1 package. STeP-1 (Standard Text Preparation for Persian language) performs a combination of tokenization, spell checking,...
Mehrnoush Shamsfard, Hoda Sadat Jafari, Mahdi Ilbeygi
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Mehrnoush Shamsfard, Hoda Sadat Jafari, Mahdi Ilbeygi