Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/2562
Title: | Enhancing a Portuguese text classifier using part-of-speech tags |
Authors: | Gonçalves, Teresa; Quaresma, Paulo |
Keywords: | machine learning; text classification |
Issue Date: | 2005 |
Publisher: | Springer-Verlag |
Abstract: | Support Vector Machines have been applied to text classification with great success. In this paper, we apply and evaluate the impact of using part-of-speech tags (nouns, proper nouns, adjectives and verbs) as a feature selection procedure in a European Portuguese written dataset – the Portuguese Attorney General’s Office documents.
From the results, we can conclude that verbs alone don’t have enough information to produce good learners. On the other hand, we obtain learners with equivalent performance and a reduced number of features (at least half) if we use specific part-of-speech tags instead of all words. |
URI: | http://hdl.handle.net/10174/2562 |
Type: | article |
Appears in Collections: | INF - Artigos em Livros de Actas/Proceedings |