Repositório Digital de Publicações Científicas: Enhancing a Portuguese text classifier using part-of-speech tags

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/2562

Title:	Enhancing a Portuguese text classifier using part-of-speech tags
Authors:	Gonçalves, Teresa; Quaresma, Paulo
Keywords:	machine learning; text classification
Issue Date:	2005
Publisher:	Springer-Verlag
Abstract:	Support Vector Machines have been applied to text classification with great success. In this paper, we apply and evaluate the impact of using part-of-speech tags (nouns, proper nouns, adjectives and verbs) as a feature selection procedure in a European Portuguese written dataset – the Portuguese Attorney General’s Office documents. From the results, we can conclude that verbs alone don’t have enough information to produce good learners. On the other hand, we obtain learners with equivalent performance and a reduced number of features (at least half) if we use specific part-of-speech tags instead of all words.
URI:	http://hdl.handle.net/10174/2562
Type:	article
Appears in Collections:	INF - Artigos em Livros de Actas/Proceedings
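
The feature selection procedure described in the abstract can be illustrated with a minimal sketch: keep only the words that carry selected part-of-speech tags, then build the bag-of-words features from what remains. Everything below is an assumption made for illustration, not the authors' implementation: the toy English documents, the tag names (DET, NOUN, PROPN, ADJ, VERB), and the helper names select_by_pos and bag_of_words are hypothetical, and the tagging step and the Support Vector Machine training are left out.

```python
from collections import Counter

# Hypothetical pre-tagged documents: lists of (token, tag) pairs.
# The tag names are illustrative, not the tag set used in the paper.
TAGGED_DOCS = [
    [("the", "DET"), ("court", "NOUN"), ("rejected", "VERB"),
     ("the", "DET"), ("late", "ADJ"), ("appeal", "NOUN")],
    [("Lisbon", "PROPN"), ("office", "NOUN"), ("issued", "VERB"),
     ("a", "DET"), ("new", "ADJ"), ("opinion", "NOUN")],
]


def select_by_pos(tagged_doc, keep_tags):
    """Keep only the (lowercased) tokens whose tag is in keep_tags."""
    return [tok.lower() for tok, tag in tagged_doc if tag in keep_tags]


def bag_of_words(tagged_docs, keep_tags):
    """Return the sorted vocabulary and one term-count Counter per document."""
    counts = [Counter(select_by_pos(doc, keep_tags)) for doc in tagged_docs]
    vocabulary = sorted(set().union(*counts))
    return vocabulary, counts


# Baseline: every word is a feature.
ALL_TAGS = {"DET", "NOUN", "PROPN", "ADJ", "VERB"}
all_vocab, _ = bag_of_words(TAGGED_DOCS, ALL_TAGS)

# Selection: only nouns and proper nouns are kept as features.
noun_vocab, noun_counts = bag_of_words(TAGGED_DOCS, {"NOUN", "PROPN"})

print(len(all_vocab), len(noun_vocab))  # 11 features versus 5
```

On this toy input the noun-only vocabulary is less than half the size of the full one. Whether a classifier trained on the reduced features performs as well as one trained on all words is the question the paper evaluates; the sketch makes no claim about that.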

Files in This Item:

File	Description	Size	Format
tcg05a-enhancingPT.pdf	Artigo	116.8 kB	Adobe PDF