Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/2562

Title: Enhancing a Portuguese text classifier using part-of-speech tags
Authors: Gonçalves, Teresa
Quaresma, Paulo
Keywords: machine learning
Text classification
Issue Date: 2005
Publisher: Springer-Verlag
Abstract: Support Vector Machines have been applied to text classification with great success. In this paper, we apply and evaluate the impact of using part-of- speech tags (nouns, proper nouns, adjectives and verbs) as a feature selection procedure in a European Portuguese written dataset – the Portuguese Attorney General’s Office documents. From the results, we can conclude that verbs alone don’t have enough informa- tion to produce good learners. On the other hand, we obtain learners with equiva- lent performance and a reduced number of features (at least half) if we use specific part-of-speech tags instead of all words.
URI: http://hdl.handle.net/10174/2562
Type: article
Appears in Collections:INF - Artigos em Livros de Actas/Proceedings

Files in This Item:

File Description SizeFormat
tcg05a-enhancingPT.pdfArtigo116.8 kBAdobe PDFView/Open
FacebookTwitterDeliciousLinkedInDiggGoogle BookmarksMySpaceOrkut
Formato BibTex mendeley Endnote Logotipo do DeGóis 

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 

Dspace Dspace
DSpace Software, version 1.6.2 Copyright © 2002-2008 MIT and Hewlett-Packard - Feedback
UEvora B-On Curriculum DeGois