Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/2557
Title: | Using IR techniques to improve Automated Text Classification |
Authors: | Gonçalves, Teresa; Quaresma, Paulo |
Keywords: | machine learning; text classification |
Issue Date: | 2004 |
Publisher: | Springer-Verlag |
Abstract: | This paper performs a study on the pre-processing phase of the automated text classification problem. We use the linear Support Vector Machine paradigm applied to datasets written in the English and the European Portuguese languages – the Reuters and the Portuguese Attorney General’s Office datasets, respectively.
The study can be seen as a search for the best document representation along three different axes: feature reduction (using linguistic information), feature selection (using word frequencies) and term weighting (using information retrieval measures). |
URI: | http://hdl.handle.net/10174/2557 |
Type: | article |
Appears in Collections: | INF - Artigos em Livros de Actas/Proceedings |
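
The three axes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which linguistic information, frequency threshold or weighting measure the paper uses, so the stop-word list, the `min_count` threshold, the choice of normalized tf-idf and all function names below are assumptions made for illustration. The classification step (a linear Support Vector Machine in the paper) is left out; the vectors returned here are the kind of input such a classifier would take.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list standing in for "linguistic information";
# the abstract does not specify what the paper actually uses.
STOP_WORDS = {"the", "a", "of", "in", "and", "to", "is"}


def reduce_features(text):
    """Axis 1, feature reduction: lowercase, tokenize, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def select_features(docs, min_count=2):
    """Axis 2, feature selection: keep terms occurring at least min_count times."""
    counts = Counter(term for doc in docs for term in doc)
    return sorted(t for t, n in counts.items() if n >= min_count)


def weight_terms(docs, vocab):
    """Axis 3, term weighting: tf-idf vectors, each scaled to unit length."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * math.log(n_docs / df[t]) for t in vocab]
        norm = math.sqrt(sum(w * w for w in vec))
        # Documents with no selected term keep an all-zero vector.
        vectors.append([w / norm for w in vec] if norm > 0 else vec)
    return vectors


def represent(texts, min_count=2):
    """Run the three steps in order and return (vocabulary, vectors)."""
    docs = [reduce_features(t) for t in texts]
    vocab = select_features(docs, min_count)
    return vocab, weight_terms(docs, vocab)


if __name__ == "__main__":
    texts = [
        "The court ruled in favour of the appeal",
        "The appeal was rejected by the court",
        "Oil prices rose sharply",
    ]
    vocab, vectors = represent(texts)
    print(vocab)
    for vec in vectors:
        print(vec)
```

Each function corresponds to one axis, so alternatives (for example lemmatization in place of stop-word removal, or a different weighting measure) can be swapped in independently and compared.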