Please use this identifier to cite or link to this item:

Title: Is linguistic information relevant for the classification of legal texts?
Authors: Gonçalves, Teresa
Quaresma, Paulo
Keywords: Text classification
Issue Date: 2005
Publisher: ACM
Abstract: Text classification is an important task in the legal domain. In fact, most of the legal information is stored as text in a quite unstructured format and it is important to be able to automatically classify these texts into a predefined set of concepts. Support Vector Machines (SVM), a machine learning al- gorithm, has shown to be a good classifier for text bases [Joachims, 2002]. In this paper, SVMs are applied to the classification of European Portuguese legal texts – the Por- tuguese Attorney General’s Office Decisions – and the rele- vance of linguistic information in this domain, namely lem- matisation and part-of-speech tags, is evaluated. The obtained results show that some linguistic information (namely, lemmatisation and the part-of-speech tags) can be successfully used to improve the classification results and, simultaneously, to decrease the number of features needed by the learning algorithm.
ISBN: ISBN 1-59593-081-7
Type: article
Appears in Collections:INF - Artigos em Livros de Actas/Proceedings

Files in This Item:

File Description SizeFormat
tcg05b-linguistic.pdfArtigo195.51 kBAdobe PDFView/Open
FacebookTwitterDeliciousLinkedInDiggGoogle BookmarksMySpaceOrkut
Formato BibTex mendeley Endnote Logotipo do DeGóis 

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.


Dspace Dspace
DSpace Software, version 1.6.2 Copyright © 2002-2008 MIT and Hewlett-Packard - Feedback
UEvora B-On Curriculum DeGois