Please use this identifier to cite or link to this item:
http://hdl.handle.net/10174/2556
Title: | Using Linguistic Information and Machine Learning Techniques to Identify Entities from Juridical Documents |
Authors: | Gonçalves, Teresa; Quaresma, Paulo |
Keywords: | machine learning; named entity recognition |
Issue Date: | 2010 |
Publisher: | Springer-Verlag |
Abstract: | Information extraction from legal documents is an important and open problem. A mixed approach, using linguistic information and machine learning techniques, is described in this paper. In this approach, top-level legal concepts are identified and used for document classification using Support Vector Machines. Named entities, such as locations, organizations, dates, and document references, are identified using semantic information from the output of a natural language parser. This information, legal concepts and named entities, may be used to populate a simple ontology, allowing the enrichment of documents and the creation of high-level legal information retrieval systems.
The proposed methodology was applied to a corpus of legal documents from the EUR-Lex site and evaluated. The results obtained were quite good and indicate that this may be a promising approach to the legal information extraction problem. |
URI: | http://hdl.handle.net/10174/2556 |
ISBN: | 978-3-642-12836-3 |
Type: | article |
Appears in Collections: | INF - Artigos em Livros de Actas/Proceedings |