Repositório Digital de Publicações Científicas: In search of reputation assessment: experiences with polarity classification in RepLab 2013


Repositório Digital de Publicações Científicas da Universidade de Évora

Please use this identifier to cite or link to this item: http://hdl.handle.net/10174/10352

Title:	In search of reputation assessment: experiences with polarity classification in RepLab 2013
Authors:	Saias, José
Editors:	Forner, Pamela; Navigli, Roberto; Tufis, Dan
Keywords:	opinion mining; reputation assessment; NLP; Machine Learning
Issue Date:	Sep-2013
Publisher:	clef2013.org
Citation:	José Saias. In search of reputation assessment: Experiences with polarity classification in RepLab 2013. In Pamela Forner, Roberto Navigli, and Dan Tufis, editors, CLEF 2013 Evaluation Labs and Workshop, Online Working Notes: Online Reputation Management (RepLab), Valencia, Spain, September 2013.
Abstract:	The diue system uses a supervised Machine Learning approach for the polarity classification subtask of RepLab. We used the Python NLTK for preprocessing, including file parsing, text analysis and feature extraction. Our best solution is a mixed strategy that combines bag-of-words with a limited set of features based on sentiment lexicons and shallow text analysis. The system begins by applying tokenization and lemmatization. The content of each tweet is then analyzed to obtain 18 features, related to the presence of polarized terms, negation before a polarized expression, and entity references. For the first run, learning and classification were performed with the Decision Tree algorithm from the NLTK framework. In the second run, we used a pipeline of two classifiers. The first classifier applies Naive Bayes to a bag-of-words feature model built from the 1500 most frequent words in the training set. The second classifier uses the features from the first run plus one additional feature holding the output of the first classifier. Our system's best result had an accuracy of 0.54694 and an F-measure of 0.31506.
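The feature design described in the abstract can be illustrated with a minimal sketch. It is not the diue implementation: the lexicons, the feature names, and every function name below are hypothetical, only four of the 18 features are approximated, lemmatization is omitted, and a regular expression stands in for NLTK tokenization. The sketch shows the two feature views (bag-of-words over the most frequent training words, and lexicon-based surface features) and how the first classifier's output can be appended as one more feature for the second classifier.

```python
import re
from collections import Counter

# Hypothetical toy lexicons; the abstract does not name the lexicons actually used.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for NLTK tokenization)."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(tweets, n=1500):
    """Return the n most frequent tokens across the training tweets."""
    counts = Counter(tok for tweet in tweets for tok in tokenize(tweet))
    return [word for word, _ in counts.most_common(n)]


def bow_features(text, vocabulary):
    """Binary bag-of-words features over a fixed vocabulary."""
    tokens = set(tokenize(text))
    return {f"contains({word})": word in tokens for word in vocabulary}


def surface_features(text, entity):
    """Assumed examples of lexicon and surface features, not the original 18."""
    tokens = tokenize(text)
    polar = POSITIVE | NEGATIVE
    # True when a negation word immediately precedes a polarized term.
    negated = any(
        tok in polar and i > 0 and tokens[i - 1] in NEGATIONS
        for i, tok in enumerate(tokens)
    )
    return {
        "has_positive": any(tok in POSITIVE for tok in tokens),
        "has_negative": any(tok in NEGATIVE for tok in tokens),
        "negated_polar": negated,
        "mentions_entity": entity.lower() in text.lower(),
    }


def stacked_features(text, entity, first_stage_label):
    """Surface features plus the first classifier's predicted label."""
    features = surface_features(text, entity)
    features["bow_prediction"] = first_stage_label
    return features
```

Each function returns a plain dictionary, which is the featureset shape that NLTK classifiers such as `NaiveBayesClassifier.train` and `DecisionTreeClassifier.train` accept as `(featureset, label)` pairs. In a two-stage setup of this kind, the label predicted from `bow_features` would be passed as `first_stage_label` when building the input for the second classifier.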
URI:	http://www.clef-initiative.eu/documents/71612/10fcd949-e5f0-4f00-8e01-cbd2a213e147 http://hdl.handle.net/10174/10352
ISBN:	978-88-904810-5-5
Type:	article
Appears in Collections:	INF - Artigos em Livros de Actas/Proceedings

Files in This Item:

File	Description	Size	Format
CLEF2013wn-RepLab-Saias2013.pdf		147.15 kB	Adobe PDF

Serviços de Ciência e Cooperação - Universidade de Évora