Supervised Sentiment Analysis of Science Topics: Developing a Training Set of Tweets in Spanish

Sánchez Holgado, Patricia; Arcila Calderón, Carlos

doi:10.4018/JITR.2020070105

Title

Supervised Sentiment Analysis of Science Topics: Developing a Training Set of Tweets in Spanish

Author(s)

Sánchez Holgado, Patricia

Arcila Calderón, Carlos

Keywords

Big Data

Machine Learning

Science Communication

Sentiment Analysis

Social Media

Spanish

Twitter

UNESCO classification

63 Sociology

6308 Social Communications

Publication date

2020-07-01

Publisher

IGI Global Scientific Publishing

Citation

Sánchez-Holgado, P., & Arcila-Calderón, C. (2020). Supervised sentiment analysis of science topics: Developing a training set of tweets in Spanish. Journal of Information Technology Research, 13(3), 80-94. https://doi.org/10.4018/JITR.2020070105

Abstract

Twitter is one of the largest sources of real-time information on the Internet and is continuously fed by millions of users around the world. Each of these users publishes text messages with their opinions, concerns, information, or simply their daily happenings. Analyzing massive data on the network is a challenge, and so is finding ways to understand what that data can offer today in terms of knowledge of society and the market. The science communication sector is still discovering what web 2.0 and social networks can offer to reach all audiences. This article develops a classification model for messages posted on Twitter, on science topics, in Spanish, using machine learning techniques. Training this type of model requires the creation of a specific corpus in Spanish for the subject of science, which is one of the most laborious tasks. The classifier is able to predict the sentiment of a message in real time on Twitter, with a confidence interval greater than 80%. In evaluation, it reaches 72% accuracy.
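
The workflow the abstract describes (label a corpus, train a classifier, evaluate its accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the algorithm, so a multinomial Naive Bayes classifier with add-one smoothing is assumed here, and the tiny Spanish corpus, the labels, and the function names `tokenize`, `train`, `predict`, and `accuracy` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace. Real tweets would need more
    # cleaning (URLs, mentions, hashtags, emoji); omitted in this sketch.
    return text.lower().split()


def train(examples):
    # examples: list of (text, label) pairs, i.e. the labeled training set.
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    # Pick the label with the highest log posterior under Naive Bayes
    # (assumed algorithm), using add-one (Laplace) smoothing.
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    # Fraction of held-out examples whose predicted label matches.
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical toy corpus, invented for illustration only.
TRAIN = [
    ("me encanta este avance científico", "positive"),
    ("gran descubrimiento en astronomía", "positive"),
    ("excelente noticia sobre la vacuna", "positive"),
    ("qué mala noticia sobre el clima", "negative"),
    ("terrible recorte en investigación", "negative"),
    ("odio la falta de fondos para ciencia", "negative"),
]
TEST = [
    ("gran avance en la vacuna", "positive"),
    ("terrible noticia sobre fondos", "negative"),
]

model = train(TRAIN)
```

Calling `predict(model, "gran avance en la vacuna")` returns a label for a new message, and `accuracy(model, TEST)` scores the model on held-out examples. A corpus this small says nothing about real performance; the 72% accuracy reported in the abstract refers to the authors' own corpus and model.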

URI

https://hdl.handle.net/10366/160791

ISSN

1938-7857

DOI

10.4018/JITR.2020070105

Publisher's version

https://doi.org/10.4018/JITR.2020070105

Appears in collections