    Gredos. Document repository of the Universidad de Salamanca
    Title
    Influence of Pre-Processing Strategies on the Performance of ML Classifiers Exploiting TF-IDF and BOW Features
    Author(s)
    Pimpalkar, Amit Purushottam
    Retna Raj, R. Jeberson
    Keywords
    Pre-Processing
    Sentiment Analysis
    BOW
    TF-IDF
    Evaluation Metrics
    Twitter Classifier
    Twitter
    Publication date
    2020-06-18
    Publisher
    Ediciones Universidad de Salamanca (Spain)
    Citation
    ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal, 9 (2020)
    Abstract
    Data analytics and its associated applications have recently become important fields of study. A current concern for researchers is the massive amount of data produced every minute as people constantly share thoughts and opinions about the things that matter to them. Social media data, however, remains unstructured, scattered and hard to handle, and a strong foundation is needed before it can be used as valuable information on a particular topic. Processing such unstructured data is challenging because of noise, co-relevance, emoticons, folksonomies and slang, and therefore requires proper pre-processing before the right sentiments can be extracted. The datasets are extracted from Kaggle and Twitter, pre-processing is performed using NLTK and Scikit-learn, and feature selection and extraction are carried out with the Bag of Words (BOW) and Term Frequency-Inverse Document Frequency (TF-IDF) schemes.
    For polarity identification, we evaluated five Machine Learning (ML) algorithms: Multinomial Naive Bayes (MNB), Logistic Regression (LR), Decision Trees (DT), XGBoost (XGB) and Support Vector Machines (SVM). We performed a comparative analysis of these algorithms to decide which works best for the given dataset in terms of accuracy, precision, recall and F1-score. We assess the effects of various pre-processing techniques on two datasets, one domain-specific and the other not. The SVM classifier outperformed the other classifiers, reaching 73.12% accuracy and 94.91% precision. The research also highlights that the selection and representation of features, together with the various pre-processing techniques, have a positive impact on classification performance. Overall, the results indicate an improvement in sentiment classification, and the pre-processing approaches improve the efficiency of the classifiers.
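The abstract names a pipeline of pre-processing, BOW and TF-IDF feature extraction, and evaluation by accuracy, precision, recall and F1-score. The following is a minimal sketch of those steps in plain Python, not the authors' implementation (the record states they used NLTK and Scikit-learn). The stop-word list, the cleaning rules, the IDF variant `ln(N / df) + 1`, and all function names here are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "i", "is", "it", "of", "the", "this", "to"}


def preprocess(text):
    """Lowercase, strip URLs and @mentions, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return (sorted vocabulary, raw count vectors) for tokenized docs."""
    vocab = sorted({t for doc in docs for t in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight raw counts by idf = ln(N / df) + 1 (one common variant)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) + 1.0 for d in df]
    weighted = [[c * w for c, w in zip(row, idf)] for row in counts]
    return vocab, weighted


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example tweets, not taken from the paper's datasets.
tweets = [
    "I love this phone! http://t.co/x",
    "@bob this phone is terrible",
    "love love the camera",
]
tokenized = [preprocess(t) for t in tweets]
vocab, bow = bag_of_words(tokenized)
_, weights = tf_idf(tokenized)
scores = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
```

In this sketch the BOW vectors hold raw term counts, while the TF-IDF vectors down-weight terms that occur in many documents. Either matrix could then be passed to a classifier such as those listed in the abstract, and `binary_metrics` illustrates how the four reported measures relate to true positives, false positives and false negatives.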
    URI
    https://hdl.handle.net/10366/146091
    ISSN
    2255-2863
    Appears in collections
    • ADCAIJ, Vol.9, n.2 [9]
    Files in this item
    Name:
    Influence_of_Pre-Processing_Strategies_o.pdf
    Size:
    474.8Kb
    Format:
    Adobe PDF
    Universidad de Salamanca
    LEGAL NOTICE AND PRIVACY POLICY
    2024 © UNIVERSIDAD DE SALAMANCA
     