The Effect of Class Balancing Methods on the Accuracy of Sentiment Analysis of Indonesian-Language Tweets

Ivan Nathaniel Husada; Hapnes Toba

doi:10.28932/jutisi.v6i2.2743


Published: Aug 11, 2020


Ivan Nathaniel Husada

Universitas Kristen Maranatha

Hapnes Toba

Abstract

Internet access is becoming ever easier to obtain, and as a result almost all internet users have social media accounts. Users rely on social media to voice opinions, raise complaints, and discuss topics with other users. Among the many existing platforms, Twitter is one of the most popular for these activities, and this user activity is what makes sentiment analysis on Twitter possible. In this research, the authors explore sentiment analysis of tweets from Indonesian Twitter users using bag-of-words and Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. The collected data is imbalanced, so a method is needed to address the imbalance. The method used is a resampling approach that combines oversampling and undersampling strategies. Sentiment analysis accuracy with Naïve Bayes and neural networks is compared before and after resampling the input data. The Naïve Bayes methods used are Multinomial Naïve Bayes and Complement Naïve Bayes, while the neural network architectures used for comparison are Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Convolutional Neural Networks, and a combination of Convolutional Neural Networks and Long Short-Term Memory. The experiments yield the following F1 scores (the harmonic mean of precision and recall) for the sentiment analysis models: Multinomial Naïve Bayes 55.48, Complement Naïve Bayes 51.33, Recurrent Neural Network 75.70, Long Short-Term Memory 78.36, Gated Recurrent Unit 77.96, Convolutional Neural Network 76.12, and the combination of Convolutional Neural Networks and Long Short-Term Memory 81.14.
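The combined over- and undersampling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `resample_balanced`, the choice of the mean class size as the common target, and the toy tweets are all assumptions made for illustration. Classes above the target are randomly undersampled, and classes below it are randomly oversampled with replacement.

```python
import random
from collections import Counter, defaultdict


def resample_balanced(texts, labels, target=None, seed=0):
    """Resample every class to exactly `target` examples.

    Classes with at least `target` examples are undersampled (drawn without
    replacement); smaller classes are oversampled (extra copies are drawn
    with replacement).
    """
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    if target is None:
        # Assumption: use the (rounded) mean class size as the common target.
        target = round(len(texts) / len(by_class))

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        if len(items) >= target:
            # Undersample: keep a random subset of the class.
            chosen = rng.sample(items, target)
        else:
            # Oversample: keep every original and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        out_texts.extend(chosen)
        out_labels.extend([label] * target)
    return out_texts, out_labels


# Hypothetical toy data: 6 positive, 2 negative, and 1 neutral tweet.
texts = [f"positive tweet {i}" for i in range(6)]
texts += [f"negative tweet {i}" for i in range(2)]
texts += ["neutral tweet 0"]
labels = ["positive"] * 6 + ["negative"] * 2 + ["neutral"]

balanced_texts, balanced_labels = resample_balanced(texts, labels)
print(Counter(balanced_labels))
```

With nine toy tweets across three classes, the target is three examples per class, so the positive class shrinks and the other two grow. In a real experiment, resampling would normally be applied to the training split only, so that duplicated tweets do not leak into the evaluation data.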


How to Cite

[1]

I. N. Husada and H. Toba, “Pengaruh Metode Penyeimbangan Kelas Terhadap Tingkat Akurasi Analisis Sentimen pada Tweets Berbahasa Indonesia”, JuTISI, vol. 6, no. 2, Aug. 2020.

Issue

Vol 6 No 2 (2020): JuTISI

Section

Articles

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium.

