Sentiment Analysis of Telkom University using the Long Short-Term Memory and Word2Vec Feature Expansion

Ahmad Alfarel; Hasmawati Hasmawati; Bunyamin Bunyamin

doi:10.24036/jtip.v17i2.889

Ahmad Alfarel Telkom University
Hasmawati Hasmawati Telkom University
Bunyamin Bunyamin Telkom University

Keywords: Sentiment Analysis; Word2Vec; Confusion Matrix; LSTM; Feature Expansion

Abstract

Telkom University is one of Indonesia's top private universities, and branding is an important part of maintaining its reputation. In the digital era, social media has become the main platform for people to express opinions on many topics, including educational institutions. This research analyzes public sentiment towards Telkom University on platform X (formerly Twitter) using the Long Short-Term Memory (LSTM) method with Word2Vec feature expansion. The data consist of 6,627 tweets collected between November 2022 and November 2023. Sentiments were categorized as "Positive," "Negative," and "Neutral." The research stages include data collection, preprocessing, feature extraction using TF-IDF, and feature expansion with Word2Vec. The results were evaluated by calculating accuracy, F1-Score, Precision, and Recall from a confusion matrix. The Negative sentiment class is severely imbalanced relative to the other sentiments. Combining SMOTE oversampling, TF-IDF feature extraction, and Word2Vec feature expansion with LSTM gives the best results: 91% accuracy, 91% F1-Score, 91% Precision, and 91% Recall. These results can help Telkom University understand public perception and manage its brand image more effectively.
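
As an illustration of the evaluation step, the following minimal Python sketch shows one way accuracy, Precision, Recall, and F1-Score might be derived from a three-class confusion matrix. It is not the authors' code: the function name `summarize_confusion_matrix`, the use of macro averaging, and the toy counts are all assumptions made for this example, and the counts are not results from the study.

```python
# Hypothetical sketch: metrics from a confusion matrix (not the authors' code).
# Assumption: cm[i][j] counts samples whose true class is i and predicted class is j.
# Assumption: per-class scores are macro-averaged; the abstract does not state
# which averaging scheme was used.

LABELS = ["Positive", "Negative", "Neutral"]


def summarize_confusion_matrix(cm):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only (rows: true class, columns: predicted).
    toy_cm = [
        [8, 1, 1],
        [2, 6, 2],
        [0, 1, 9],
    ]
    for name, value in summarize_confusion_matrix(toy_cm).items():
        print(f"{name}: {value:.3f}")
```

Macro averaging weights every class equally, which is one reasonable choice when a class such as Negative is heavily imbalanced; a weighted or micro average would give different numbers on the same matrix.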
