Analysis of Provocative Speech During the 2025 DPR Demonstration on X Using the IndoBERTweet Method

Nazhrin Nazarudin Achmad; Yuliant Sibaroni; Sri Suryani Prasetyowati

doi:10.24036/jtip.v19i2.1127

Authors

Nazhrin Nazarudin Achmad, Universitas Telkom
Yuliant Sibaroni, Universitas Telkom
Sri Suryani Prasetyowati, Universitas Telkom

DOI:

https://doi.org/10.24036/jtip.v19i2.1127

Keywords:

Provocative Speech, IndoBERTweet, Text Classification, Natural Language Processing, Social Media Analysis

Abstract

Social media platforms have become important channels for public discussion during political events. During the demonstrations at the DPR (the Indonesian House of Representatives) in August 2025, discussions on X (formerly Twitter) contained many forms of expression, including provocative speech that may influence public opinion and collective behavior. Detecting such content automatically is challenging because social media text is informal, rich in slang, and dependent on context. This study analyzes provocative speech on X using text classification techniques and transformer-based models. A total of 8,899 Indonesian tweets posted during the demonstration period from August 25 to August 31, 2025 were collected with the Tweet Harvest crawling tool. Three annotators manually labeled each tweet as provocative or non-provocative, and the final label was determined by majority vote. Preprocessing consisted of cleaning, normalization, stemming, tokenization, and stopword removal. Five models were evaluated: Multinomial Naïve Bayes, Linear Support Vector Machine, BiLSTM, IndoBERT, and IndoBERTweet. Experimental results show that the transformer-based models outperform the traditional machine learning approaches. The best performance was achieved by IndoBERTweet with a learning rate of 3×10⁻⁵, which reached an accuracy of 93.07% and an F1-score of 91.56%. These findings indicate that domain-specific language models are effective for detecting provocative speech in Indonesian social media discussions of political events.
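The labeling and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy annotations, and the toy predictions are hypothetical, and the F1-score is computed for the provocative class only, which is an assumption because the abstract does not say whether the reported F1 is per-class, macro, or weighted.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by most annotators.

    Assumes an odd number of annotators and two labels, so ties cannot occur.
    """
    label, _count = Counter(votes).most_common(1)[0]
    return label


def accuracy_and_f1(y_true, y_pred, positive="provocative"):
    """Compute accuracy and the F1-score of the positive class (assumed here)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


# Hypothetical votes from three annotators for four tweets.
annotations = [
    ("provocative", "provocative", "non-provocative"),
    ("non-provocative", "non-provocative", "non-provocative"),
    ("provocative", "non-provocative", "non-provocative"),
    ("provocative", "provocative", "provocative"),
]
gold = [majority_label(votes) for votes in annotations]

# Hypothetical classifier output for the same four tweets.
predictions = ["provocative", "non-provocative", "provocative", "provocative"]
accuracy, f1 = accuracy_and_f1(gold, predictions)
print(gold)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```

With three annotators and two classes, every tweet receives a strict majority, so no tie-breaking rule is needed in this setting.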
