Comparative Performance of Indonesian Stemming: PL/SQL Implementation of Nazief and Adriani Algorithm versus Sastrawi Library
DOI:
https://doi.org/10.24036/jtip.v19i2.1098
Keywords:
Accuracy, Computing Time, Nazief and Adriani Algorithm, PL/SQL, Sastrawi Library
Abstract
Text preprocessing in Indonesian applications commonly relied on external libraries such as Sastrawi. However, performing this task outside the database layer often introduced significant latency, because data had to travel between the application and the database server. This study proposed and evaluated a native stemming mechanism that implemented the Nazief and Adriani algorithm directly within an Oracle PL/SQL environment. The primary objective was to determine whether in-database processing could outperform the standard application-layer approach. The assessment compared the PL/SQL implementation against the Python-based Sastrawi library on a dataset of 54,715 words sourced from the Kamus Besar Bahasa Indonesia (KBBI). Two metrics were measured: stemming accuracy and total execution time. The PL/SQL method achieved an accuracy of 96.82%, slightly higher than the 96.58% obtained by Sastrawi. The stored procedure implementation was also considerably faster, completing the process in 602.22 seconds, whereas the baseline method required 1,259.28 seconds. It was concluded that migrating the stemming logic into the database layer reduced execution time by approximately 52.18% with no loss of accuracy. These findings suggested that a native database implementation was a suitable option for systems requiring high-performance text processing.
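The reported time reduction can be checked from the two execution times given in the abstract. The short Python sketch below recomputes it and derives a few related quantities. The speedup factor and per-word averages are not stated in the abstract; they are derived here for illustration only, assuming the reported totals cover all 54,715 words. The function and variable names are hypothetical.

```python
# Figures as reported in the abstract.
TOTAL_WORDS = 54_715
PLSQL_SECONDS = 602.22        # in-database Nazief and Adriani implementation
SASTRAWI_SECONDS = 1_259.28   # Python-based Sastrawi baseline
PLSQL_ACCURACY = 96.82        # percent
SASTRAWI_ACCURACY = 96.58     # percent


def time_reduction_percent(baseline: float, candidate: float) -> float:
    """Relative reduction in execution time, as a percentage of the baseline."""
    return (baseline - candidate) / baseline * 100


reduction = time_reduction_percent(SASTRAWI_SECONDS, PLSQL_SECONDS)

# Derived values (not stated in the abstract; illustrative only).
speedup = SASTRAWI_SECONDS / PLSQL_SECONDS
accuracy_gap = PLSQL_ACCURACY - SASTRAWI_ACCURACY
plsql_ms_per_word = PLSQL_SECONDS / TOTAL_WORDS * 1000
sastrawi_ms_per_word = SASTRAWI_SECONDS / TOTAL_WORDS * 1000

print(f"Time reduction:       {reduction:.2f}%")
print(f"Speedup factor:       {speedup:.2f}x")
print(f"Accuracy difference:  {accuracy_gap:.2f} percentage points")
print(f"PL/SQL avg per word:  {plsql_ms_per_word:.2f} ms")
print(f"Sastrawi avg per word:{sastrawi_ms_per_word:.2f} ms")
```

The first printed value reproduces the 52.18% reduction reported above. The reduction is measured relative to the Sastrawi baseline, so it corresponds to a speedup of roughly a factor of two rather than a halving of per-word cost in any absolute sense.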
Copyright (c) 2026 Jurnal Teknologi Informasi dan Pendidikan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.












