The Design of Automatic Summarization of Indonesian Texts Using a Hybrid Approach

Kania Evita Dewi; Nelly Indriani Widiastuti

doi:10.24036/jtip.v15i1.451

Kania Evita Dewi, Universitas Komputer Indonesia
Nelly Indriani Widiastuti, Universitas Komputer Indonesia

Keywords: Automatic Summarization, Indonesian Language

Abstract

This study designs a model for automatic summarization of Indonesian text. Automatic text summarization reduces the number of sentences in a document without losing its important information. There are three approaches to automatic text summarization: extractive, abstractive, and hybrid. The extractive approach selects key sentences from the document without rewriting them. The abstractive approach constructs new sentences that describe the contents of the document. The hybrid approach designed in this study combines the two. By applying a hybrid approach to Indonesian text, the summaries produced by the system are expected to resemble human-written summaries more closely and to be more readable. The design targets single-document input. Summarization is divided into two stages: preprocessing and processing. The preprocessing stage performs sentence separation, tokenization, coreference resolution, stop-word removal, and feature extraction. The processing stage builds the summary in two steps: an extraction step, which selects important sentences, and an abstraction step, which selects words and arranges them into summary sentences. Further research can extend the design to longer documents and to multi-document input.
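To make the pipeline concrete, the following is a minimal Python sketch of the preprocessing and extraction steps only. It is not the authors' implementation. The stop-word list, the frequency-based sentence score, and the function names `split_sentences`, `tokenize`, and `extract` are all assumptions made for illustration. Coreference resolution and the abstraction step are omitted because the abstract does not specify how they are performed.

```python
import re
from collections import Counter

# Hypothetical, very small stop-word list for illustration only;
# a real system would use a full Indonesian stop-word list.
STOP_WORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk", "dengan", "pada"}


def split_sentences(text):
    """Sentence separation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Tokenization plus stop-word removal on lowercase alphabetic tokens."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOP_WORDS]


def extract(text, k=2):
    """Extraction step: return the k highest-scoring sentences in document order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(t for toks in tokens for t in toks)
    # Assumed feature: mean document-level frequency of the sentence's tokens.
    scores = [sum(freq[t] for t in toks) / len(toks) if toks else 0.0 for toks in tokens]
    # Rank by score (ties broken by position), then restore document order.
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

In a hybrid design such as the one described, the sentences returned by `extract` would then be passed to an abstraction step that rewrites them into new summary sentences.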

Author Biographies

Kania Evita Dewi, Universitas Komputer Indonesia

Department of Informatics Engineering

Nelly Indriani Widiastuti, Universitas Komputer Indonesia

Department of Informatics Engineering
