ANALYSIS AND DETECTION OF HOAX CONTENTS IN INDONESIAN NEWS BASED ON MACHINE LEARNING

Authors

  • Tansa TA Putri Universitas Prima Indonesia
  • Hendryx Warra S Universitas Prima Indonesia
  • Irma Yanti Sitepu Universitas Prima Indonesia
  • Marita Sihombing Universitas Prima Indonesia
  • Silvi Silvi Universitas Prima Indonesia

Abstract

Hoax news containing incorrect (false) information is widely consumed on social media today. The hoax phenomenon raises doubts about information and causes confusion in the community. This study conducts experiments to select the best algorithm for classifying hoax and non-hoax news, using a dataset of 251 Indonesian-language articles (100 hoax and 151 non-hoax) with text mining and machine learning approaches. Text preprocessing consists of tokenizing, case folding, filtering, stopword removal, stemming, and TF-IDF weighting with combined unigram and bigram features before the text is passed to classification. The results show that the Random Forest algorithm achieves the best accuracy in classifying hoax and non-hoax news, at 76.47%, compared with the Multilayer Perceptron, Naïve Bayes, Support Vector Machine, and Decision Tree algorithms.
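
The preprocessing steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The stopword list, the sample sentences, the function names, and the tf * log(N / df) weighting variant are assumptions made for illustration, "filtering" is interpreted here as dropping non-alphabetic characters, and stemming is omitted because it needs an Indonesian stemmer that the abstract does not name.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; the study's actual list is not given.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(text):
    """Case folding, tokenizing, filtering, and stopword removal."""
    # Lowercase the text and keep only alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens):
    """Combine unigram and bigram features into one feature list."""
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    features = [Counter(ngrams(preprocess(d))) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for f in features for term in f)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in f.items()}
        for f in features
    ]


# Made-up sample sentences, not taken from the study's dataset.
docs = [
    "Berita ini bohong dan menyesatkan",
    "Berita resmi dari pemerintah",
]
weights = tfidf(docs)
```

A term that occurs in every document, such as "berita" here, gets weight zero under this variant, while terms and bigrams unique to one document get positive weight. The resulting vectors would then be passed to a classifier such as Random Forest.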

Author Biographies

Tansa TA Putri, Universitas Prima Indonesia

Informatics Engineering

Hendryx Warra S, Universitas Prima Indonesia

Informatics Engineering

Irma Yanti Sitepu, Universitas Prima Indonesia

Informatics Engineering

Marita Sihombing, Universitas Prima Indonesia

Informatics Engineering

Silvi Silvi, Universitas Prima Indonesia

Informatics Engineering

Published

2019-03-28