THE SEGMENTASI DOKUMEN TEKS DENGAN METODE TEXTTILING

Authors

  • Chintalya Magdalena Universitas Kristen Indonesia
  • Bangun Hyolister Tambun Universitas Kristen Indonesia

DOI:

https://doi.org/10.33884/jif.v10i01.4509

Keywords:

Segmentation, Text Documents, TextTiling

Abstract

In this paper, we report our work on text segmentation of Indonesian speech documents. Because the speech documents are transcribed by Automatic Speech Recognition (ASR), the resulting text contains no boundaries between documents, so it needs to be segmented by topic. We apply the TextTiling method with several term-weighting techniques (TF-IDF, TF-IDF-Mutual Information, TF-IDF-Mutual Information-Word Similarity, and TF-IDF-Word Frequency) to measure the similarity between segments. The results show that TF-IDF-Mutual Information performed better in most of the collections.
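
To make the approach concrete, the following is a minimal sketch of a TextTiling-style segmenter using plain TF-IDF weighting. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed block size, the smoothed IDF formula, and the depth cutoff (mean minus half the standard deviation) are all choices made here for illustration. It also simplifies the comparison to adjacent fixed-size blocks, and it does not attempt the Mutual Information, Word Similarity, or Word Frequency variants, which the abstract does not define.

```python
import math
import statistics
from collections import Counter


def tfidf_vectors(blocks):
    """Return one TF-IDF weighted term vector (a dict) per block of tokens."""
    n = len(blocks)
    # Document frequency: number of blocks in which each term occurs.
    df = Counter(term for block in blocks for term in set(block))
    vectors = []
    for block in blocks:
        tf = Counter(block)
        # Smoothed IDF (an assumption) so no term gets a zero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def depth_scores(sims):
    """Depth of each similarity valley relative to its neighbouring peaks."""
    depths = []
    for i, s in enumerate(sims):
        left = s
        for j in range(i - 1, -1, -1):  # climb to the peak on the left
            if sims[j] < left:
                break
            left = sims[j]
        right = s
        for j in range(i + 1, len(sims)):  # climb to the peak on the right
            if sims[j] < right:
                break
            right = sims[j]
        depths.append((left - s) + (right - s))
    return depths


def segment(tokens, block_size=20):
    """Return token offsets at which a topic boundary is hypothesised."""
    blocks = [tokens[i:i + block_size] for i in range(0, len(tokens), block_size)]
    vectors = tfidf_vectors(blocks)
    sims = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    depths = depth_scores(sims)
    if not depths:
        return []
    # Assumed cutoff: mean depth minus half the standard deviation.
    cutoff = statistics.mean(depths) - statistics.pstdev(depths) / 2
    # Gaps with zero depth are never treated as boundaries.
    return [(i + 1) * block_size for i, d in enumerate(depths) if d > 0 and d >= cutoff]
```

Calling `segment(tokens, block_size=20)` on a tokenised ASR transcript returns the token offsets of the proposed topic boundaries. Swapping in a different term-weighting scheme would only require replacing `tfidf_vectors`.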

Published

2022-03-01

How to Cite

Magdalena, C., & Tambun, B. H. (2022). THE SEGMENTASI DOKUMEN TEKS DENGAN METODE TEXTTILING. JURNAL ILMIAH INFORMATIKA, 10(01), 8–14. https://doi.org/10.33884/jif.v10i01.4509