TEXT DOCUMENT SEGMENTATION USING THE TEXTTILING METHOD
Keywords: Segmentation, Text Documents, TextTiling
In this paper, we report our work on text segmentation of Indonesian speech documents. Because the speech documents are transcribed with Automatic Speech Recognition (ASR), the resulting text contains no boundaries between documents, so it needs to be segmented by topic. We apply the TextTiling method with several term-weighting techniques, namely TF-IDF, TF-IDF-Mutual Information, TF-IDF-Mutual Information-Word Similarity, and TF-IDF-Word Frequency, to measure the similarity between segments. The results show that TF-IDF-Mutual Information performed better in most of the collections.
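As an illustration only, the general TextTiling idea with plain TF-IDF weighting can be sketched as follows. This is a minimal sketch, not the implementation evaluated in the paper: the function names, the pseudo-sentence size `w`, the block size `k`, the smoothed IDF formula, the cutoff of mean minus half a standard deviation, and the local-maximum filter are all assumptions chosen for brevity. The Mutual Information, Word Similarity, and Word Frequency variants are not covered.

```python
import math
from collections import Counter, defaultdict


def pseudo_sentences(tokens, w):
    """Split an unpunctuated token stream into units of w tokens each."""
    return [tokens[i:i + w] for i in range(0, len(tokens), w)]


def tfidf_vectors(units):
    """Weight each unit with TF-IDF, treating every unit as one 'document'.

    The smoothed IDF, log((1 + n) / (1 + df)) + 1, is an assumption.
    """
    n = len(units)
    df = Counter(term for unit in units for term in set(unit))
    vectors = []
    for unit in units:
        tf = Counter(unit)
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors


def block_vector(vectors, start, end):
    """Sum the term weights of the units in vectors[start:end]."""
    total = defaultdict(float)
    for vec in vectors[start:end]:
        for term, weight in vec.items():
            total[term] += weight
    return total


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def gap_similarities(vectors, k):
    """Compare the k units before each gap with the k units after it."""
    sims = []
    for gap in range(1, len(vectors)):
        left = block_vector(vectors, max(0, gap - k), gap)
        right = block_vector(vectors, gap, min(len(vectors), gap + k))
        sims.append(cosine(left, right))
    return sims


def depth_scores(sims):
    """Depth of each similarity valley relative to the peaks on both sides."""
    depths = []
    for i, s in enumerate(sims):
        left, j = s, i
        while j > 0 and sims[j - 1] >= left:
            left, j = sims[j - 1], j - 1
        right, j = s, i
        while j < len(sims) - 1 and sims[j + 1] >= right:
            right, j = sims[j + 1], j + 1
        depths.append((left - s) + (right - s))
    return depths


def segment(tokens, w=4, k=2):
    """Return token offsets where a topic boundary is proposed."""
    units = pseudo_sentences(tokens, w)
    depths = depth_scores(gap_similarities(tfidf_vectors(units), k))
    if not depths:
        return []
    mean = sum(depths) / len(depths)
    std = math.sqrt(sum((d - mean) ** 2 for d in depths) / len(depths))
    cutoff = mean - std / 2  # assumed threshold, not taken from the paper
    boundaries = []
    for i, d in enumerate(depths):
        prev_d = depths[i - 1] if i > 0 else float("-inf")
        next_d = depths[i + 1] if i < len(depths) - 1 else float("-inf")
        # Keep only local maxima so one valley yields one boundary.
        if d > cutoff and d >= prev_d and d >= next_d:
            boundaries.append((i + 1) * w)
    return boundaries
```

Calling `segment(tokens)` on a list of word tokens from an ASR transcript returns the token positions at which a new topic is proposed to start. Fixed-size pseudo-sentences are used because ASR output typically carries no sentence punctuation. In practice `w` and `k` would need tuning per collection.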
Copyright (c) 2022 JURNAL ILMIAH INFORMATIKA
This work is licensed under a Creative Commons Attribution 4.0 International License.