Similarity-Based Estimation for Document Summarization using Fuzzy Sets
Masrah Azrifah Azmi Murad, Trevor Martin
Pages - 1 - 12 | Revised - 15-12-2007 | Published - 30-12-2007
Volume - 1 Issue - 4 | Publication Date - December 2007
fuzzy sets, mass assignment, asymmetric word similarity, topic similarity, summarization
The volume of information grows every day, and thousands of documents are produced and made available on the Internet. The amount of information in these documents exceeds our capacity to read them, so readers need access to the right information without going through a whole document. Documents therefore need to be condensed into an overview so that they can be used effectively. We propose a similarity model with topic similarity, based on fuzzy set theory and probability theory, to extract the most representative sentences. Sentences with high weights are extracted to form a summary. On average, our model (known as MySum) produces summaries that are 60% similar to manually created summaries, whereas the tf.isf algorithm produces summaries that are 30% similar. Two human summarizers, named P1 and P2, produce summaries that are 70% similar to each other using similar sets of documents obtained from TREC.
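To illustrate the tf.isf baseline named in the abstract, the following is a minimal sketch of extractive summarization by sentence weighting. It assumes a common formulation in which a sentence's weight is the mean of tf × isf over its tokens, with isf(w) = log(N / sf(w)); the paper's exact weighting may differ, and it does not reproduce MySum. The function names `tokenize`, `tf_isf_scores`, and `summarize` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def tf_isf_scores(sentences):
    """Score each sentence by the mean tf * isf of its tokens.

    Assumed formulation: isf(w) = log(N / sf(w)), where N is the number
    of sentences and sf(w) is the number of sentences containing w.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)

    # Count, for every word, how many sentences contain it.
    sentence_freq = Counter()
    for tokens in tokenized:
        sentence_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(tf[w] * math.log(n / sentence_freq[w]) for w in tf)
        scores.append(total / len(tokens))
    return scores


def summarize(sentences, k):
    """Return the k highest-weighted sentences in their original order."""
    scores = tf_isf_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

With this weighting, a sentence made of words that appear in few other sentences receives a higher weight than one made of words repeated throughout the document, so calling `summarize(sentences, k)` keeps the `k` most distinctive sentences.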
