Metaheuristic Aided Improved LSTM for Multi-document Summarization: A Hybrid Optimization Model
DOI:
https://doi.org/10.13052/jwe1540-9589.2246
Keywords:
Multi-document summarization, LSTM, Score generation, BMICO, Optimization
Abstract
Multi-document summarization (MDS) is an automated process that extracts information from multiple texts written about the same subject. Here, we present a generic, extractive MDS approach consisting of four steps: preprocessing, feature extraction, score generation, and summarization. In the first stage, the input text undergoes preprocessing steps such as lemmatization, stemming, and tokenization. After preprocessing, features are extracted, including improved semantic similarity-based features, term frequency-inverse document frequency (TF-IDF)-based features, and thematic-based features. An improved LSTM model is then proposed to summarize the documents based on scores computed under objectives such as content coverage and redundancy reduction. The Blue Monkey Integrated Coot Optimization (BMICO) algorithm is proposed in this paper to fine-tune the weights of the LSTM model so that it produces precise summaries. Finally, the effectiveness of the proposed BMICO is evaluated and verified.
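The interplay between score generation, content coverage, and redundancy reduction can be illustrated with a minimal sketch. It is not the method of this paper: the improved LSTM scorer and the BMICO weight tuning are replaced here by a plain TF-IDF centroid score and a greedy selection rule, and every name in it (`tokenize`, `tfidf_vectors`, `summarize`, `trade_off`) is a hypothetical helper introduced only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercased word tokens; a stand-in for tokenization, lemmatization and stemming.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF vector (a dict) per sentence.

    Assumption: each sentence is treated as its own "document" for the IDF statistics.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps a positive weight for terms found in every sentence.
        vectors.append({t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(sentences, k=2, trade_off=0.7):
    """Greedily pick up to k sentences, rewarding coverage and penalizing redundancy."""
    vectors = tfidf_vectors(sentences)
    # The sum of all sentence vectors approximates the content to be covered.
    centroid = {}
    for vec in vectors:
        for term, weight in vec.items():
            centroid[term] = centroid.get(term, 0.0) + weight
    coverage = [cosine(vec, centroid) for vec in vectors]

    selected = []
    while len(selected) < min(k, len(sentences)):
        best, best_score = None, -math.inf
        for i in range(len(sentences)):
            if i in selected:
                continue
            # Redundancy is the highest similarity to any already selected sentence.
            redundancy = max((cosine(vectors[i], vectors[j]) for j in selected),
                             default=0.0)
            score = trade_off * coverage[i] - (1 - trade_off) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    # Return the chosen sentences in their original order.
    return [sentences[i] for i in sorted(selected)]
```

Given two identical sentences and one unrelated sentence, `summarize(..., k=2)` keeps one copy of the duplicate and the unrelated sentence, because the redundancy penalty outweighs the duplicate's higher coverage score. In the approach described above, the hand-set `trade_off` would presumably give way to scores produced by the trained LSTM.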