A Novel Framework for Semantic Oriented Abstractive Text Summarization

  • N. Moratanch Research Scholar, Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai
  • S. Chitrakala Professor, Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai
Keywords: Abstractive Summarization, Predicate Sense Disambiguation, Semantic Role Labelling, Genetic algorithm, Language generation

Abstract

The Internet continues to be the most communal of all mass media, turning information from scarce to superabundant, as evidenced by its tenfold increase every five years. A powerful text summarizer can help manage this overload by generating quality summaries of the target content, thereby reducing the time and effort needed to mine the required information. The proposed system, Semantic Oriented Abstractive Summarization, aims to generate abstractive summaries with improved readability and content quality. Our contributions are as follows. A joint model of Predicate Sense Disambiguation and Semantic Role Labelling, termed Joint (PSD+SRL), is proposed to better capture the semantic representation of text. Content selection is semantic based, with features selected by a Genetic Algorithm. The proposed system can be useful for students who want to cover a whole book in a short time. Our experimental study is carried out using DUC, a standard corpus for text summarization.

Author Biographies

N. Moratanch, Research Scholar, Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai

N. Moratanch is currently pursuing a Ph.D. at Anna University, Chennai, Tamil Nadu, India. She has published 4 IEEE papers in international conferences and one book chapter with Springer. Her areas of interest include Data Mining, Natural Language Processing, Information Retrieval and Deep Learning.

S. Chitrakala, Professor, Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai

S. Chitrakala is a Professor in the Department of Computer Science and Engineering at Anna University, Chennai, Tamil Nadu, India. Her research interests include data mining, computer vision, artificial intelligence, web information retrieval, natural language processing and text mining, specifically the application of statistical and NLP techniques to big data. Her research contributions have culminated in 108 publications, of which 43 are in international journals and 65 in international conferences. She is a reviewer for various journals and international conferences. She is a life member of CSI and of the Indian Society for Technical Education (ISTE), New Delhi.

Published
2019-01-01
Section
Articles