News Article Based Industry Risk Index Prediction for Industry-Specific Evaluation




Industry evaluation, industry-specific risk prediction, unstructured data, multiple classification, time-series data analysis


Existing industry evaluation methods collect structured information, such as the financial information of the companies in the relevant industry, and derive an industry evaluation index through a statistical analysis model. Collecting and processing this structured data takes a long time, which delays the evaluation. In this paper, to solve this time-delay problem, we derive monthly industry-specific interest and likability as time series data, a new industry evaluation indicator based on unstructured data. In addition, we propose a method to predict the industry risk index, which is used as an important factor in industry evaluation, from the derived industry-specific interest and likability time series.
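As an illustration of how monthly interest and likability series might be aggregated from news articles, the following minimal sketch may help. It rests on assumptions not stated in the abstract: interest is taken to be an industry's share of the month's articles, likability is taken to be the mean sentiment score of that industry's articles, and the input tuples, sentiment scale, and the function name `monthly_indicators` are hypothetical. The paper's actual definitions may differ.

```python
from collections import defaultdict

# Hypothetical input: one (month, industry, sentiment) tuple per news article.
# The sentiment score in [-1, 1] is assumed to come from an upstream step.
articles = [
    ("2018-01", "cement", 0.5),
    ("2018-01", "cement", -0.5),
    ("2018-01", "steel", 0.25),
    ("2018-02", "cement", -0.75),
    ("2018-02", "steel", 0.5),
    ("2018-02", "steel", 0.0),
]


def monthly_indicators(rows):
    """Return {(month, industry): (interest, likability)}.

    Assumed definitions (illustrative only):
      interest   = articles about the industry / all articles in that month
      likability = mean sentiment of the industry's articles in that month
    """
    counts = defaultdict(int)            # articles per (month, industry)
    sentiment_sums = defaultdict(float)  # summed sentiment per (month, industry)
    month_totals = defaultdict(int)      # articles per month, all industries

    for month, industry, sentiment in rows:
        counts[(month, industry)] += 1
        sentiment_sums[(month, industry)] += sentiment
        month_totals[month] += 1

    series = {}
    for (month, industry), n in counts.items():
        interest = n / month_totals[month]
        likability = sentiment_sums[(month, industry)] / n
        series[(month, industry)] = (interest, likability)
    return series


indicators = monthly_indicators(articles)
for key in sorted(indicators):
    print(key, indicators[key])
```

Sorting the resulting keys by month gives one time series per industry, which could then serve as input to a risk index prediction model.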



Author Biographies

Kyungwon Kim, Korea Electronics Technology Institute, Mapo-gu, Seoul, 03924, Republic of Korea

Kyungwon Kim received the B.S. and M.S. degrees in computer science and engineering from Hankuk University of Foreign Studies, Seoul, Korea, in 2001 and 2003, respectively, and the Ph.D. degree in computer, information and communications engineering from Konkuk University, Seoul, Korea, in 2018. He has been a Managerial Researcher with Korea Electronics Technology Institute, Seoul, Korea, since 2004. His current research interests include unstructured data analysis and data inference modeling.

Kyoungro Yoon, Konkuk University, Gwangjin-gu, Seoul, 05029, Republic of Korea

Kyoungro Yoon received the B.S. degree in computer and electronic engineering from Yonsei University, Seoul, Korea, in 1987, the M.S.E. degree in electrical engineering/systems from the University of Michigan, Ann Arbor, MI, USA, in 1989, and the Ph.D. degree in computer and information science from Syracuse University, Syracuse, NY, USA, in 1999. He was a Principal Researcher and a Group Leader at the Mobile Multimedia Research Lab, LG Electronics Institute of Technology, from 1999 to 2003. He joined the School of Computer Science and Engineering in 2003 as an Assistant Professor and became a full Professor in 2012. He has been with the Department of Smart ICT Convergence since 2017. He served as a Co-chair of the Ad Hoc Group on User Preferences and as the Chair of the Ad Hoc Group on MPEG Query Format and the Ad Hoc Group on MPEG-V of ISO/IEC JTC1 SC29 WG11 (a.k.a. MPEG). He also served as the Chair of the Metadata Subgroup and the JPSearch Ad Hoc Group of ISO/IEC JTC1 SC29 WG1 (a.k.a. JPEG). He is serving as an Editor of various international standards, such as ISO IS 15938-12, 23005-2, 23005-5, 23005-6, 24800-3, 24800-5, and 24800-6. His main research interests include smart media systems, image processing, and multimedia information and metadata processing.







Communication, Multimedia and Learning Technology through Future Web Engineering