

Time gap analysis by the topic model-based temporal technique
Affiliation: 1. Korea Institute of Science and Technology Information (KISTI), 245 Daehak-ro, Yuseong-gu, Daejeon 305-806, South Korea; 2. Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 120-749, South Korea; 1. Laboratory for Studies of Research and Technology Transfer, Institute for System Analysis and Computer Science (IASI-CNR), National Research Council of Italy, Viale Manzoni 30, 00185 Rome, Italy; 2. Department of Engineering and Management, University of Rome “Tor Vergata”, Via del Politecnico 1, 00133 Rome, Italy; 1. Institute for Education and Information Sciences, IBW, University of Antwerp, Venusstraat 35, Antwerp B-2000, Belgium; 2. KU Leuven, Department of Mathematics, Celestijnenlaan 200B, B-3000 Leuven, Belgium; 3. Library of Tongji University, Tongji University, Siping Street 1239, Shanghai 200092, China; 4. SPRU, School of Business, Management and Economics, University of Sussex, Falmer, Brighton BN1 9SL, UK; 5. Institute for Education and Information Sciences, IBW, University of Antwerp, Venusstraat 35, Antwerp B-2000, Belgium; 1. School of Management, Harbin Institute of Technology, Harbin 150001, PR China; 2. Dipartimento Di Economia Politica E Statistica, Università Di Siena, Siena 53100, Italy; 3. School of Software, Harbin Institute of Technology, Harbin 150001, PR China
Abstract: This study proposes a temporal analysis method that uses heterogeneous resources such as papers, patents, and web news articles in an integrated manner. We analyzed the time gap phenomena among three resources and between two academic areas by conducting text mining-based content analysis. To this end, a topic modeling technique, Latent Dirichlet Allocation (LDA), was used to estimate the optimal time gaps among the three resources (papers, patents, and web news articles) in two research domains. The contributions of this study are as follows. First, we propose a new temporal analysis method for understanding the content characteristics and trends of multiple heterogeneous resources in an integrated manner. We applied it to measure the exact time intervals between academic areas by examining the time gap phenomena. The results of the temporal analysis showed that resources in the medical field were more up to date than those in the computer field and were therefore disclosed to the public more promptly. Second, we adopted a power-law exponent measurement and content analysis to evaluate the proposed method. With the proposed method, we demonstrate how to analyze heterogeneous resources more precisely and comprehensively.
Keywords: Text mining; Topic modeling; Latent Dirichlet Allocation (LDA); Content analysis; Temporal analysis; Multiple resources
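Illustration: the abstract does not specify how the optimal time gap is computed from the LDA output, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes that each resource has already been reduced to a yearly share for one topic (for example, an average LDA topic proportion per year), and it picks the shift that maximizes the Pearson correlation between the two series. The function names `pearson` and `best_time_gap` and all data values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_time_gap(leading, lagging, max_gap):
    """Return (gap, correlation) for the shift that best aligns the two series.

    A gap of g compares leading[t] with lagging[t + g], i.e. it assumes the
    lagging resource follows the leading one by g time steps.
    Assumption: lag-maximized correlation stands in for the paper's
    "optimal time gap"; the abstract does not confirm this criterion.
    """
    best_gap, best_r = 0, float("-inf")
    for gap in range(max_gap + 1):
        a = leading[: len(leading) - gap]
        b = lagging[gap:]
        n = min(len(a), len(b))
        if n < 3:  # too few overlapping points for a meaningful correlation
            break
        r = pearson(a[:n], b[:n])
        if r > best_r:
            best_gap, best_r = gap, r
    return best_gap, best_r


# Hypothetical yearly shares of one topic in papers and in patents.
paper_share = [0.02, 0.05, 0.11, 0.09, 0.04, 0.03, 0.02, 0.02]
patent_share = [0.01, 0.01, 0.02, 0.05, 0.11, 0.09, 0.04, 0.03]

gap, r = best_time_gap(paper_share, patent_share, max_gap=4)
print(gap)  # 2: in this made-up data, patents trail papers by two years
```

In practice the yearly shares would come from a topic model fitted on the combined corpus, and other alignment criteria (for example, a divergence between full topic distributions) could replace the correlation used here.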
This article is indexed in ScienceDirect and other databases.