Utilizing the multiple facets of WWW contents
Authors: Yakov Kogan, David Michaeli, Yehoshua Sagiv, Oded Shmueli
Affiliation:

a The Hebrew University, Jerusalem 91904, Israel

b Technion, Haifa, 32000, Israel

Abstract: Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, that structure usually has little to do with the semantics of the data, so posing database queries over the Web is difficult in practice. We introduce a new type of tag for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose semistructured objects, in the style of the OEM model, on HTML pages. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool that displays both the HTML reading and the OEM reading of Web pages. The second is a query language, similar to LOREL, that can query the HTML structure and/or the OEM reading. Together, the formalism and tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used.
Keywords: Lorel; OEM; OHTML; Query language; Semantic tags; Semistructured data; WWW; W3LOREL
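
The abstract's central idea, semantic tags carried in HTML comments that superimpose OEM-style objects on a page, can be illustrated with a minimal Python sketch. The abstract does not give the actual tag syntax, so the `sem:begin LABEL` / `sem:end LABEL` comment format, the sample page, and every function and class name below are assumptions made purely for illustration, not the paper's notation or implementation.

```python
from html.parser import HTMLParser


class SemanticTagParser(HTMLParser):
    """Builds a nested, OEM-like object from hypothetical comment tags.

    Assumed syntax (not taken from the paper):
        <!-- sem:begin LABEL --> ... <!-- sem:end LABEL -->
    """

    def __init__(self):
        super().__init__()
        self.root = {"label": "root", "children": [], "text": ""}
        self.stack = [self.root]

    def handle_comment(self, data):
        parts = data.split()
        if len(parts) != 2:
            return  # an ordinary HTML comment, not a semantic tag
        kind, label = parts
        if kind == "sem:begin":
            node = {"label": label, "children": [], "text": ""}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif (kind == "sem:end" and len(self.stack) > 1
              and self.stack[-1]["label"] == label):
            self.stack.pop()

    def handle_data(self, data):
        # Visible text is attributed to the innermost open semantic object.
        self.stack[-1]["text"] += data


def to_object(node):
    """Collapse a node into an atomic string or a dict of labeled sub-objects."""
    if not node["children"]:
        return node["text"].strip()
    result = {}
    for child in node["children"]:
        result.setdefault(child["label"], []).append(to_object(child))
    return result


def parse_semantic_tags(html):
    """Return the 'OEM reading' of a page annotated with the assumed tags."""
    parser = SemanticTagParser()
    parser.feed(html)
    parser.close()
    return to_object(parser.root)


# Hypothetical sample page: a browser renders it normally, because the
# semantic tags are plain HTML comments.
PAGE = """
<html><body>
<!-- sem:begin restaurant -->
<h1><!-- sem:begin name -->Chez Nous<!-- sem:end name --></h1>
<p>Located at <!-- sem:begin address -->12 Main St<!-- sem:end address -->.</p>
<!-- sem:end restaurant -->
</body></html>
"""

if __name__ == "__main__":
    print(parse_semantic_tags(PAGE))
```

Because the tags are comments, the HTML reading of the page is unchanged, while the parser recovers a labeled, nested object that a LOREL-like query could navigate by label (here `restaurant`, then `name` or `address`). Untagged pages simply yield their visible text, which mirrors the abstract's point that querying is not limited to tagged portions of the Web.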
This article is indexed in ScienceDirect and other databases.