Utilizing the multiple facets of WWW contents |
| |
Authors: | Yakov Kogan, David Michaeli, Yehoshua Sagiv, Oded Shmueli |
| |
Affiliation: | (a) The Hebrew University, Jerusalem 91904, Israel; (b) Technion, Haifa 32000, Israel |
| |
Abstract: | Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, the structure of the Web usually has little to do with the semantics of the data, so in practice it is difficult to pose database queries over the Web. We introduce a new type of tag for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose semistructured objects, in the style of the OEM model, on HTML pages. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool that displays both the HTML reading and the OEM reading of Web pages. The second is a query language, similar to LOREL, that can query the HTML structure, the OEM reading, or both. This formalism and these tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used. |
| |
Keywords: | Lorel; OEM; OHTML; Query language; Semantic tags; Semistructured data; WWW; W3LOREL |
This article is indexed by ScienceDirect and other databases. |
|
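To make the idea of semantic tags in the abstract concrete, here is a minimal sketch of how HTML comments could superimpose an OEM-style object tree on an ordinary page. The comment syntax (`<!--sem:NAME-->` and `<!--/sem:NAME-->`), the function name `oem_reading`, and the sample page are all assumptions made for illustration; this record does not show the paper's actual OHTML notation or its tools.

```python
import re
from html import unescape

# Hypothetical tag syntax, NOT the paper's actual OHTML notation:
#   <!--sem:NAME-->   opens an object labelled NAME
#   <!--/sem:NAME-->  closes it
TAG = re.compile(r"<!--\s*(/?)sem:([\w-]+)\s*-->")
MARKUP = re.compile(r"<[^>]+>")  # crude stripper for ordinary HTML tags


def oem_reading(html):
    """Return the nested (label, value) pairs superimposed by the comments.

    A value is a string for an atomic object (the text between the tags,
    with HTML markup removed) or a list of (label, value) pairs for a
    complex object.
    """
    root = []
    # Each stack entry: (label, children, offset where its content starts).
    stack = [("<root>", root, 0)]
    for match in TAG.finditer(html):
        closing, label = match.group(1), match.group(2)
        if not closing:
            stack.append((label, [], match.end()))
            continue
        if len(stack) == 1:
            raise ValueError(f"closing tag without opening tag: {label}")
        open_label, children, start = stack.pop()
        if open_label != label:
            raise ValueError(f"expected /{open_label}, found /{label}")
        if children:
            value = children  # complex object: keep the sub-objects
        else:
            text = MARKUP.sub("", html[start:match.start()])
            value = unescape(text).strip()  # atomic object: plain text
        stack[-1][1].append((label, value))
    if len(stack) != 1:
        raise ValueError(f"unclosed tag: {stack[-1][0]}")
    return root


# Invented sample page: browsers ignore the comments, so the HTML reading
# is unchanged while the comments carry the OEM reading.
page = (
    "<h1><!--sem:person--><!--sem:name--><b>Ada</b><!--/sem:name-->"
    " works at <!--sem:employer-->ACME &amp; Co<!--/sem:employer-->"
    "<!--/sem:person--></h1>"
)
reading = oem_reading(page)
```

Because the tags are plain comments, a page without them simply yields an empty OEM reading, which matches the abstract's point that querying is not restricted to tagged portions of the Web.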