1.

Selectivity estimation of range queries based on data density approximation via cosine series

Feng Wen-Chi Zhewei Cheng Qiang 《Data & Knowledge Engineering》2007,63(3):855-878

Selectivity estimation is an integral part of query optimization. In this paper, we propose to approximate the data density functions of relations by cosine series and to use the approximations to estimate the selectivities of range queries. We lay the foundation for applying cosine series to range query size estimation and compare the approach with notable alternatives, such as wavelets, DCT, kernel-spline, sketch, and Legendre polynomials. Experimental results show that our approach is simple to construct, easy to update, and fast to estimate. It also yields accurate estimates, especially in multi-dimensional cases.
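The idea in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes attribute values have been normalized to [0, 1] and uses a generic orthogonal-series density estimator with the basis 1, sqrt(2)·cos(kπx); the function names and the choice of normalization are assumptions made for illustration, not necessarily the construction used in the paper.

```python
import math

SQRT2 = math.sqrt(2.0)

def cosine_coefficients(values, m):
    """Estimate coefficients a_1..a_m of the density of `values` (each in [0, 1]).

    a_k is the sample mean of the basis function sqrt(2)*cos(k*pi*x).
    The constant term a_0 is always 1 for a density and is not stored.
    """
    n = len(values)
    return [sum(SQRT2 * math.cos(k * math.pi * x) for x in values) / n
            for k in range(1, m + 1)]

def insert_value(coeffs, n, x):
    """Fold one new value x into coefficients that were built from n values."""
    return [(a_k * n + SQRT2 * math.cos(k * math.pi * x)) / (n + 1)
            for k, a_k in enumerate(coeffs, start=1)]

def estimate_selectivity(coeffs, lo, hi):
    """Integrate the approximated density over [lo, hi] in closed form."""
    total = hi - lo  # contribution of the constant term
    for k, a_k in enumerate(coeffs, start=1):
        w = k * math.pi
        total += a_k * SQRT2 * (math.sin(w * hi) - math.sin(w * lo)) / w
    return total

# Usage: uniformly spread values, fraction of rows with 0.2 <= x <= 0.5.
uniform = [(i + 0.5) / 1000 for i in range(1000)]
coeffs = cosine_coefficients(uniform, m=20)
selectivity = estimate_selectivity(coeffs, 0.2, 0.5)
```

Because each coefficient is a sample mean, an insertion only needs the running count and one cosine evaluation per coefficient, which is one plausible reading of the "easy to update" claim. The number of terms `m` trades accuracy against storage, and sharp discontinuities in the data generally need more terms.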

2.

Processing k-skyband, constrained skyline, and group-by skyline queries on incomplete data

《Expert systems with applications》2014,41(10):4959-4974

The skyline operator has been extensively explored in the literature, and most existing approaches assume that all dimensions are available for all data items. However, many practical applications, such as sensor networks, decision making, and location-based services, may involve incomplete data items, i.e., items with some dimensional values missing due to device failure or privacy preservation. To our knowledge, this paper is the first study of k-skyband (kSB) query processing on incomplete data, where multi-dimensional data items are missing values in some of their dimensions. We formalize the problem and then present two efficient algorithms for processing it. Our methods introduce several novel concepts, including expired skyline, shadow skyline, and thickness warehouse, to boost search performance. As a second step, we extend our techniques to tackle constrained skyline (CS) and group-by skyline (GBS) queries over incomplete data. Extensive experiments with both real and synthetic data sets demonstrate the effectiveness and efficiency of our proposed algorithms under various experimental settings.
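To make the query concrete, here is a brute-force sketch of a k-skyband over incomplete data. It is only a baseline for intuition and does not implement the paper's algorithms or its expired skyline, shadow skyline, and thickness warehouse concepts. It assumes that smaller values are better, that a missing value is represented by `None`, and that dominance is checked only on dimensions observed in both items, a definition commonly used for incomplete data that the abstract itself does not spell out.

```python
from typing import List, Optional, Sequence

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """True if p dominates q on the dimensions observed in both items.

    Assumption: smaller is better; None marks a missing value.
    """
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # dimension not comparable, skip it
        if a > b:
            return False
        if a < b:
            strictly_better = True
    return strictly_better

def k_skyband(points: List[Point], k: int) -> List[Point]:
    """Return the items dominated by fewer than k other items (O(n^2) scan)."""
    result = []
    for i, q in enumerate(points):
        dominated_by = sum(
            1 for j, p in enumerate(points) if j != i and dominates(p, q)
        )
        if dominated_by < k:
            result.append(q)
    return result

# Usage: four items over three dimensions, each with at most one missing value.
points = [(1, None, 3), (2, 2, None), (None, 1, 4), (3, 3, 5)]
band = k_skyband(points, k=2)
```

Under this definition dominance is not guaranteed to be transitive and can even be cyclic, so the pruning shortcuts that work on complete data do not carry over directly. That is one reason a dedicated treatment of incomplete data is needed.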

3.

Optimizing data stream processing for large‐scale applications

Paolo Cappellari Mark Roantree Soon Ae Chun 《Software》2018,48(9):1607-1641

Stream processing systems are designed to analyze data arriving in real time using continuous queries and to respond when a specific event or sequence of events is detected. An important aspect of these systems is Streaming Analytics, which facilitates statistical calculations on continuous data within the stream. These systems must be designed to handle high volumes of data, be scalable, and accommodate a multitude of long‐lived, concurrently running analytics. The challenges involved in the development of stream processing include on‐the‐fly transformation of data streams to match the query needs of users, the ability to model stream transformations to detect overlaps and opportunities for optimization, and the specification of a methodology to deliver those optimizations. In particular, this work focuses on exposing data stream application internals in order to detect reusable parts and then consolidate applications to optimize computational resource usage. The Streaming Data Analytics Model presented in this paper adopts a declarative approach that enables processing and manipulation of data streams in a simple manner while facilitating the powerful optimizations necessary for managing high volumes of streaming data in real time. An evaluation is provided to demonstrate, in both theoretical and quantitative terms, the high performance offered by our approach.

4.

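Research on the construction of a big data platform for power transmission and transformation project costs and its application to intelligent analysis and control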

王鑫《电力大数据》2018,21(11)

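The scale of power transmission and transformation projects keeps growing, and the volume of grid cost data grows with it, yet the mining and processing of this cost data remains far from adequate. The level of informatization in cost management urgently needs to improve, and a "big data" system for transmission and transformation project costs needs to be built. Starting from the current state of cost data management and use in such projects against the background of big data applications, this paper analyzes the necessity and significance of applying big data theory to project cost management. It discusses the application of cost data collection and storage technologies and of techniques for identifying key influencing factors, and it systematically reviews and summarizes the source types of cost data, the data analysis and processing workflow, and the directions for data mining and application. On this basis, it builds a big-data-based cost system platform for transmission and transformation projects. Using a practical application case, it identifies the key influencing factors through analysis and applies methods such as support vector machines and particle swarm optimization to establish a sound cost prediction model, which can provide important support for improving grid companies' control of transmission and transformation project costs.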