首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
angular velocity control dynamic system guides the agent's direction angle, while another dynamic system selects the environmental input that will be used in the control system. The agent interacts with the environment through its knowledge of the position of stationary and moving objects. In our system agents automatically avoid stationary and moving obstacles to reach the desired target(s). This approach allows us to prove the stability conditions that result in a principled methodology for the computation of the system's dynamic parameters. We present a variety of real-time simulations that illustrate the power of our approach.  相似文献   

2.
Efficient similarity search for market basket data   总被引:2,自引:0,他引:2  
Several organizations have developed very large market basket databases for the maintenance of customer transactions. New applications, e.g., Web recommendation systems, present the requirement for processing similarity queries in market basket databases. In this paper, we propose a novel scheme for similarity search queries in basket data. We develop a new representation method, which, in contrast to existing approaches, is proven to provide correct results. New algorithms are proposed for the processing of similarity queries. Extensive experimental results, for a variety of factors, illustrate the superiority of the proposed scheme over the state-of-the-art method. Edited by R. Ng. Received: August 6, 2001 / Accepted: May 21, 2002 Published online: September 25, 2002  相似文献   

3.
Association Rule Mining algorithms operate on a data matrix (e.g., customers products) to derive association rules [AIS93b, SA96]. We propose a new paradigm, namely, Ratio Rules, which are quantifiable in that we can measure the “goodness” of a set of discovered rules. We also propose the “guessing error” as a measure of the “goodness”, that is, the root-mean-square error of the reconstructed values of the cells of the given matrix, when we pretend that they are unknown. Another contribution is a novel method to guess missing/hidden values from the Ratio Rules that our method derives. For example, if somebody bought $10 of milk and $3 of bread, our rules can “guess” the amount spent on butter. Thus, unlike association rules, Ratio Rules can perform a variety of important tasks such as forecasting, answering “what-if” scenarios, detecting outliers, and visualizing the data. Moreover, we show that we can compute Ratio Rules in a single pass over the data set with small memory requirements (a few small matrices), in contrast to association rule mining methods which require multiple passes and/or large memory. Experiments on several real data sets (e.g., basketball and baseball statistics, biological data) demonstrate that the proposed method: (a) leads to rules that make sense; (b) can find large itemsets in binary matrices, even in the presence of noise; and (c) consistently achieves a “guessing error” of up to 5 times less than using straightforward column averages. Received: March 15, 1999 / Accepted: November 1, 1999  相似文献   

4.
Abstract. This paper presents structural recursion as the basis of the syntax and semantics of query languages for semistructured data and XML. We describe a simple and powerful query language based on pattern matching and show that it can be expressed using structural recursion, which is introduced as a top-down, recursive function, similar to the way XSL is defined on XML trees. On cyclic data, structural recursion can be defined in two equivalent ways: as a recursive function which evaluates the data top-down and remembers all its calls to avoid infinite loops, or as a bulk evaluation which processes the entire data in parallel using only traditional relational algebra operators. The latter makes it possible for optimization techniques in relational queries to be applied to structural recursion. We show that the composition of two structural recursion queries can be expressed as a single such query, and this is used as the basis of an optimization method for mediator systems. Several other formal properties are established: structural recursion can be expressed in first-order logic extended with transitive closure; its data complexity is PTIME; and over relational data it is a conservative extension of the relational calculus. The underlying data model is based on value equality, formally defined with bisimulation. Structural recursion is shown to be invariant with respect to value equality. Received: July 9, 1999 / Accepted: December 24, 1999  相似文献   

5.
Spatial indexing of high-dimensional data based on relative approximation   总被引:2,自引:0,他引:2  
We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets. Edited by T. Sellis. Received: December 8, 2000 / Accepted: March 20, 2002 Published online: September 25, 2002  相似文献   

6.
为了解决单一聚类算法存在结果不准确和随机性大,且现有算法对分类数据聚类时将其装换成数值型会产生误差等问题,提出了一种面向分类属性数据的聚类融合算法。算法利用原有分类属性值的差异产生聚类成员,然后采用相似度方法进行划分,通过寻求目标函数最小的划分来简化聚类过程。算法在UCI数据集上进行了验证,结果表明算法的效率和精度都优于现有算法,说明算法的设计和更新策略是有效的。  相似文献   

7.
Distance-based outliers: algorithms and applications   总被引:20,自引:0,他引:20  
This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers can only deal efficiently with two dimensions/attributes of a dataset. In this paper, we study the notion of DB (distance-based) outliers. Specifically, we show that (i) outlier detection can be done efficiently for large datasets, and for k-dimensional datasets with large values of k (e.g., ); and (ii), outlier detection is a meaningful and important knowledge discovery task. First, we present two simple algorithms, both having a complexity of , k being the dimensionality and N being the number of objects in the dataset. These algorithms readily support datasets with many more than two attributes. Second, we present an optimized cell-based algorithm that has a complexity that is linear with respect to N, but exponential with respect to k. We provide experimental results indicating that this algorithm significantly outperforms the two simple algorithms for . Third, for datasets that are mainly disk-resident, we present another version of the cell-based algorithm that guarantees at most three passes over a dataset. Again, experimental results show that this algorithm is by far the best for . Finally, we discuss our work on three real-life applications, including one on spatio-temporal data (e.g., a video surveillance application), in order to confirm the relevance and broad applicability of DB outliers. Received February 15, 1999 / Accepted August 1, 1999  相似文献   

8.
Clustering categorical data sets using tabu search techniques   总被引:2,自引:0,他引:2  
Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, working only on numeric values limits its use because data sets often contain categorical values. In this paper, we present a tabu search based clustering algorithm, to extend the k-means paradigm to categorical domains, and domains with both numeric and categorical values. Using tabu search based techniques, our algorithm can explore the solution space beyond local optimality in order to aim at finding a global solution of the fuzzy clustering problem. It is found that the clustering results produced by the proposed algorithm are very high in accuracy.  相似文献   

9.
Recent technology advances have made multimedia on-demand services, such as home entertainment and home-shopping, important to the consumer market. One of the most challenging aspects of this type of service is providing access either instantaneously or within a small and reasonable latency upon request. We consider improvements in the performance of multimedia storage servers through data sharing between requests for popular objects, assuming that the I/O bandwidth is the critical resource in the system. We discuss a novel approach to data sharing, termed adaptive piggybacking, which can be used to reduce the aggregate I/O demand on the multimedia storage server and thus reduce latency for servicing new requests.  相似文献   

10.
Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access method must support different forms of “time-travel” queries, the migration of old record versions onto inexpensive archive media, and high insertion and update rates. This paper presents an access method for transaction-time temporal data, called the log-structured history data access method (LHAM) that meets these demands. The basic principle of LHAM is to partition the data into successive components based on the timestamps of the record versions. Components are assigned to different levels of a storage hierarchy, and incoming data is continuously migrated through the hierarchy. The paper discusses the LHAM concepts, including concurrency control and recovery, our full-fledged LHAM implementation, and experimental performance results based on this implementation. A detailed comparison with the TSB-tree, both analytically and based on experiments with real implementations, shows that LHAM is highly superior in terms of insert performance, while query performance is in almost all cases at least as good as for the TSB-tree; in many cases it is much better. Received March 4, 1999 / Accepted September 28, 1999  相似文献   

11.
To provide high accessibility of continuous-media (CM) data, CM servers generally stripe data across multiple disks. Currently, the most widely used striping scheme for CM data is round-robin permutation (RRP). Unfortunately, when RRP is applied to variable-bit-rate (VBR) CM data, load imbalance across multiple disks occurs, thereby reducing overall system performance. In this paper, the performance of a VBR CM server with RRP is analyzed. In addition, we propose an efficient striping scheme called constant time permutation (CTP), which takes the VBR characteristic into account and obtains a more balanced load than RRP. Analytic models of both RRP and CTP are presented, and the models are verified via trace-driven simulations. Analysis and simulation results show that CTP can substantially increase the number of clients supported, though it might introduce a few seconds/minutes of initial delay. Received June 9, 1998 / Accepted January 21, 1999  相似文献   

12.
Clustering consists in partitioning a set of objects into disjoint and homogeneous clusters. For many years, clustering methods have been applied in a wide variety of disciplines and they also have been utilized in many scientific areas. Traditionally, clustering methods deal with numerical data, i.e. objects represented by a conjunction of numerical attribute values. However, nowadays commercial or scientific databases usually contain categorical data, i.e. objects represented by categorical attributes. In this paper we present a dissimilarity measure which is capable to deal with tree structured categorical data. Thus, it can be used for extending the various versions of the very popular k-means clustering algorithm to deal with such data. We discuss how such an extension can be achieved. Moreover, we empirically prove that the proposed dissimilarity measure is accurate, compared to other well-known (dis)similarity measures for categorical data.  相似文献   

13.
Abstract. The analysis of web usage has mostly focused on sites composed of conventional static pages. However, huge amounts of information available in the web come from databases or other data collections and are presented to the users in the form of dynamically generated pages. The query interfaces of such sites allow the specification of many search criteria. Their generated results support navigation to pages of results combining cross-linked data from many sources. For the analysis of visitor navigation behaviour in such web sites, we propose the web usage miner (WUM), which discovers navigation patterns subject to advanced statistical and structural constraints. Since our objective is the discovery of interesting navigation patterns, we do not focus on accesses to individual pages. Instead, we construct conceptual hierarchies that reflect the query capabilities used in the production of those pages. Our experiments with a real web site that integrates data from multiple databases, the German SchulWeb, demonstrate the appropriateness of WUM in discovering navigation patterns and show how those discoveries can help in assessing and improving the quality of the site. Received June 21, 1999 / Accepted December 24, 1999  相似文献   

14.
Active database management systems (DBMSs) are a fast-growing area of research, mainly due to the large number of applications which can benefit from this active dimension. These applications are far from being homogeneous, requiring different kinds of functionalities. However, most of the active DBMSs described in the literature only provide a fixed, hard-wired execution model to support the active dimension. In object-oriented DBMSs, event-condition-action rules have been propo sed for providing active behaviour. This paper presents EXACT, a rule manager for object-oriented DBMSs which provides a variety of options from which the designer can choose the one that best fits the semantics of the concept to be supported by rules. Due to the difficulty of foreseeing future requirements, special attention has been paid to making rule management easily extensible, so that the user can tailor it to suit specific applications. This has been borne out by an implementation in ADAM, an object -oriented DBMS. An example is shown of how the default mechanism can be easily extended to support new requirements. Edited by Y. Vassiliou. Received May 26, 1994 / Revised January 26, 1995, June 22, 1996 / Accepted November 4, 1996  相似文献   

15.
Summary. We set out a modal logic for reasoning about multilevel security of probabilistic systems. This logic contains expressions for time, probability, and knowledge. Making use of the Halpern-Tuttle framework for reasoning about knowledge and probability, we give a semantics for our logic and prove it is sound. We give two syntactic definitions of perfect multilevel security and show that their semantic interpretations are equivalent to earlier, independently motivated characterizations. We also discuss the relation between these characterizations of security and between their usefulness in security analysis.  相似文献   

16.
17.
18.
In this paper, we present a novel approach for multimedia data indexing and retrieval that is machine independent and highly flexible for sharing multimedia data across applications. Traditional multimedia data indexing and retrieval problems have been attacked using the central data server as the main focus, and most of the indexing and query-processing for retrieval are highly application dependent. This precludes the use of created indices and query processing mechanisms for multimedia data which, in general, have a wide variety of uses across applications. The approach proposed in this paper addresses three issues: 1. multimedia data indexing; 2. inference or query processing; and 3. combining indices and inference or query mechanism with the data to facilitate machine independence in retrieval and query processing. We emphasize the third issue, as typically multimedia data are huge in size and requires intra-data indexing. We describe how the proposed approach addresses various problems faced by the application developers in indexing and retrieval of multimedia data. Finally, we present two applications developed based on the proposed approach: video indexing; and video content authorization for presentation.  相似文献   

19.
In this work a visual-based autonomous system capable of memorizing and recalling sensory-motor associations is presented. The robot's behaviors are based on learned associations between its sensory inputs and its motor actions. Perception is divided into two stages. The first one is functional: algorithmic procedures extract in real time visual features such as disparity and local orientation from the input images. The second stage is mnemonic: the features produced by the different functional areas are integrated with motor information and memorized or recalled. An efficient memory organization and fast information retrieval enables the robot to learn to navigate and to avoid obstacles without need of an internal metric reconstruction of the external environment. Received: 22 November 1996 / Accepted: 18 November 1997  相似文献   

20.
Integration – supporting multiple application classes with heterogeneous performance requirements – is an emerging trend in networks, file systems, and operating systems. We evaluate two architectural alternatives – partitioned and integrated – for designing next-generation file systems. Whereas a partitioned server employs a separate file system for each application class, an integrated file server multiplexes its resources among all application classes; we evaluate the performance of the two architectures with respect to sharing of disk bandwidth among the application classes. We show that although the problem of sharing disk bandwidth in integrated file systems is conceptually similar to that of sharing network link bandwidth in integrated services networks, the arguments that demonstrate the superiority of integrated services networks over separate networks are not applicable to file systems. Furthermore, we show that: an integrated server outperforms the partitioned server in a large operating region and has slightly worse performance in the remaining region; the capacity of an integrated server is larger than that of the partitioned server; and an integrated server outperforms the partitioned server by a factor of up to 6 in the presence of bursty workloads.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号