首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The most efficient currently known algorithms for two-dimensional pattern matching with rotations have a worst case time complexity of O(n2m3)O(n2m3), where the size of the text is n×nn×n and the size of the pattern is m×mm×m. In this paper we present a new algorithm for the problem whose running time is O(n2m2)O(n2m2).  相似文献   

2.
3.
4.
Given a text T and a pattern P, the order-preserving pattern matching (OPPM) problem is to find all substrings in T which have the same relative orders as P. The OPPM has been studied in the fields of finding some patterns affected by relative orders, not by their absolute values. In this paper, we present a method of deciding the order-isomorphism between two strings even when there are same characters. Then, we show that the bad character rule of the Horspool algorithm for generic pattern matching problems can be applied to the OPPM problem and we present a space-efficient algorithm for computing shift tables for text search. Finally, we combine our bad character rule with the KMP-based algorithm to improve the worst-case running time. We give experimental results to show that our algorithm is about 2 to 6 times faster than the KMP-based algorithm in reasonable cases.  相似文献   

5.
We present a GPU-based approach to geometric pattern matching. We reduce this problem to finding the depth (maximally covered point) of an arrangement of polytopes in transformation space and describe hardware assisted (GPU) algorithms which exploit the available set of graphics operations to perform a fast rasterized depth computation. We give two alternatives, one is for translation + scale and the other is for rigid transformations, both have 3-parameters transformation space. We give extensive experimental results showing the running time of our method and its dependence on various parameters.  相似文献   

6.
针对复杂化工过程具有的非线性、非高斯性和动态特征,提出了基于核独立成分分析(KICA)的模式匹配方法,用于动态过程监控和诊断。首先,利用滑动窗建立基准集与测试集的KICA模型,提取各自的核独立元:其次,融合余弦函数绝对值度量和距离度量,提出新的不相似度监控指标,识别训练与测试操作期间的相似模式,进行故障检测:最后,基于两类数据的核子空间之间的差异子空间,获得每个过程变量方向与该差异子空间之间的互信息,并定义新的非线性非高斯贡献度指标,进行故障诊断。基于污水处理过程的仿真结果表明,与主成分分析不相似度因子的方法、标准的独立成分分析(ICA)统计指标方法及标准的ICA T~2/SPE指标融合的贡献度方法相比,本文提出的方法具有更好的检测能力与故障诊断效果。  相似文献   

7.
Statistical machine translation (SMT) has proven to be an interesting pattern recognition framework for automatically building machine translations systems from available parallel corpora. In the last few years, research in SMT has been characterized by two significant advances. First, the popularization of the so called phrase-based statistical translation models, which allows to incorporate local contextual information to the translation models. Second, the availability of larger and larger parallel corpora, which are composed of millions of sentence pairs, and tens of millions of running words. Since phrase-based models basically consists in statistical dictionaries of phrase pairs, their estimation from very large corpora is a very costly task that yields a huge number of parameters which are to be stored in memory. The handling of millions of model parameters and a similar number of training samples have become a bottleneck in the field of SMT, as well as in other well-known pattern recognition tasks such as speech recognition or handwritten recognition, just to name a few. In this paper, we propose a general framework that deals with the scaling problem in SMT without introducing significant time overhead by means of the combination of different scaling techniques. This new framework is based on the use of counts instead of probabilities, and on the concept of cache memory.  相似文献   

8.
具有相似功能的Web应用,其页面样式和布局往往存在很大的相似性。针对当前Web页面开发复杂度高且效率低的情况,提出一种挖掘现有页面布局结构和样式属性的方法来实现Web页面自动化设计。该方法充分利用Web网页布局结构上的特点,采用分级处理的方式,首先利用页面分块算法思想通过相似度计算挖掘出具有相似性的代码块,其次通过结合RoSunday方法解析样式文件快速匹配出节点集合对应的样式表并建立文档模型树结构,各个子模块之间的相互组合可以实现页面的自动化设计。通过应用实例表明,该方法能动态地设计并生成页面,有效提升Web页面开发效率。  相似文献   

9.
This paper presents research on a robust technique for texture-based image retrieval in multimedia museum collections. The aim is to be able to use a query image patch containing a single texture to retrieve images containing an area with similar texture to that in the query. The feature extractor used to build the feature vectors is based on an improved version of the discrete wavelet frames (DWF), proposed elsewhere. In order to utilise the feature extractor on real scene image datasets, a block-oriented decomposition technique, termed the multiscale sub-image matching method, is presented. The multiscale method, together with the DWF, provide an efficient content-based retrieval technique without the need for segmentation. The algorithms are tested on a range of databases of texture images as well as on real museum image collections. Promising results are reported.
Mohammad Faizal Ahmad FauziEmail:
  相似文献   

10.
11.
12.
Challenges of urbanization require new, more flexible approaches to design of public transportation systems. Demand Responsive Transport systems (DRT) that provide a share transportation services with flexible routes and focus on optimizing of economic and environmental value are becoming an important part of public transportation. In this paper we propose a new approach to design of DRT models which considers DRT as a multi-agent system (MAS) where various autonomous agents represent interests of system’s stakeholders. The distributed nature of the MAS facilitates design of scalable implementations in modern cloud environments. We also propose a planning algorithm based on combinatorial auctions (CA) that allows to express commodity of multiple transportation scenarios by evident means of the bids. Using the mechanism of CA we may fully take into account the presence of complementariness and substitutability among the items that differ across bidders. Further, we describe design principles of our proposed software with a prototype implementation. We believe that our approach to multi-agent modeling is general enough to provide the flexibility necessary for adoption of DRT-services modeling into real-world scenarios. The results of modeling have been compared against several cases of a local bus provider and validated in a set of computational experiments.  相似文献   

13.
针对近似模式匹配算法在处理带有灵活通配符和长度约束近似模式匹配(APMWL)问题时只能解决替换操作, 提出一种基于动态规划的编辑距离矩阵(EDM)构造方法,设计了基于EDM的近似模式匹配算法APM, 可以处理近似匹配中的三种编辑操作,即插入、替换和删除操作。此外,根据文本中字符是否允许被重复使用的约束条件,设计APM-OF算法。实验结果表明,APM和APM-OF与同类算法相比具备显著的优势:与Sail_Approx匹配算法实验对比, 获取解的平均增长率分别达到8.34%和12.37%; 将APM-OF算法应用至模式挖掘中, 挖掘出的频繁近似模式个数为OneoffMining算法的2.07倍。  相似文献   

14.
Ken-Chih Liu 《Software》1986,16(6):541-548
This paper presents an extension of Pascal with string pattern matching. Pattern definitions are built using six basic operations: alternation, concatenation, immediate value assignment, intersection, difference and complement. The last three have not been previously implemented and they increase the expressive power beyond context-free languages. The pattern matching actions are augmented with three options: trace, prefix and suffix. Comparisons with a SNOBOL4 implementation are also presented. This experiment demonstrates that Pascal with pattern matching is a useful tool for string processing applications.  相似文献   

15.
Pattern matching is one of the most performance-critical components for the content inspection based applications of network security, such as network intrusion detection and prevention. To keep up with the increasing speed network, this component needs to be accelerated by well designed custom coprocessor. This paper presents a parameterized multilevel pattern matching architecture (MPM) which is used on FPGAs. To achieve less chip area, the architecture is designed based on the idea of selected character decoding (SCD) and multilevel method which are analyzed in detail. This paper also proposes an MPM generator that can generate RTL-level codes of MPM by giving a pattern set and predefined parameters. With the generator, the efficient MPM architecture can be generated and embedded to a total hardware solution. The third contribution is a mathematical model and formula to estimate the chip area for each MPM before it is generated, which is useful for choosing the proper type of FPGAs. One example MPM architecture is implemented by giving 1785 patterns of Snort on Xilinx Virtex 2 Pro FPGA. The results show that this MPM can achieve 4.3 Gbps throughput with 5 stages of pipelines and 0.22 slices per character, about one half chip area of the most area-efficient architecture in literature. Other results are given to show that MPM is also efficient for general random pattern sets. The performance of MPM can be scalable near linearly, potential for more than 100 Gbps throughput. Supported by the National Natural Science Foundation of China (Grant No. 60803002), and the Excellent Young Scholars Research Fund of Beijing Institute of Technology  相似文献   

16.
于东  刘春花  田悦 《计算机应用》2016,36(2):455-459
针对从非结构化文本中抽取指定人物职衔履历属性问题,提出一种基于远距离监督和模式匹配的属性抽取方法。该方法从字符串模式和依存模式两个层面描述人物职衔履历特征,将问题分为两阶段。首先利用远距离监督知识和人工标注知识,挖掘具有高覆盖度的模式库,用于发现职衔履历属性和抽取候选集;其次利用职衔机构等属性间的文字接续关系,以及特定人物与候选属性的依存关系,设计候选集的过滤规则对候选项进行筛选,实现高准确度的属性抽取。实验结果显示,所提方法在CLP2014-PAE测试集上的F值达到55.37%,显著高于评测最好成绩(F值34.38%)和基于条件随机场(CRF)的有监督序列标注方法(F值43.79%),表明该方法能高覆盖度挖掘并抽取非结构化文档中的职衔履历属性。  相似文献   

17.
This paper proposes a flexible sequence alignment approach for pattern mining and matching in the recognition of human activities. During pattern mining, the proposed sequence alignment algorithm is invoked to extract out the representative patterns which denote specific activities of a person from the training patterns. It features high performance and robustness on pattern diversity. Besides, the algorithm evaluates the appearance probability of each pattern as weight and allows adapting pattern length to various human activities. Both of them are able to improve the accuracy of activity recognition. In pattern matching, the proposed algorithm adopts a dynamic programming based strategy to evaluate the correlation degree between each representative activity pattern and the observed activity sequence. It can avoid the trouble on segmenting the observed sequence. Moreover, we are able to obtain recognition results continuously. Besides, the proposed matching algorithm favors recognition of concurrent human activities with parallel matching. The experimental result confirms the high accuracy of human activity recognition by the proposed approach.  相似文献   

18.
The technique of searching for similar patterns among time series data is very useful in many applications. The problem becomes difficult when shifting and scaling are considered. We find that we can treat the problem geometrically and the major contribution of this paper is that a uniform geometrical model that can analyze the existing related methods is proposed. Based on the analysis, we conclude that the angle between two vectors after the Shift-Eliminated Transformation is a more intrinsical similarity measure invariant to shifting and scaling. We then enhance the original conical index to adapt to the geometrical properties of the problem and compare its performance with that of sequential search and R*-tree. Experimental results show that the enhanced conical index achieves larger improvement on R*-tree and sequential search in high dimension. It can also keep a steady performance as the selectivity increases. Part of the result related to the geometrical model has been published in the Proceedings of the 18th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp 237–248. Mi Zhou was born in China. He received his BS and MS degrees in computer science from the Northeastern University, China, in 1999 and 2002, respectively. He is currently pursuing the Ph D degree in the Computer Science and Engineering Department, The Chinese University of Hong Kong. His research interests include indexing of time series data, high-dimensional index, and sensor network. Man-Hon Wong received his BSc and MPhil degrees from The Chinese University of Hong Kong in 1987 and 1989 respectively. He then went to University of California at Santa Barbara where he got the PhD degree in 1993. Dr. Wong joined The Chinese University of Hong Kong in August 1993 as an assistant professor. He was promoted to associate professor in 1998. His research interests include transaction management, mobile databases, data replication, distributed systems, and computer and network security. Kam-Wing Chu was born in Hong Kong. He received his BS and MPhil degrees in computer science and engineering from The Chinese University of Hong Kong. When he was in Hong Kong, his research interests included database indexing of high dimensional data, and data mining. He later went to United States and received his MS degree in computer science from University of Maryland at College Park. While he was in Maryland, he focused on high performance implementation and algorithm design of advanced database systems. He is currently a senior software engineer in Server Performance group at Actuate Corporation. His expertise is in enterprise software development and software performance optimization.  相似文献   

19.
Bin packing problems are NP-hard combinatorial optimization problems of fundamental importance in several fields, including computer science, engineering, economics, management, manufacturing, transportation, and logistics. In particular, the non-guillotine version of the single-objective two-dimensional bin packing problem with rotations is a highly complex scheduling problem that consists in packing a set of items into the minimum number of bins, where items can be rotated 90° and are characterized by having different heights and widths. Recently, some authors have proposed multi-objective formulations that also consider additional objectives, such as the balancing the bin load in order to increase its stability. The load imbalance minimization, which depends on the distribution of the items packed in them, is a critical point in many real applications. This paper analyzes how to solve two-dimensional bin packing problems with rotations and load balancing using parallel and multi-objective memetic algorithms that apply a set of search operators specifically designed to solve this problem. Results obtained using a set of test problems show the good performance of parallel and multi-objective memetic algorithms in comparison with other methods found in the literature.  相似文献   

20.
一种用于内容过滤和检测的快速多关键词识别算法   总被引:13,自引:0,他引:13  
基于字符串匹配的检测方法是内容过滤和检测系统中一类很重要的分析方法,首先分析了现有的几种快速字符串匹配算法,然后提出了一种新的多模式字符串匹配算法,并简单分析了算法的复杂性,算法在设计的过程中吸取了BM算法中跳跃的特性,采用了后缀树算法得到了最大跳跃值,采用AC算法的匹配自动机原理从而避免对搜索树内每一个字符的匹配,最后,通过具体的实验数据验证了这些算法的性能,通过实验可以看出,新算法使得检测速度有很大提高,并有效屏蔽了关键词数量的增加对检测速度的影响。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号