Similar Documents
20 similar documents found
1.
Fabric Defect Detection Based on Two-Level Adaptive Orthogonal Wavelets   Cited by: 5 (self-citations: 0, citations by others: 5)
单亦杰  韩润萍 《微计算机信息》2007,23(3X):303-304,274
This paper proposes a fabric defect detection method based on a two-level adaptive orthogonal wavelet. The two-dimensional orthogonal wavelet transform of fabric texture images is first introduced; on that basis, drawing on the construction of Daubechies wavelets, the construction of the two-level adaptive orthogonal wavelet is described, and the image is decomposed with a two-level wavelet transform. The weft- and warp-direction sub-images of the defect-free image and the image under inspection are compared after the two-level decomposition to obtain weft- and warp-direction defect information, and the defect information from the two directions is finally fused to produce the detection result. Experiments show that the method is effective.
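As a rough illustration of the comparison-and-fusion step, here is a minimal sketch, assuming a standard orthogonal wavelet (db2) in place of the paper's adaptively constructed one; the direction-to-sub-band mapping and the threshold are illustrative assumptions, not the paper's values.

```python
import numpy as np
import pywt

def defect_map(reference, test, wavelet="db2"):
    # Two-level 2-D wavelet decomposition of both images.
    ref = pywt.wavedec2(reference, wavelet, level=2)
    tst = pywt.wavedec2(test, wavelet, level=2)
    # Level-2 detail sub-images: (horizontal, vertical, diagonal).
    ref_h, ref_v, _ = ref[1]
    tst_h, tst_v, _ = tst[1]
    # Weft / warp defect information: absolute difference between the
    # corresponding sub-images of the defect-free and inspected images.
    weft = np.abs(tst_h - ref_h)
    warp = np.abs(tst_v - ref_v)
    # Fuse the two directions (simple maximum here) and threshold.
    fused = np.maximum(weft, warp)
    return fused > fused.mean() + 3 * fused.std()
```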

2.
刘斌  彭嘉雄 《计算机工程》2007,33(10):25-27
A construction method for a new class of two-dimensional two-channel wavelets is proposed and applied to image fusion. A fusion algorithm is presented that fuses the high-frequency sub-images using the gradient maps of the low-frequency sub-images obtained after wavelet decomposition, and the fused images are evaluated with metrics such as entropy and root-mean-square error. Experimental results show that the method yields good visual quality; its fusion performance is better than that of a tensor-product four-channel wavelet fusion method using the same fusion algorithm, while saving 50% of the computation.
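A minimal sketch of the gradient-guided selection rule, assuming the decomposition has already produced matching low-frequency approximations (lowA, lowB) and high-frequency sub-images (highA, highB) of equal shape; the take-the-stronger-gradient rule here is one plausible reading of the abstract, not the paper's exact algorithm.

```python
import numpy as np

def fuse_high(lowA, lowB, highA, highB):
    # Gradient magnitude of each low-frequency sub-image.
    gA = np.hypot(*np.gradient(lowA))
    gB = np.hypot(*np.gradient(lowB))
    # Where source A has the stronger low-frequency gradient,
    # take its high-frequency coefficient; otherwise take B's.
    mask = gA >= gB
    return np.where(mask, highA, highB)
```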

3.
Multiwavelets extend wavelet theory and offer advantages in image processing that single wavelets lack, providing a more precise analysis than wavelet multiresolution analysis. After studying the correlations among sub-bands at the same scale in the multiwavelet transform domain, the correlations between neighboring coefficients within a sub-band, and the concentration of energy in the low-frequency part, a fusion method based on features of the discrete multiwavelet transform domain is proposed and used to fuse medical brain CT and MR images of different modalities. Compared with fusion methods in the traditional wavelet domain, the method not only preserves the information of each source image and merges their details well, but also yields fused images with better visual quality and better quantitative indicators, demonstrating a better fusion effect.

4.
The move to publish documents electronically has several significant advantages for publishers and consumers, including the elimination of printing costs, paper costs, warehousing and transport of material, and the lag between release and delivery to the customer. There are also inherent dangers in electronic publishing, as an unlimited number of perfect reproductions of the original can be made and distributed, depriving the publisher and author of revenue. While prevention of copying is preferred, it seems impractical once documents appear in digital form. In this paper we describe a method for digitally fingerprinting documents so that the publisher can distribute a unique copy to each customer. When a suspected illegal copy of a document is found, the publisher can determine which user's copy was used. As long as the illegal copy is identical to one of the originals, this is a straightforward process of comparison. A more serious problem arises when the attacker tries to hide the identity of the original by distorting the document (changing, adding, or deleting segments, etc.); in this situation straightforward comparison may not be effective, and we may instead want to find the original document closest to the illegal copy, or determine whether a document is largely based on another. We describe a method based on comparing the dictionaries generated by the LZW compression algorithm, which allows very rapid comparison of documents in the presence of changes made to prevent detection (distortion). While the primary application is text documents, similar techniques can be applied to software and images. Published online: 25 July 2001
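A minimal sketch of the dictionary-comparison idea, assuming documents are plain strings: build the LZW phrase dictionary for each document and measure the Jaccard overlap of the phrase sets. The similarity measure is an illustrative choice; the paper does not specify it here.

```python
def lzw_dictionary(text):
    # Start from the single characters, as in standard LZW.
    table = set(text)
    phrase = ""
    for ch in text:
        if phrase + ch in table:
            phrase += ch
        else:
            table.add(phrase + ch)  # new dictionary entry
            phrase = ch
    return table

def dictionary_similarity(doc_a, doc_b):
    da, db = lzw_dictionary(doc_a), lzw_dictionary(doc_b)
    return len(da & db) / len(da | db)  # Jaccard overlap of phrase sets
```

Because LZW phrases survive local edits elsewhere in the document, the overlap stays high even when an attacker distorts parts of the copy.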

5.
Web Information Retrieval Based on XML and the N-Level VSM   Cited by: 2 (self-citations: 0, citations by others: 2)
XML documents are well formed and clearly layered, so their structure can be conveniently manipulated and analyzed. After converting HTML documents on the Web into XML, the paper analyzes a document's hierarchical structure through Java's DOM tree, divides the document into hierarchical text segments, and improves the traditional VSM algorithm by converting each text segment into a space vector, implementing an N-level VSM algorithm. Experiments show that the improved algorithm achieves better recall and precision than the traditional VSM algorithm.
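A minimal sketch of scoring a query against hierarchical text segments, assuming the document has already been split into per-level segment lists; the per-level weights and the best-segment aggregation are illustrative assumptions, not the paper's formula.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def n_level_score(query, layers, weights):
    # layers: list of segment lists, one list per hierarchy level.
    score = 0.0
    for segments, w in zip(layers, weights):
        vec = TfidfVectorizer().fit(segments + [query])
        seg_vecs = vec.transform(segments)
        q_vec = vec.transform([query])
        # Each level contributes its best-matching segment's similarity.
        score += w * cosine_similarity(q_vec, seg_vecs).max()
    return score
```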

6.
The latent semantic model is analyzed, the representation of texts in the latent semantic space is studied, and a retrieval strategy for large text collections is proposed. Retrieval proceeds in two steps: coarse-grained elimination of non-relevant texts, followed by precise retrieval among the relevant ones. The latent semantic space model performs an initial screening that removes non-relevant texts; a large-scale text retrieval method then retrieves from the relevant texts precisely at the paragraph level, with a genetic algorithm introduced into the retrieval algorithm to improve execution efficiency; the serial numbers of the candidate paragraphs are output. Experimental results demonstrate the effectiveness and efficiency of the approach.
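A minimal sketch of the coarse filtering step with latent semantic analysis (truncated SVD over TF-IDF vectors); the component count and similarity threshold are illustrative, and the paragraph-level fine retrieval with the genetic algorithm is not shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def coarse_filter(query, docs, k=100, threshold=0.1):
    tfidf = TfidfVectorizer()
    X = tfidf.fit_transform(docs)
    svd = TruncatedSVD(n_components=k)
    doc_lsa = svd.fit_transform(X)                 # documents in latent space
    q_lsa = svd.transform(tfidf.transform([query]))
    sims = cosine_similarity(q_lsa, doc_lsa)[0]
    # Keep only the documents that survive the coarse relevance screen.
    return [d for d, s in zip(docs, sims) if s >= threshold]
```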

7.
This paper proposes a construction method for a new class of wavelets, namely non-tensor-product wavelets with compact support, linear phase, and orthogonality and with dilation matrix diag(2, 2), and applies them to multi-focus image fusion. Based on non-tensor-product wavelet theory, a method is first given for designing 4-channel 6×6 symmetric non-tensor-product wavelet filter banks; several groups of compactly supported, linear-phase, orthogonal 6×6 filter banks are designed with it and used to filter the images to be fused. The images are then fused with the three fusion rules commonly used in tensor-product wavelet image fusion. The study finds that whichever of the three rules is applied to multi-focus images, the fusion method based on the non-tensor-product wavelets constructed here achieves good results. As an example, the decomposed coefficient images are fused by taking, for both the low-frequency and the high-frequency parts, the coefficient with the larger local-window activity measure, and the image is then reconstructed. The fused images are evaluated with metrics such as entropy and root-mean-square error. Experimental results show that the method fuses multi-focus images well, and its performance is better than that of fusion methods using the same fusion algorithm based on tensor-product Haar wavelets or on asymmetric non-tensor-product wavelets.
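A minimal sketch of the "take the coefficient with the larger local-window activity" rule, assuming coefficient sub-images cA and cB of equal shape; local energy in a mean window stands in for the paper's activity measure, and the window size is illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def activity_max_fuse(cA, cB, window=3):
    # Local activity: windowed mean of squared coefficients (local energy).
    actA = uniform_filter(cA ** 2, size=window)
    actB = uniform_filter(cB ** 2, size=window)
    # Keep, pixel by pixel, the coefficient from the more active source.
    return np.where(actA >= actB, cA, cB)
```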

8.
Encoding-Based Storage of XML in Relational Databases   Cited by: 2 (self-citations: 0, citations by others: 2)
How to use relational database technology to store and query XML data effectively has become a research focus in the development of XML. A storage method based on pre-order and post-order encoding is proposed; its schema-mapping approach allows XML documents based on different DTDs (or schemas) to be stored in the same relational table, supports fast XML path queries, and reconstructs XML documents efficiently. The method's handling of recursive schemas is also discussed. Experiments show that, compared with the XML relational storage methods proposed in XRel and by Florescu and Kossman et al., the method shortens the response time of complex XML path queries, such as path queries with conditional predicate constraints.
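A minimal sketch of pre/post-order encoding, assuming an ElementTree input and a hypothetical table(tag, pre, post, depth). With this numbering, x is an ancestor of y iff pre(x) < pre(y) and post(x) > post(y), which is what lets a relational table answer path steps with simple range predicates.

```python
import xml.etree.ElementTree as ET

def encode(root):
    rows, pre, post = [], [0], [0]

    def visit(node, depth):
        my_pre = pre[0]; pre[0] += 1          # assigned on entry
        for child in node:
            visit(child, depth + 1)
        my_post = post[0]; post[0] += 1       # assigned on exit
        rows.append((node.tag, my_pre, my_post, depth))

    visit(root, 0)
    return rows  # e.g. INSERT each row INTO xml_nodes(tag, pre, post, depth)

tree = ET.fromstring("<a><b><c/></b><d/></a>")
for row in encode(tree):
    print(row)   # ('c', 2, 0, 2), ('b', 1, 1, 1), ('d', 3, 2, 1), ('a', 0, 3, 0)
```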

9.
Liu  Mengchi  Ling  Tok Wang 《World Wide Web》2001,4(1-2):49-77
Most documents available over the Web conform to the HTML specification. Such documents are hierarchically structured in nature. The existing data models for the Web either fail to capture the hierarchical structure within the documents or can only provide a very low level representation of such hierarchical structure. How to represent and query HTML documents at a higher level is an important issue. In this paper, we first propose a novel conceptual model for HTML. This conceptual model has only a few simple constructs but is able to represent the complex hierarchical structure within HTML documents at a level that is close to human conceptualization/visualization of the documents. We also describe how to convert HTML documents based on this conceptual model. Using the conceptual model and conversion method, one can capture the essence (i.e., semistructure) of HTML documents in a natural and simple way. Based on this conceptual model, we then present a rule-based language to query HTML documents over the Internet. This language provides a simple but very powerful way to query both intra-document structures and inter-document structures and allows the query results to be restructured. Being rule-based, it naturally supports negation and recursion and therefore is more expressive than SQL-based languages. A logical semantics is also provided.

10.
To address similarity computation and plagiarism detection for documents written in Uyghur, a content-based Uyghur plagiarism detection (U-PD) method is proposed. First, a preprocessing stage performs word segmentation, stop-word removal, stemming, and synonym substitution on the Uyghur text, with stemming implemented on an N-gram statistical model. Then the hash value of each text block is computed with the BKDR hash algorithm to build the hash fingerprint of the whole document. Finally, based on the hash fingerprints, the document is matched against the document library at the document, paragraph, and sentence levels with the RKR-GST matching algorithm to obtain a document similarity, on which the plagiarism decision is made. Experimental evaluation on Uyghur documents shows that the method detects plagiarized documents accurately and is feasible and effective.
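A minimal sketch of the fingerprinting step: BKDR hashing of each text block, with one hash per block forming the document fingerprint. The seed 131 is the classic BKDR choice; the 32-bit mask and the block granularity are illustrative assumptions.

```python
def bkdr_hash(text, seed=131):
    h = 0
    for ch in text:
        h = (h * seed + ord(ch)) & 0xFFFFFFFF  # keep it a 32-bit value
    return h

def fingerprint(blocks):
    # One hash per preprocessed text block (document, paragraph, or
    # sentence level); together they form the document's fingerprint.
    return [bkdr_hash(b) for b in blocks]
```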

11.
This paper presents a multi-level matching method for document retrieval (DR) using a hybrid document similarity. Documents are represented by a multi-level structure comprising the document level and the paragraph level. This multi-level structured representation is designed to model underlying semantics in a more flexible and accurate way than conventional flat term histograms can manage. Matching between documents is then transformed into an optimization problem using the Earth Mover's Distance (EMD). A hybrid similarity synthesizes the global and local semantics in documents to improve retrieval accuracy. An extensive experimental study and verification suggests that the proposed method works well for lengthy documents with evident spatial distributions of terms.
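A minimal sketch of EMD-based matching at the paragraph level, assuming the POT library (pip install pot) and precomputed paragraph vectors for each document; uniform paragraph weights and the ground-distance choice are illustrative, and the paper's hybrid global/local similarity is not shown.

```python
import numpy as np
import ot  # Python Optimal Transport

def emd_distance(paras_a, paras_b):
    # paras_a: (m, d) array of paragraph vectors; paras_b: (n, d).
    a = np.full(len(paras_a), 1 / len(paras_a))  # uniform paragraph weights
    b = np.full(len(paras_b), 1 / len(paras_b))
    M = ot.dist(paras_a, paras_b)                # pairwise ground distances
    return ot.emd2(a, b, M)                      # minimal transport cost
```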

12.
The efficiency of an image enhancement algorithm depends on the quality and speed of the enhancement. Many algorithms implement image enhancement with wavelet theory, and they have one thing in common: they all capture image detail by decomposing low-frequency sub-images. In fact, a great deal of detail is also found in the high-frequency sub-images. Motivated by this, a novel medical image enhancement method based on wavelet decomposition is proposed that adds detail from the high-frequency sub-images and decomposes the image with an anti-symmetric biorthogonal wavelet instead of traditional wavelets. It not only improves the enhancement but also avoids heavy computation, achieving faster speed and satisfying real-time requirements in edge detection. Simulation experiments on mammographic images, implemented in Matlab with several different methods, show that the proposed method is superior to popular methods such as histogram equalization and wavelet nonlinear enhancement.
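A minimal sketch of enhancement by amplifying the high-frequency sub-images, using pywt's biorthogonal family ('bior2.2') as a stand-in for the paper's anti-symmetric biorthogonal wavelet; the gain and decomposition level are illustrative.

```python
import pywt

def enhance(image, gain=1.8, wavelet="bior2.2"):
    coeffs = pywt.wavedec2(image, wavelet, level=2)
    # Keep the approximation, scale every detail sub-image by `gain`.
    boosted = [coeffs[0]] + [
        tuple(gain * d for d in level) for level in coeffs[1:]
    ]
    return pywt.waverec2(boosted, wavelet)
```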

13.
An Effective Image Retrieval Technique Based on Wavelet Information Distribution Entropy   Cited by: 1 (self-citations: 0, citations by others: 1)
After analyzing the shortcomings of wavelet-domain image indexing techniques, a new image retrieval method based on wavelet information distribution entropy is proposed. The image is first partitioned into several sub-images, a three-level wavelet transform is applied to each, and the resulting sub-band images are processed to obtain texture images, greatly reducing computational complexity. Finally, using the wavelet texture histogram as the probability density function, the information distribution entropy of each sub-image is computed, which both gives the image features a compact representation and greatly speeds up retrieval. Experimental results show that the method is effective for image retrieval.
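A minimal sketch of the entropy feature: a histogram over the wavelet detail coefficients of a sub-image serves as the probability density, and its Shannon entropy is the feature value. The wavelet choice and bin count are illustrative.

```python
import numpy as np
import pywt

def subband_entropy(sub_image, wavelet="db2", bins=64):
    coeffs = pywt.wavedec2(sub_image, wavelet, level=3)
    # Concatenate all detail coefficients as the texture signature.
    details = np.concatenate([np.abs(d).ravel()
                              for level in coeffs[1:] for d in level])
    hist, _ = np.histogram(details, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins before the log
    return -np.sum(p * np.log2(p))     # information distribution entropy
```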

14.

Text document clustering separates a collection of documents into clusters such that documents within a cluster are substantially similar and distinct from documents in other clusters. The high-dimensional, sparse document-term matrix reduces the efficiency of the clustering process. This study proposes a new way of clustering documents using a domain ontology and the WordNet ontology, with the main objective of increasing cluster output quality, and investigates feature-dimension selection to reduce the features of the document-term matrix. Sports documents are clustered using conventional K-Means with a dimension-reduction (DR) feature selection process and using density-based clustering. A novel approach, ontology-based document clustering, is proposed for grouping text documents and was developed in three critical steps. First, the data is pre-processed and the DR feature set is reduced with Info-Gain selection; the documents are clustered with two methods, K-Means and density-based clustering with the DR feature selection process, which validate the findings of ontology-based clustering and are compared with it using measurement metrics. Second, a sports-domain ontology is developed, describing the principles of and relationships between terms in sports-related documents; the ontology is validated with a semantic-web reasoning process. An algorithm for retrieving synonyms of the sports-domain ontology terms is proposed and implemented: the terms retrieved from the documents and the sport ontology concepts are mapped to the synonym sets retrieved from the WordNet ontology, and the proposed technique is based on the synonyms of the mapped concepts. The ontology approach then uses the reduced feature set to cluster the text documents. The results are compared with the two traditional approaches on two datasets; the ontology-based clustering approach is found to cluster documents with high precision, recall, and accuracy. The study also compares different RDF serialization formats for the sports ontology.
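A minimal sketch of the DR-plus-K-Means baseline: TF-IDF features are reduced with an information-gain-style score (mutual information here) before clustering. Because Info-Gain selection needs class labels, a small labeled seed set is assumed for the selection step; k and the feature budget are illustrative, and the ontology mapping is not shown.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.cluster import KMeans

def cluster_docs(docs, labeled_docs, labels, k=5, n_features=500):
    vec = TfidfVectorizer(stop_words="english").fit(docs + labeled_docs)
    # Fit the Info-Gain-like selector on the labeled seed set only.
    selector = SelectKBest(mutual_info_classif, k=n_features).fit(
        vec.transform(labeled_docs), labels)
    X = selector.transform(vec.transform(docs))
    return KMeans(n_clusters=k, n_init=10).fit_predict(X)
```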


15.
Doing exhaustive relevance judgments is one of the most challenging tasks in constructing an IR test collection, especially when the collection contains millions of documents. Pooling (or system pooling), essentially a method for selecting which documents to assess, is a solution to this challenge. In this paper, to form such an assessment pool, a new rank-based document selection criterion, called the expected level of importance (ELI), is introduced. Experiments using TREC 5, 6, 7, and 8 data showed that with a pool in which the documents are sorted in decreasing order of their calculated ELI scores, relevance judgments can be made efficiently with minimal human effort, while maintaining the size and effectiveness of the resulting test collection. The proposed criterion can be adapted directly to traditional TREC pooling practice in favor of efficiency, at no additional cost.

16.
17.
A Digital Watermarking Method Based on Text-Line and Diagonal-Profile Features   Cited by: 5 (self-citations: 0, citations by others: 5)
A watermarking method based on the line and diagonal-profile features of text documents is proposed. The method embeds watermark information by exploiting the line structure of text, which greatly increases watermark capacity; extraction needs only the diagonal profiles of the lines and does not require the original text as a reference. The diagonal-profile features can also be used to check the integrity of the text. Experiments show that the method is robust to typical cropping and noise attacks.

18.
PCCS Partial Clustering and Classification: A Fast Web Document Clustering Method   Cited by: 16 (self-citations: 1, citations by others: 15)
PCCS is a partial clustering-then-classification method for quickly clustering Web documents, designed to help Web users filter the documents they need out of the large number of document snippets returned by a search engine. It first clusters a subset of the documents, then builds a class model from the clustering result to classify the remaining documents: an interactive, improve-one-cluster-at-a-time procedure quickly creates a set of cluster exemplars, and the remaining documents are assigned with a Naive Bayes classifier. To make clustering and classification more efficient, a hybrid feature selection method is proposed to reduce the dimensionality of the document representation: the entropy of each feature in the documents is recomputed and the features with the largest entropy values are selected, or features are selected from the feature set of a persistent classification model. Experiments show that the partial clustering method organizes Web documents by topic quickly and accurately, letting users view search engine results at a higher topic level and pick relevant documents from clusters of topically similar ones.
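A minimal sketch of the cluster-then-classify idea: cluster a small sample, use its cluster labels as classes to train a Naive Bayes model, and classify the rest. K-Means stands in for the paper's interactive exemplar-refinement procedure, and the sample size and k are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.naive_bayes import MultinomialNB

def pccs(docs, sample_size=200, k=10):
    X = TfidfVectorizer(stop_words="english").fit_transform(docs)
    sample, rest = X[:sample_size], X[sample_size:]
    # Step 1: cluster only the sample of documents.
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(sample)
    # Step 2: treat cluster labels as classes and classify the remainder.
    nb = MultinomialNB().fit(sample, labels)
    return labels, nb.predict(rest)
```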

19.
An automatic classification method for original documents based on image features and a layout analysis method based on a rule hypothesis tree are both proposed. An intelligent document-filling system that digitizes the original documents is then designed, suitable for cellphones and tablets. When users fill in documents online, information can be entered into the financial information system automatically merely by photographing the original documents. This not only saves time but also keeps the online data consistent with the original documents. Experiments show that document classification accuracy is 88.38%, document-filling accuracy is 87.22%, and processing one document takes 5.042 seconds. The system can be applied in finance, government, libraries, electric power, enterprises, and many other industries, and has high economic and application value.

20.
Personalization is increasingly vital, especially for enterprises seeking to reach their customers. The key challenge in supporting personalization is the need for rich metadata, such as metadata about structural relationships, subject/concept relations between documents, and cognitive metadata about documents (e.g. the difficulty of a document). Manual annotation of large knowledge bases with such rich metadata is not scalable, and automatic mining of cognitive metadata is challenging, since the underlying intellectual content of a document is very difficult to assess automatically. At the same time, Web content is increasingly multilingual, as a growing share of the data generated on the Web is non-English. Current metadata extraction systems are generally based on English content, and this needs to change to keep pace with the Web's evolving dynamics. To alleviate these problems, we introduce a novel automatic metadata extraction framework, based on a novel fuzzy method for automatic cognitive metadata generation, that uses different document parsing algorithms to extract rich metadata from multilingual enterprise content using the newly developed DocBook, Resource Type, and Topic ontologies. Since the metadata generation process is built on DocBook-structured enterprise content, the framework focuses on enterprise documents and content loosely based on DocBook-style formatting; DocBook is a common format for formally produced corporate documentation and is adopted by many enterprises. The proposed framework is illustrated and evaluated on the English, German, and French versions of the Symantec Norton 360 knowledge bases. A user study showed that the proposed fuzzy-based method generates reasonably accurate values, with an average precision of 89.39% on the metadata values of document difficulty, document interactivity level, and document interactivity type. The proposed fuzzy inference system achieves improved results compared to a rule-based reasoner for difficulty metadata extraction (∼11% enhancement). In addition, user-perceived metadata quality scores (mean of 5.57 out of 6) were high, and automated metadata analysis showed that the extracted metadata is of high quality and suitable for personalized information retrieval.
