Similar Documents
1.
Research on Transactional Data Management Techniques in Emerging Non-Volatile Memory Environments   (Total citations: 1; self-citations: 1; external: 0)
Database systems have gone through multiple rounds of evolution to adapt to changes in the underlying storage architecture. In the big data era, emerging non-volatile memory (NVM) devices, characterized by non-volatility, large capacity, low latency, and byte addressability, are bound to have a major impact on database systems, and the associated storage and transaction-processing techniques deserve particular attention. This paper first reviews the history and trends of transactional database systems as storage environments have evolved; it then surveys and analyzes the non-volatile storage technologies that shape the design of data management systems, together with transaction techniques optimized for big data applications and hardware environments; finally, it looks ahead to the challenges and research trends facing transactional databases in non-volatile memory environments.

2.
To meet the basic requirements of mobile billing applications, large volumes of base data must be kept resident in memory. This paper analyzes the causes of data corruption in memory and, drawing on the CRC technique used in data communications, explores attaching a checksum to each data segment to verify the consistency of in-memory data. Combined with the logging and checkpointing techniques of traditional databases, it presents a basic method for detecting and preventing in-memory data corruption.
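
A minimal sketch of the per-segment checksum idea described above, using Python's zlib.crc32; the GuardedSegment class and its fields are illustrative inventions, not the paper's implementation. In the scheme the abstract describes, a verification failure would trigger recovery from the log and the latest checkpoint rather than an assertion.

import zlib

class GuardedSegment:
    """In-memory data segment guarded by a CRC32 checksum (illustrative names)."""
    def __init__(self, payload: bytes):
        self.payload = payload
        self.crc = zlib.crc32(payload)   # checksum stored alongside the data

    def write(self, payload: bytes):
        self.payload = payload
        self.crc = zlib.crc32(payload)   # recompute on every legitimate update

    def verify(self) -> bool:
        # Detects silent corruption: any bit flip changes the CRC with high probability.
        return zlib.crc32(self.payload) == self.crc

seg = GuardedSegment(b"subscriber=A;balance=42.00")
assert seg.verify()
seg.payload = b"subscriber=A;balance=99.99"  # corruption bypassing write()
assert not seg.verify()                      # checksum mismatch flags the inconsistency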

3.
Memory forensics is an important branch of computer forensics, and extracting process and thread information from a memory image file is both a key task and a difficult one. Targeting Microsoft's latest operating system platform at the time, Windows 8, this paper studies methods for recovering processes and threads. Reverse-engineering analysis is applied to the kernel data structures related to processes and threads under Windows 8 to extract their characteristic signatures; based on these signatures, an algorithm is proposed that recovers the current process and thread information of the system from a physical memory image file. Experimental results and analysis show that the algorithm successfully extracts both hidden and non-hidden processes, together with the thread information of each process, providing a reliable data foundation for memory forensic analysis.
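
A heavily simplified sketch of the signature-scanning idea described above. The pool tag and field offsets below are invented placeholders (the real Windows 8 structure layouts are version-specific and must be recovered by the kind of reverse engineering the paper performs):

import struct

POOL_TAG = b"Proc"      # assumed tag marking process pool allocations
PID_OFF  = 0x180        # assumed offset of the process-ID field
NAME_OFF = 0x2E0        # assumed offset of the 15-byte image-name field

def scan_processes(image_path):
    with open(image_path, "rb") as f:
        dump = f.read()
    hits = []
    pos = dump.find(POOL_TAG)
    while pos != -1:
        base = pos + len(POOL_TAG)          # candidate object start
        if base + NAME_OFF + 15 <= len(dump):
            pid  = struct.unpack_from("<Q", dump, base + PID_OFF)[0]
            name = dump[base + NAME_OFF : base + NAME_OFF + 15].split(b"\0")[0]
            hits.append((pid, name.decode("ascii", "replace")))
        pos = dump.find(POOL_TAG, pos + 1)
    return hits

Because the scan walks raw physical memory rather than the kernel's linked process list, processes that have been unlinked (hidden) are found as well, which is why signature scanning can recover hidden processes.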

4.
Processing and displaying large images with conventional techniques tends to block the system when the interface is redrawn. This paper proposes an improved triple-buffering technique. First, a memory region compatible with the image's properties is allocated to hold the image data; then a second region, compatible with the display device context, is allocated for executing the remaining drawing instructions; finally, the graphics in memory are copied in turn to the actual display device context by bit-block transfer. After output, only the region compatible with the display device context is released, while the image-compatible region is retained, so that the image need not be reloaded on the next redraw, eliminating the blocking. The technique was validated and applied in a digital core software system for rendering borehole log diagrams, with good results.
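
A minimal sketch of the buffer flow described above, with a generic Surface class standing in for platform objects such as image- and display-compatible memory device contexts; all names and sizes are illustrative:

class Surface:
    """Illustrative stand-in for a device-compatible pixel buffer (e.g., a memory DC)."""
    def __init__(self, w, h):
        self.w, self.h = w, h
        self.pixels = bytearray(w * h * 4)

def blit(dst, src):
    """Bit-block transfer: copy the overlapping bytes from src into dst."""
    n = min(len(dst.pixels), len(src.pixels))
    dst.pixels[:n] = src.pixels[:n]

def load_big_image():
    return Surface(4096, 4096)        # expensive decode, performed only once

IMAGE_CACHE = load_big_image()        # buffer 1: image-compatible, retained across redraws

def on_redraw(screen):
    back = Surface(screen.w, screen.h)   # buffer 2: display-compatible scratch area
    blit(back, IMAGE_CACHE)              # reuse the cached image, no reload
    # ... further drawing commands would render into `back` here ...
    blit(screen, back)                   # buffer 3: one final transfer to the display
    # `back` is released on return; IMAGE_CACHE is deliberately kept alive

on_redraw(Surface(1920, 1080))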

5.
Research and Progress in Memory Forensics   (Total citations: 1; self-citations: 0; external: 1)
Zhang Yu, Liu Qingzhong, Li Tao, Wu Lihua, Shi Chun. Journal of Software, 2015, 26(5): 1151-1172
As network attacks become memory-resident and cybercrime becomes stealthier, some key digital evidence exists only in physical memory or resides temporarily in page-swap files, which traditional file-system-based computer forensics cannot handle effectively. Memory forensics, an important complement to traditional file-system forensics and a key component of computer forensic science, comprehensively acquires and exhaustively analyzes memory data and, on that basis, extracts digital evidence related to network attacks or cybercrime. In recent years memory forensics has earned sustained attention from the security community, made considerable progress, and seen wide application, playing an irreplaceable role in network incident response and cybercrime investigation. This paper first reviews the origin and evolution of memory forensics research; next it introduces the key memory-management mechanisms of operating systems; it then examines data-acquisition and analysis methods for memory forensics and summarizes the latest research techniques; finally, it discusses open problems, development trends, and directions for further research.

6.
With the rapid development of scientific computing and artificial intelligence, parallel computing in distributed environments has become an important means of solving large-scale computation and data-processing problems. Growing memory capacity and the wide use of iterative algorithms have matured in-memory computing technologies, represented by Spark. However, mainstream distributed memory models and frameworks struggle to combine ease of use with computational performance, and fall short in data-format definition, memory allocation, and memory-use efficiency. This paper proposes a parallel computing method based on distributed datasets that optimizes in-memory computing from the perspectives of both model theory and system overhead. Theoretically, the computation process is modeled and analyzed to address Spark's limited expressiveness in scientific-computing settings, and a cost model of the framework is given to support subsequent performance optimization. At the system level, a framework-level memory optimization method is proposed, comprising the restructuring of cross-language distributed in-memory datasets, the management of distributed shared memory, and the optimization of message passing. Experimental results show that a parallel computing framework built on this method markedly improves dataset memory-allocation efficiency, reduces serialization/deserialization overhead, and relieves memory pressure; the execution time of application benchmarks is 69% to 92% lower than Spark's.

7.
The era of exploding big data has produced many new types of business, and business data drives the innovative, iterative development of transaction management systems. Constrained by traditional persistent media, conventional transaction management systems cannot execute transactions efficiently, and the extra overhead of resolving transaction conflicts still limits their throughput. The commercial adoption of new hardware injects more possibilities into transaction management systems and has attracted wide attention in both academia and industry. Hardware transactional memory offers hardware-level conflict detection, and, compared with solid-state drives, the byte addressability and persistence of non-volatile memory can significantly reduce transaction latency and improve system performance. However, existing transaction management techniques cannot fully exploit the performance gains the hardware itself provides, so the transaction architecture must be redesigned. This paper first reviews and analyzes transaction management systems in new hardware environments; it then summarizes current technical approaches to building them on new hardware, clarifying the strengths and weaknesses of systems based on hardware transactional memory and non-volatile storage; finally, it points out possible future directions and new challenges for transaction management systems on emerging hardware.

8.
Supported by new hardware such as multi-core processors, large memory, and non-volatile memory, heterogeneous storage and computing platforms have become the mainstream high-performance platforms. Traditional database engines use a monolithic design, while emerging databases adopt storage-compute separation and operator pushdown to better fit new distributed storage architectures. This paper proposes a novel in-memory database implementation based on a management-compute-storage separation approach. Building on storage-compute separation, the dataset is further divided, according to the database schema, data distribution, and workload characteristics, into a metadata set and a value set, and the unified query engine is decomposed into a metadata management engine, a compute engine, and a storage engine. Semantically rich metadata management is abstracted into an independent management layer, while semantics-free value storage and computation form a compute-storage layer, with compute-intensive workloads assigned to the compute layer and data-intensive workloads to the storage layer; the two layers are separated or merged depending on the hardware platform. The implementation techniques of the in-memory database are organized in several layers: 1) schema optimization, separating values from metadata in database storage and choosing storage and computation strategies according to the intrinsic properties of the data; 2) model optimization, adopting the Fusion OLAP model to achieve high-performance multidimensional computation on a relational storage model; 3) algorithm optimization, using surrogate-key indexes and vector indexes to support optimized vector join and vector aggregation algorithms that improve OLAP performance; 4) system-design optimization, using a layered database-engine design to achieve management-compute separation and storage-compute separation, and...
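
A toy sketch of the surrogate-key vector join and aggregation named in point 3: dense surrogate keys turn the dimension "join" into direct array indexing instead of a hash probe. All table and column names are invented for illustration:

fact_skey  = [0, 2, 1, 2, 0, 1]          # fact column: surrogate keys into the dimension
fact_sales = [10, 25, 7, 30, 5, 12]      # fact column: measure values
dim_region = ["north", "south", "east"]  # dimension attribute, indexed by surrogate key

# Vectorized join: map each fact row to its dimension attribute by direct indexing.
joined = [dim_region[k] for k in fact_skey]

# Vector aggregation: group the measure by the joined attribute.
totals = {}
for region, v in zip(joined, fact_sales):
    totals[region] = totals.get(region, 0) + v
print(totals)   # {'north': 15, 'east': 55, 'south': 19}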

9.
Concurrent scheduling of in-memory data in multi-core environments can reduce machine crashes and data-switching time, improve scheduling precision, and make data operations more stable. Existing methods schedule in-memory data concurrently with PrebuiltTrigger but set no explicit scheduling objective, leaving the data in the in-memory database disordered and the scheduling precision low. This paper therefore proposes an optimized concurrent scheduling method for in-memory data in Linux-based multi-core environments. The method first uses the IACT algorithm to clean both the data that interferes with scheduling and the similar or duplicate records in the in-memory database; it then applies a heuristic algorithm to the cleaned data for feature selection, computes the attribute-weight set of the optimal scheduling path according to multi-attribute decision theory, and from these results computes the deviation of each candidate path; finally, using the minimum deviation, it builds a linear-programming model of the optimal path, ranks the composite decision-attribute value of each scheduling path, and thereby obtains the optimal path, completing the concurrent scheduling of in-memory data in the multi-core environment. Experimental results show that the method schedules in-memory data efficiently, improves scheduling precision, and increases the reusability of in-memory data, supporting low-overhead in-memory data scheduling.
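
A minimal sketch of the weighted multi-attribute path-ranking step described above; the attribute names, weights, ideal point, and scores are invented, and the IACT cleaning and linear-programming stages are not reproduced:

paths = {
    "path_a": {"latency": 0.2, "load": 0.7, "locality": 0.9},
    "path_b": {"latency": 0.5, "load": 0.3, "locality": 0.6},
    "path_c": {"latency": 0.1, "load": 0.9, "locality": 0.4},
}
weights = {"latency": 0.5, "load": 0.2, "locality": 0.3}  # assumed attribute weights
ideal   = {"latency": 0.0, "load": 0.0, "locality": 1.0}  # assumed ideal point

def deviation(attrs):
    # Weighted deviation from the ideal point; the smallest deviation wins.
    return sum(w * abs(attrs[k] - ideal[k]) for k, w in weights.items())

ranked = sorted(paths, key=lambda p: deviation(paths[p]))
print(ranked[0], "is the selected scheduling path")   # path_a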

10.
A Survey of In-Memory Computing Technology   (Total citations: 4; self-citations: 3; external: 1)
Luo Le, Liu Yi, Qian Depei. Journal of Software, 2016, 27(8): 2147-2167
In the big data era, efficiently processing massive data to meet performance requirements is an important problem. In-memory computing makes full use of large memory capacity for data processing, reducing or even eliminating I/O operations and thus greatly improving the performance of massive-data processing, while also facing a series of open problems. This paper first classifies in-memory computing technologies based on an analysis of their characteristics, and introduces the principles, research status, and hot topics of each class of technique and system; it then analyzes typical applications of in-memory computing; finally, it examines the challenges facing in-memory computing at both the overall and application levels and offers an outlook on its prospects.

11.
12.
Time series analysis has always been an important and interesting research field due to its frequent appearance in different applications. In the past, many approaches based on regression, neural networks and other mathematical models were proposed to analyze time series. In this paper, we attempt to use data mining techniques to analyze time series. Many previous studies on data mining have focused on handling binary-valued data, whereas time series data are usually quantitative values. We therefore extend our previous fuzzy mining approach to handle time-series data and find linguistic association rules. The proposed approach first uses a sliding window to generate continuous subsequences from a given time series and then analyzes the fuzzy itemsets derived from these subsequences. Appropriate post-processing is then performed to remove redundant patterns. Experiments are also made to show the performance of the proposed mining algorithm. Since the final results are represented by linguistic rules, they are friendlier to humans than a quantitative representation.
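
A small sketch of the sliding-window step together with a simple triangular fuzzification into linguistic terms; the window length, term names, and membership functions are assumptions, not the authors' exact definitions:

series = [12, 15, 14, 18, 22, 21, 19, 24]
W = 3  # assumed sliding-window length

subsequences = [series[i:i + W] for i in range(len(series) - W + 1)]

def fuzzify(x, lo=10, hi=25):
    """Map a value to memberships in three assumed linguistic terms."""
    mid = (lo + hi) / 2
    low = max(0.0, (mid - x) / (mid - lo))
    high = max(0.0, (x - mid) / (hi - mid))
    middle = max(0.0, 1.0 - low - high)
    return {"low": round(low, 2), "middle": round(middle, 2), "high": round(high, 2)}

# Each subsequence becomes a transaction of fuzzy items for association mining.
transactions = [[fuzzify(x) for x in sub] for sub in subsequences]
print(subsequences[0], "->", transactions[0])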

13.
The optimization capabilities of RDBMSs make them attractive for executing data transformations. However, although many useful data transformations can be expressed as relational queries, an important class of transformations, those that produce several output tuples for a single input tuple, cannot be expressed in that way.

To overcome this limitation, we propose to extend Relational Algebra with a new operator named data mapper. In this paper, we formalize the data mapper operator and investigate some of its properties. We then propose a set of algebraic rewriting rules that enable the logical optimization of expressions with mappers and prove their correctness. Finally, we experimentally study the proposed optimizations and identify the key factors that influence the optimization gains.
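
A minimal sketch of what a mapper does, with invented relation and function names: one input tuple may yield several output tuples, which ordinary relational operators cannot express:

def mapper(relation, f):
    """Apply f to each tuple; f returns an iterable of output tuples."""
    for t in relation:
        yield from f(t)

orders = [("o1", "apple;pear", 2), ("o2", "plum", 5)]

# Split the packed item list: one output tuple per item (a one-to-many mapping).
def split_items(t):
    oid, items, qty = t
    return [(oid, item, qty) for item in items.split(";")]

print(list(mapper(orders, split_items)))
# [('o1', 'apple', 2), ('o1', 'pear', 2), ('o2', 'plum', 5)]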


14.
With the rapid development of the Internet, and especially the recent emergence of technologies such as cloud computing and the Internet of Things and the wide adoption of services such as social networking, the scale of data in human society is growing quickly; the big data era has arrived. How to acquire and analyze big data has become a widespread concern, but the data security issues that come with it demand serious attention. Starting from the concept and characteristics of big data, this paper describes the security challenges it faces and proposes strategies for addressing them.

15.
Compression-based data mining of sequential data   (Total citations: 3; self-citations: 1; external: 2)
The vast majority of data mining algorithms require the setting of many input parameters. The dangers of working with parameter-laden algorithms are twofold. First, incorrect settings may cause an algorithm to fail in finding the true patterns. Second, a perhaps more insidious problem is that the algorithm may report spurious patterns that do not really exist, or greatly overestimate the significance of the reported patterns. This is especially likely when the user fails to understand the role of parameters in the data mining process. Data mining algorithms should have as few parameters as possible. A parameter-light algorithm would limit our ability to impose our prejudices, expectations, and presumptions on the problem at hand, and would let the data itself speak to us. In this work, we show that recent results in bioinformatics, learning, and computational theory hold great promise for a parameter-light data-mining paradigm. The results are strongly connected to Kolmogorov complexity theory. As a practical matter, however, they can be implemented using any off-the-shelf compression algorithm with the addition of just a dozen lines of code. We show that this approach is competitive with or superior to many state-of-the-art approaches in anomaly/interestingness detection, classification, and clustering, with empirical tests on time series/DNA/text/XML/video datasets. As further evidence of the advantages of our method, we demonstrate its effectiveness in solving a real-world classification problem in recommending printing services and products. Responsible editor: Johannes Gehrke
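
The "dozen lines of code" the abstract alludes to can be sketched with off-the-shelf zlib; the formula below follows the commonly cited compression-based dissimilarity measure CDM(x, y) = C(xy) / (C(x) + C(y)), where C is the compressed size, which may differ in detail from the authors' exact definition:

import zlib

def csize(b: bytes) -> int:
    return len(zlib.compress(b, 9))

def cdm(x: bytes, y: bytes) -> float:
    # Near 0.5 for strongly related sequences, toward 1.0 for unrelated ones.
    return csize(x + y) / (csize(x) + csize(y))

a  = b"ACGTACGTACGTACGT" * 8
b_ = b"ACGTACGTACGAACGT" * 8   # small mutation of a
c  = b"TTGGCCAATTGGCCAA" * 8   # unrelated pattern
print(round(cdm(a, b_), 3), round(cdm(a, c), 3))  # expected: related pair scores lower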

16.
As the amount of multimedia data increases day by day, thanks to cheaper storage devices and a growing number of information sources, machine learning algorithms are faced with very large datasets. When the original data is huge, small sample sizes are preferred for many applications; this is typically the case for multimedia applications. But a simple random sample may not yield satisfactory results, because such a sample may not adequately represent the entire dataset due to random fluctuations in the sampling process. The difficulty is particularly apparent when small sample sizes are needed. Fortunately, using a good sampling set for training can improve the final results significantly. In KDD'03 we proposed EASE, which outputs a sample based on its 'closeness' to the original sample. Reported results show that EASE outperforms simple random sampling (SRS). In this paper we propose EASIER, which extends EASE in two ways. (1) EASE is a halving algorithm: to achieve the required sample ratio, it starts from a suitable initial large sample and iteratively halves. EASIER, on the other hand, does away with the repeated halving by directly obtaining the required sample ratio in one iteration. (2) EASE was shown to work on the IBM QUEST dataset, which is a categorical count dataset. EASIER, in addition, is shown to work on continuous image and audio feature data. We have successfully applied EASIER to image classification and audio event identification applications. Experimental results show that EASIER outperforms SRS significantly.

Surong Wang received the B.E. and M.E. degrees from the School of Information Engineering, University of Science and Technology Beijing, China, in 1999 and 2002 respectively. She is currently studying toward the Ph.D. degree at the School of Computer Engineering, Nanyang Technological University, Singapore. Her research interests include multimedia data processing, image processing and content-based image retrieval.

Manoranjan Dash obtained Ph.D. and M.Sc. (Computer Science) degrees from the School of Computing, National University of Singapore. He has worked in academic and research institutes extensively and has published more than 30 research papers (mostly refereed) in various reputable machine learning and data mining journals, conference proceedings, and books. His research interests include machine learning and data mining, and their applications in bioinformatics, image processing, and GPU programming. Before joining the School of Computer Engineering (SCE), Nanyang Technological University, Singapore, as Assistant Professor, he worked as a postdoctoral fellow at Northwestern University. He is a member of IEEE and ACM. He has served as a program committee member of many conferences and is on the editorial board of the "International Journal of Theoretical and Applied Computer Science."

Liang-Tien Chia received the B.S. and Ph.D. degrees from Loughborough University, in 1990 and 1994, respectively. He is an Associate Professor in the School of Computer Engineering, Nanyang Technological University, Singapore. He has recently been appointed Head, Division of Computer Communications, and also holds the position of Director, Centre for Multimedia and Network Technology. His research interests include image/video processing & coding, multimodal data fusion, multimedia adaptation/transmission and multimedia over the Semantic Web. He has published over 80 research papers.

17.
18.
Linear combinations of translates of a given basis function have long been used successfully to solve scattered data interpolation and approximation problems. We demonstrate how the classical basis-function approach can be transferred to the projective space ℙ^{d-1}. To be precise, we use concepts from harmonic analysis to identify positive definite and strictly positive definite zonal functions on ℙ^{d-1}. These can then be applied to solve problems arising in tomography, since the data given there consists of integrals over lines. Here, enhancing known reconstruction techniques with a scattered data interpolant in the "space of lines" naturally leads to reconstruction algorithms well suited to limited-angle and limited-range tomography. In the medical setting, algorithms for such incomplete-data problems are desirable, as using them can limit radiation dosage.
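
The classical basis-function ansatz the abstract builds on can be sketched as follows, written in LaTeX for concreteness; here \phi is a generic zonal kernel and the x_i are the data sites (both placeholders, since a zonal function on projective space depends only on |\langle x, y \rangle| because antipodal points are identified):

s(x) = \sum_{i=1}^{N} c_i \, \phi\bigl(\lvert \langle x, x_i \rangle \rvert\bigr), \qquad x \in \mathbb{P}^{d-1}

The interpolation conditions s(x_j) = f_j lead to the linear system A c = f with A_{ji} = \phi(\lvert \langle x_j, x_i \rangle \rvert), which is uniquely solvable when \phi is strictly positive definite on \mathbb{P}^{d-1}, the property the paper characterizes.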

19.
Existing automated test data generation techniques tend to start from scratch, implicitly assuming that no pre-existing test data are available. However, this assumption may not always hold, and where it does not, there may be a missed opportunity; perhaps the pre-existing test cases could be used to assist the automated generation of additional test cases. This paper introduces search-based test data regeneration, a technique that can generate additional test data from existing test data using a meta-heuristic search algorithm. The proposed technique is compared to a widely studied test data generation approach in terms of both efficiency and effectiveness. The empirical evaluation shows that test data regeneration can be up to 2 orders of magnitude more efficient than existing test data generation techniques, while achieving comparable effectiveness in terms of structural coverage and mutation score. Copyright © 2010 John Wiley & Sons, Ltd.
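
A toy sketch of the regeneration idea: start from a pre-existing test case and apply meta-heuristic local search (here, simple hill climbing on a branch-distance fitness) to derive additional test data. The program under test and the fitness function are invented for illustration:

import random

def program_under_test(x, y):
    if x * 2 == y + 10:      # hard-to-hit branch we want a new test for
        return "target"
    return "other"

def fitness(x, y):
    return abs(x * 2 - (y + 10))   # 0 means the target branch is covered

def regenerate(seed, steps=10_000):
    best = seed
    for _ in range(steps):
        cand = (best[0] + random.randint(-3, 3), best[1] + random.randint(-3, 3))
        if fitness(*cand) <= fitness(*best):   # accept non-worsening neighbours
            best = cand
        if fitness(*best) == 0:
            return best
    return best

print(regenerate((57, 23)))  # starts from a pre-existing test case, e.g. (57, 23)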

20.
Data protection has been a difficult problem ever since the Internet appeared. From the moment social-media sites began to dominate digital markets, protecting user data and information has kept policymakers on alert. In the digital-economy era, data has gradually become a key factor in enterprise competitiveness, and more and more market competition revolves around it. Enterprises' attention to, and contention for, data resources has pushed disputes between platform rights and users' personal-information protection, as well as unfair-competition disputes over data between Internet companies, into the spotlight. It is therefore especially important to balance the reasonable use of data against its protection and to regulate unfair competitive conduct, so as to secure a competitive advantage amid the rapid growth of the digital economy. By analyzing the dual nature of data, this article discusses the value of data in the digital-economy era and, drawing on the Anti-Unfair Competition Law and practical cases, further examines the relationship between data use and data protection.
