Similar literature
1.
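Application Research on Web-Based Data Mining Technology   Total citations: 7, self-citations: 0, citations by others: 7
The Web is a highly dynamic information source. Accessing and analyzing its data requires solving the problem of integrating heterogeneous data and choosing suitable techniques for data analysis, integration, and processing. This paper introduces a data warehouse architecture for multiple data sources, together with the idea behind multi-source data integration and a framework for implementing it. It analyzes the shortcomings of converters in Web-oriented data mining and the technical characteristics of XML, and proposes using XML to integrate and transform multi-source data in order to build a data warehouse, giving implementation methods for the key techniques.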

2.
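For dual-rate systems with unknown parameters in which the input update frequency is an integer multiple of the output refresh frequency, a missing-output estimator is designed to compute the inter-sample outputs. A parameter estimator based on the stochastic gradient algorithm then yields the estimated parameters of the system model, and an adaptive controller for the dual-rate system is designed according to the minimum variance control principle. Compared with an adaptive control algorithm that identifies the system parameters by least squares, the proposed algorithm requires less computation, and the gap becomes more pronounced when the input update frequency and the output refresh frequency differ widely. A simulation example illustrates the effectiveness of the algorithm.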

3.
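In recent years, link prediction has become a popular research area in link mining for social networks and other complex networks. Link prediction problems often come with additional data sources that can improve prediction, since these data can be used to predict whether a link exists in the network. Among all data sources, the primary one plays the most important role. It is therefore important to design robust algorithms that make full use of the information in all data sources, and such algorithms must also balance the primary source against the additional sources so that link prediction performs better. Most traditional unsupervised algorithms based on topological structure estimate the likelihood of a link by computing scores between pairs of nodes, and these methods give effective results. The most critical step in link prediction is constructing accurate input matrix data, but many real-world datasets contain noise, which degrades most link prediction models. This paper proposes a new link prediction method that fuses multiple data sources, using information from both the primary source and the additional sources. The primary and additional sources are used to construct a low-noise, more accurate matrix, which then serves as the input to traditional unsupervised topological link prediction algorithms. In comparative experiments on several real-world multi-source datasets, the proposed multi-source fusion link prediction algorithm, based on low-rank and sparse matrix decomposition, outperforms the baseline algorithms.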

4.
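Many current data management applications need to integrate data from multiple sources. Each source provides a set of values, and different sources often provide conflicting values. To give users high-quality data, a data integration system must be able to resolve these conflicts and extract the correct values. This paper analyzes and summarizes existing truth discovery algorithms and improves on them by handling different representations of the same value and by using an improved voting algorithm. The improved method finds the correct values among many conflicting ones more effectively.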
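The idea of voting over conflicting values after unifying different representations of the same value can be sketched as follows. This is only a plain majority vote under assumed rules, not the improved voting algorithm of the paper, which the abstract does not specify. The `normalize` rule (case folding and whitespace collapsing) and the source names are hypothetical.

```python
from collections import Counter


def normalize(value: str) -> str:
    """Hypothetical normalization: case-fold and collapse whitespace."""
    return " ".join(value.lower().split())


def vote(claims: dict) -> str:
    """Return the normalized value that the most sources agree on."""
    counts = Counter(normalize(v) for v in claims.values())
    value, _ = counts.most_common(1)[0]
    return value


# Two sources give the same publisher in different forms; one disagrees.
claims = {
    "source_a": "Addison Wesley",
    "source_b": "addison  wesley",
    "source_c": "Pearson",
}
winner = vote(claims)
```

Without the normalization step the three claims would each receive one vote, which is why handling different representations matters before counting.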

5.
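Almost all Web information systems today need to access persistent data resources, and different data sources are accessed in different ways. Implementing and encapsulating data access has therefore increasingly become the foundation for building stable, robust, and flexible Web applications. Based on an analysis of the application environment of a real Web information system, this paper describes how to use the Singleton, Data Access Object, and Abstract Factory patterns to design a data access layer architecture that solves the problem of accessing multiple parallel data sources, and it analyzes the problems this architecture faces.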
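A minimal sketch of how the three patterns can fit together in a data access layer is given below. All class names (`UserDao`, `MemoryDaoFactory`, `FileDaoFactory`, and so on) and the two toy data sources are hypothetical illustrations, not the architecture from the paper.

```python
from abc import ABC, abstractmethod
from typing import Optional


class UserDao(ABC):
    """Data access object: the only interface the application layer sees."""

    @abstractmethod
    def find_name(self, user_id: int) -> Optional[str]:
        ...


class MemoryUserDao(UserDao):
    """Stand-in for one data source (an in-memory dict)."""

    def __init__(self) -> None:
        self._rows = {1: "alice"}

    def find_name(self, user_id: int) -> Optional[str]:
        return self._rows.get(user_id)


class FileUserDao(UserDao):
    """Stand-in for a second data source (lines of 'id,name' text)."""

    def __init__(self, text: str = "2,bob\n") -> None:
        self._rows = dict(line.split(",") for line in text.splitlines())

    def find_name(self, user_id: int) -> Optional[str]:
        return self._rows.get(str(user_id))


class DaoFactory(ABC):
    """Abstract factory: one concrete factory per data source."""

    _instances: dict = {}

    @classmethod
    def instance(cls) -> "DaoFactory":
        # Singleton: each concrete factory class is created at most once.
        if cls not in DaoFactory._instances:
            DaoFactory._instances[cls] = cls()
        return DaoFactory._instances[cls]

    @abstractmethod
    def user_dao(self) -> UserDao:
        ...


class MemoryDaoFactory(DaoFactory):
    def user_dao(self) -> UserDao:
        return MemoryUserDao()


class FileDaoFactory(DaoFactory):
    def user_dao(self) -> UserDao:
        return FileUserDao()
```

Calling code asks a factory for a DAO, for example `MemoryDaoFactory.instance().user_dao().find_name(1)`, and never touches the storage details, so switching data sources means switching factories only.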

6.
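The large number of accessible data sources on the Web makes it much easier to obtain useful information. A necessary step in Web data source integration is accurately identifying duplicate Web entities that appear in different data sources under different representations. Existing work on duplicate entity identification mainly operates between two data sources. Because Web data sources are so numerous, these methods cannot be applied to duplicate entity identification across multiple Web data sources. To address this, the paper proposes a Web duplicate entity identification method based on iterative training, which can identify duplicate entities across multiple Web data sources from a relatively small training sample. Extensive experiments on multiple Web data sources in two domains, books and computer products, demonstrate the effectiveness of the method.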

7.
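Caching can improve application performance. However, the way an application uses data changes dynamically, and when large amounts of data are modified, a fixed cache can cause performance to drop sharply. For better performance, the cache should change its contents and size dynamically in response to the application. Because different classes of cached data are queried and modified at different frequencies, an algorithm for adjusting the cache is proposed on this basis. The cache is adjusted when the application is busy or when the load changes significantly. The algorithm is relatively simple and easy to implement. Simulation experiments under various loads show that this adaptive cache algorithm performs better than a fixed cache.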

8.
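How to select a suitable number of data sources from a very large collection of Web data sources, so that as few sources as possible are accessed while a given query requirement is still met, is one of the key problems in Web big data system integration. A two-stage data source selection scheme is proposed. The first stage selects sources highly relevant to the query using the similarity between each source's schema and the mediated schema, and then keeps the higher-quality sources by computing the quality of the dependent data sources. The second stage computes the overlap rate between data sources based on maximum entropy theory, and an algorithm that dynamically selects data sources under a minimum query cost model is designed and implemented. Finally, the algorithm is evaluated on an experimental platform, and the experiments show that it is efficient and scalable.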

9.
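Based on a top-down projection mining strategy, a Web access pattern mining algorithm, TAM-WAP, is proposed that does not require multiple scans of the database. Its distinguishing feature is that the characteristics of the data currently being mined drive a prediction algorithm, and intermediate data are generated selectively according to the prediction. Experiments on a variety of real and simulated data show that the algorithm outperforms traditional algorithms.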

10.
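Research on a Virtual Database Architecture for Computer Software Data Integration   Total citations: 2, self-citations: 0, citations by others: 2
Data integration is one of the effective ways to share information resources, and it can provide transparent data access services for data mining, knowledge discovery, and application development. Virtual database technology is applied to a data integration system to achieve centralized management of heterogeneous data sources and data sharing among multiple sources. By wrapping data sources by category and unifying the data source access interface, the system makes data sources "plug and play". Knowledge is partitioned by domain, and a subject-knowledge metadata model is built to resolve semantic and structural conflicts. The soundness of the system was verified in a demonstration project of the "Shanxi Province Science and Technology Infrastructure Platform Construction" project, where it enabled sharing of expert information resources so that resources are used effectively and distributed sensibly.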

11.
An efficient algorithm for dynamic estimation of probabilities without division on an unlimited number of input data is presented. The method estimates probabilities of the sampled data from the raw sample count, while keeping the total count value constant. Accuracy of the estimate depends on the counter size, rather than on the total number of data points. The estimator follows variations of the incoming data probability within a fixed window size, without explicit implementation of the windowing technique. Total design area is very small and all probabilities are estimated concurrently. The dynamic probability estimator was implemented using a programmable gate array from Xilinx. The performance of this implementation is evaluated in terms of area efficiency and execution time. This method is suitable for the highly integrated design of artificial neural networks where a large number of dynamic probability estimators can work concurrently.

12.
This paper is a continuation of the authors' earlier work [1], where a version of Tråvén's [2] Gaussian clustering neural network (being a recursive counterpart of the EM algorithm) has been investigated. A comparative simulation study of the Gaussian clustering algorithm [1], two versions of plug-in kernel estimators and a version of Friedman's projection pursuit algorithm is presented for two- and three-dimensional data. Simulations show that the projection pursuit algorithm is a good or a very good estimator, provided, however, that the number of projections is suitably chosen. Although practically confined to estimating normal mixtures, the simulations confirm general reliability of plug-in estimators, and show the same property of the Gaussian clustering algorithm. Indeed, the simulations confirm the earlier conjecture that this last estimator provides a way of effectively estimating arbitrary and highly structured continuous densities on R^d, at least for small d, either by using this estimator itself or, rather, by using it as a pilot estimator for a newly proposed plug-in estimator.

13.
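Frequency and Amplitude Estimation Based on an Adaptive Notch Filter   Total citations: 4, self-citations: 0, citations by others: 4
Estimating the frequency and amplitude of a sinusoidal signal for accurate signal tracking has a wide range of applications. This paper analyzes sinusoidal signals with a three-dimensional adaptive notch filter and proposes two frequency estimation methods, one unnormalized and one normalized. Both algorithms have circular periodic orbits and provide accurate estimates of the signal's frequency and amplitude as well as sinusoid tracking. The existence and uniform asymptotic stability of the integral manifold are proved using Lyapunov's theorem and the averaging method. The normalized algorithm removes the drawback of the unnormalized one, whose convergence speed depends on the signal amplitude. The effects of the estimator's bandwidth parameter and frequency adaptation gain on the transient speed and steady-state accuracy of frequency tracking, and on the noise characteristics, are analyzed. Simulations confirm the effectiveness of the algorithms.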

14.
A comparative study is presented regarding the performance of commonly used estimators of the fractional order of integration when data are contaminated by noise. In particular, measurement errors, additive outliers, temporary change outliers, and structural change outliers are addressed. It turns out that when the sample size is not too large, as is frequently the case for macroeconomic data, non-persistent noise will generally bias the estimators of the memory parameter downwards. On the other hand, relatively more persistent noise like temporary change outliers and structural changes can have the opposite effect and thus bias the fractional parameter upwards. Surprisingly, with respect to the relative performance of the various estimators, the parametric conditional maximum likelihood estimator with modelling of the short run dynamics clearly outperforms the semiparametric estimators in the presence of noise that is not too persistent. However, when a non-zero mean is allowed for, this conclusion may be reversed.

15.
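Application of the Ridge Regression Algorithm in Satellite Positioning   Total citations: 1, self-citations: 0, citations by others: 1
A new algorithm is proposed to improve positioning accuracy when the relative geometry of the visible satellites is poor. The relative geometric positions of the satellites strongly affect positioning accuracy, which degrades considerably when the visible satellites are poorly distributed. To address this, the proposed satellite positioning algorithm introduces a biased estimator and aims to minimize the variance of the parameter estimates, substantially improving positioning accuracy at the cost of a slight loss of unbiasedness. It solves the problem that accurate positioning is not possible when the visible satellites are poorly distributed, and it has high value for military applications.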
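For orientation, the textbook form of the ridge estimator next to ordinary least squares is shown below. The symbols are assumptions introduced here, not taken from the paper: $H$ is the linearized observation (geometry) matrix, $y$ the measurement vector, $x$ the unknown position and clock parameters, and $k > 0$ the ridge parameter.

```latex
\hat{x}_{\mathrm{LS}} = (H^{\top} H)^{-1} H^{\top} y,
\qquad
\hat{x}_{\mathrm{ridge}}(k) = (H^{\top} H + k I)^{-1} H^{\top} y, \quad k > 0 .
```

When the satellite geometry is poor, $H^{\top} H$ is close to singular and the least-squares solution has a large variance. Adding $kI$ improves the conditioning of the matrix being inverted, which reduces the variance while introducing a small bias, the trade-off the abstract describes.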

16.
A multiple model adaptive estimator (MMAE) is presented for nonlinear systems with unknown disturbances. Multiple models are constructed with a set of process noise covariance matrices, such that the algorithm can adapt to different levels of unknown disturbances. The performance of the MMAE is analyzed for the considered system. It is proved that, under certain assumptions, the MMAE keeps the dynamics of its estimation error stable. A performance comparison among different estimators is carried out for space surveillance, where the position of a space target is estimated by using double line-of-sight measurements. Simulation studies illustrate that the presented algorithm outperforms the extended Kalman filter and the nonlinear robust filter.

17.
In classical time domain Box-Jenkins identification, discrete-time plant and noise models are estimated using sampled input/output signals. The frequency content of the input/output samples covers uniformly the whole unit circle in a natural way, even in case of prefiltering. Recently, the classical time domain Box-Jenkins framework has been extended to frequency domain data captured in open loop. The proposed frequency domain maximum likelihood (ML) solution can handle (i) discrete-time models using data that only covers a part of the unit circle, and (ii) continuous-time models. Part I of this series of two papers (i) generalizes the frequency domain ML solution to the closed loop case, and (ii) proves the properties of the ML estimator under non-standard conditions. Contrary to the classical time domain case, it is shown that the controller should be either known or estimated. The proposed ML estimators are applicable to frequency domain data as well as time domain data.

18.
Standard errors for bagged and random forest estimators   Total citations: 1, self-citations: 0, citations by others: 1
Bagging and random forests are widely used ensemble methods. Each forms an ensemble of models by randomly perturbing the fitting of a base learner. Estimation of the standard error of the resulting regression function is considered. Three estimators are discussed. One, based on the jackknife, is applicable to bagged estimators and can be computed using the bagged ensemble. The two other estimators target the bootstrap standard error estimator, and require fitting multiple ensemble estimators, one for each bootstrap sample. It is shown that these bootstrap ensemble sizes can be small, which reduces the computation involved in forming the estimator. The estimators are studied using both simulated and real data.

19.
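Sun Gangcan (孙纲灿), Zhou Changzhu (周常柱), Su Bei (苏贝). 《计算机应用》 (Journal of Computer Applications), 2005, 25(6): 1468-1470
An LMS adaptive channel estimation algorithm based on continuous pilots is proposed. LMS adaptive filtering is performed within one symbol, the channel estimate is obtained after several iterations, and the estimate is then averaged over multiple symbols according to how fast the channel changes. The receiver system is implemented in Simulink, and the LMS adaptive channel estimation algorithm is simulated by writing an S-function. Dynamic simulation results show that this method can noticeably improve channel estimation accuracy and receiver performance.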
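The core LMS update for pilot-based channel estimation can be sketched as below. This is a simplified illustration and not the paper's Simulink S-function: it assumes a single complex channel tap, noise-free reception, unit-magnitude pilots, and a hypothetical step size `mu`. The multi-symbol averaging step is not shown.

```python
def lms_channel_estimate(pilots, received, mu=0.1, iterations=50):
    """Estimate one complex channel tap h from y = h * x with the LMS rule."""
    h = 0j  # initial channel estimate
    for _ in range(iterations):
        for x, y in zip(pilots, received):
            error = y - h * x                # error on this pilot
            h += mu * error * x.conjugate()  # LMS update toward lower error
    return h


# Assumed test setup: a fixed channel tap and four known pilot symbols.
true_h = 0.8 - 0.3j
pilots = [1 + 0j, -1 + 0j, 1j, -1j]
received = [true_h * x for x in pilots]
estimate = lms_channel_estimate(pilots, received)
```

With unit-magnitude pilots each update shrinks the estimation error by a factor of `1 - mu`, so the step size trades convergence speed against sensitivity to noise once noise is present.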

20.
A simultaneously efficient and robust approach for distribution-free parametric inference, called the simulated minimum Hellinger distance (SMHD) estimator, is proposed. In the SMHD estimation, the Hellinger distance between the nonparametrically estimated density of the observed data and that of the simulated samples from the model is minimized. The method is applicable to the situation where the closed-form expression of the model density is intractable but simulating random variables from the model is possible. The robustness of the SMHD estimator is equivalent to that of the minimum Hellinger distance estimator. The finite sample efficiency of the proposed methodology is found to be comparable to the Bayesian Markov chain Monte Carlo and maximum likelihood Monte Carlo methods and to outperform the efficient method of moments estimators. The robustness of the method to a stochastic volatility model is demonstrated by a simulation study. An empirical application to the weekly observations of foreign exchange rates is presented.
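The quantity being minimized can be illustrated with a small sketch of the Hellinger distance between two discrete distributions. As a simplifying assumption, fixed-bin histograms stand in for the nonparametric density estimates mentioned in the abstract; the helper names `histogram` and `hellinger` and the bin edges are hypothetical.

```python
import math


def histogram(samples, edges):
    """Relative frequencies of samples over half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts) or 1  # avoid dividing by zero for empty input
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same bins."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


observed = histogram([0.1, 0.6, 0.7, 0.2], [0.0, 0.5, 1.0])
simulated = histogram([0.3, 0.4, 0.8, 0.9], [0.0, 0.5, 1.0])
distance = hellinger(observed, simulated)
```

An SMHD-style procedure would repeat the simulation for different candidate parameter values and keep the value with the smallest distance. The distance ranges from 0 for identical distributions to 1 for distributions with disjoint support.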
