Similar literature
Found 20 similar articles; search took 0 ms
1.
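When traditional genetic algorithms solve compute-intensive tasks, the execution time of the fitness function grows rapidly, so convergence becomes very slow as the population size or the number of generations increases. To address this, a hybrid "coarse-grained/master-slave" parallel genetic algorithm (HBPGA) is designed and implemented on the Sunway TaihuLight supercomputer, ranked first on the TOP500 list at the time of writing. The algorithm model adopts a two-level parallel architecture that combines the MPI and Athread programming models. Compared with traditional genetic algorithms implemented on a single core or on multi-core clusters with one level of parallelism, it achieves two-level parallelism on the Sunway many-core processor, with better performance and a higher speedup. In the experiments, with 16×64 slave cores, the maximum speedup reaches 544 and the slave-core speedup exceeds 31.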

2.
Han Qingchang, Yang Hailong, Dun Ming, Luan Zhongzhi, Gan Lin, Yang Guangwen, Qian Depei 《The Journal of Supercomputing》2021,77(5):4533-4564
Tile low-rank general matrix multiplication (TLR GEMM) is a novel method of matrix multiplication on large data-sparse matrices, which can significantly reduce...

3.
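The theory and computational methods of 3D seismic acoustic waves underpin geological exploration research; 3D acoustic forward modeling is carried out by analysing how acoustic waves propagate in different media. To address the heavy computation and large memory consumption of staggered-grid finite-difference forward modeling of the 3D acoustic wave equation, a programming model for 3D seismic acoustic forward modeling is proposed for the domestically developed heterogeneous many-core processor (Sunway 26010) of the Sunway TaihuLight supercomputer, with optimization strategies that combine process-level parallelism across processors and thread-level parallelism across computing cores. DMA (direct memory access) communication is studied, and a multi-angle optimization strategy is proposed that includes 2.5D pipelined task partitioning and the overlapping of communication with computation. Experimental results show that the strategy effectively relieves the bandwidth bottleneck, exploits the processor's computing power, and resolves the low parallel efficiency of finite-difference programs on the Sunway 26010 heterogeneous many-core processor. In large-scale tests using 266,240 computing cores, the program still maintains stable performance, reaching 5.5 GFlops for field-value updates.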

4.
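Research on a time series clustering algorithm based on piecewise linear dynamic time warping   Total citations: 4, self-citations: 0, citations by others: 4
Time series are an important class of complex data, and knowledge discovery in time series is becoming one of the research hotspots of knowledge discovery. Euclidean distance and its extensions are widely used as similarity measures for comparing time series, but these distance measures are not robust to the data. Dynamic time warping is a pattern-matching algorithm based on nonlinear dynamic programming, but its computational complexity is quite high. This paper proposes a dynamic time warping algorithm based on the piecewise linear representation of time series, which matches sequences by computing the shortest warping path between the linearly segmented sequences. A comparison of clustering results under different distance measures on the synthetic control time series data shows that the proposed algorithm is highly accurate, is strongly robust to amplitude differences, noise, and linear drift, greatly reduces computational complexity, and has good practical value.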
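The idea of running dynamic time warping on a segmented representation can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the fixed-width segmentation, the (mean, slope) summary of each segment, the segment cost, and all function names are assumptions made for the example, since the abstract does not specify them.

```python
from math import inf


def piecewise_linear(series, width):
    """Summarise each window of `width` points as (mean, slope). Assumed scheme."""
    segments = []
    for start in range(0, len(series), width):
        window = series[start:start + width]
        mean = sum(window) / len(window)
        # Slope of the chord from the first to the last point of the window.
        slope = (window[-1] - window[0]) / max(len(window) - 1, 1)
        segments.append((mean, slope))
    return segments


def segment_distance(a, b):
    """Assumed segment cost: absolute differences in mean and in slope."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def dtw(xs, ys, dist):
    """Classic dynamic-programming DTW; returns the minimal warping cost."""
    n, m = len(xs), len(ys)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(xs[i - 1], ys[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance xs only
                                    cost[i][j - 1],      # advance ys only
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def pl_dtw(a, b, width=4):
    """DTW on the segmented series: the table shrinks by about width squared."""
    return dtw(piecewise_linear(a, width), piecewise_linear(b, width),
               segment_distance)
```

With a window of `width` points, the dynamic-programming table has roughly `width**2` times fewer cells than point-wise DTW, which is where the reduction in computational cost comes from in this sketch.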

5.
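A survey of Time Warp parallel discrete event simulation   Total citations: 2, self-citations: 1, citations by others: 1
Time Warp is a synchronization mechanism widely used in parallel discrete event simulation. This survey discusses Time Warp parallel discrete event simulation from several main aspects: event list management, message cancellation, optimism control, state saving and restoration, memory management, and global virtual time computation. It describes the open problems, analyses and compares the various optimization strategies, and gives an outlook on the application prospects of parallel discrete event simulation.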
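Three of the aspects listed above (state saving and restoration, rollback on a late message, and global virtual time) can be illustrated with a toy logical process. This is a simplified sketch under stated assumptions: the class and function names are hypothetical, the event handler is a made-up order-sensitive update, and message cancellation with anti-messages is left out entirely.

```python
class LogicalProcess:
    """Toy Time Warp logical process: optimistic execution with rollback."""

    def __init__(self):
        self.lvt = 0        # local virtual time
        self.state = 0
        self.history = []   # (timestamp, value, state_before), in timestamp order

    def _execute(self, timestamp, value):
        # State saving: remember the state before the event so it can be undone.
        self.history.append((timestamp, value, self.state))
        self.state = self.state * 2 + value   # order-sensitive toy handler
        self.lvt = timestamp

    def receive(self, timestamp, value):
        """Process an event; returns the number of events rolled back."""
        undone = []
        # A straggler (timestamp below LVT) forces a rollback of later events.
        while self.history and self.history[-1][0] > timestamp:
            ts, val, before = self.history.pop()
            self.state = before
            undone.append((ts, val))
        self._execute(timestamp, value)
        for ts, val in reversed(undone):      # re-execute in timestamp order
            self._execute(ts, val)
        return len(undone)

    def fossil_collect(self, gvt):
        # Events older than GVT can never be rolled back, so drop their records.
        self.history = [h for h in self.history if h[0] >= gvt]


def global_virtual_time(lps, in_transit=()):
    """Simplified GVT: minimum over all LVTs and in-flight message timestamps."""
    return min([lp.lvt for lp in lps] + list(in_transit))
```

Receiving events at timestamps 10, 30, and then 20 rolls back the event at 30, applies 20, and re-executes 30, so the final state matches in-order execution.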

6.
The behavior of n interacting processes synchronized by the "Time Warp" rollback mechanism is analyzed under the constraint that the total amount of memory to execute the program is limited. In Time Warp, a protocol called "cancelback" has been proposed to reclaim storage when the system runs out of memory. A discrete-state, continuous-time Markov chain model for Time Warp augmented with the cancelback protocol is developed for a shared-memory system with n homogeneous processors and homogeneous workload with constant message population. The model allows one to predict speedup as the amount of available memory is varied. The performance predicted by the model is validated through performance measurements on an operational Time Warp system executing on a shared-memory multiprocessor using a workload similar to that in the model. It is observed that if the sequential simulation requires m message buffers, Time Warp with a small fraction of message buffers beyond m performs almost as well as Time Warp with unlimited memory.

7.
Load Balancing Strategies for Time Warp on Multi-User Workstations   Total citations: 1, self-citations: 0, citations by others: 1
Burdorf C.; Marti J. 《Computer Journal》1993,36(2):168-176

8.
Time Warp is an optimistic protocol for synchronizing parallel discrete event simulations. To achieve performance in a multiuser network of workstations (NOW) environment, Time Warp must continue to operate efficiently in the presence of external workloads caused by other users, processor heterogeneity, and irregular internal workloads caused by the simulation model. However, these performance problems can cause a Time Warp program to become grossly unbalanced, resulting in slower execution. The key observation asserted in this article is that each of these performance problems, while different in source, has a similar manifestation. For a Time Warp program to be balanced, the amount of wall clock time necessary to advance an LP one unit of simulation time should be about the same for all LPs. Using this observation, we devise a single algorithm that mitigates these performance problems and enables the "background" execution of Time Warp programs on heterogeneous distributed computing platforms in the presence of external as well as irregular internal workloads.
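The balance criterion stated above (wall clock time per unit of simulation time, roughly equal across LPs) can be measured with a few lines of code. The sketch below only computes that metric and a simple imbalance ratio; the function names and the input layout are assumptions for illustration, and the article's actual balancing algorithm is not described in the abstract.

```python
def advance_costs(samples):
    """Wall-clock seconds needed per unit of simulation time, per LP.

    samples -- {lp_name: (wall_seconds, sim_time_advanced)} (assumed layout)
    LPs that made no progress in the sample window are skipped.
    """
    return {lp: wall / sim for lp, (wall, sim) in samples.items() if sim > 0}


def imbalance(costs):
    """Ratio of the slowest LP's cost to the fastest; 1.0 means balanced."""
    values = list(costs.values())
    return max(values) / min(values)


def slowest_lp(costs):
    """The LP that is most expensive to advance, a natural candidate to relieve."""
    return max(costs, key=costs.get)
```

A ratio well above 1.0 signals the kind of imbalance the article describes, whatever its source (external load, slower hardware, or an irregular model).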

9.
In modal frequency response analysis, the dynamic analyst is often faced with the structure's dynamic behavior, the modal contributions included, over a frequency window rather than at a single frequency. Therefore a new method in modal frequency response analysis has been developed for computing both complex modal contributions and real, actual modal contributions over a frequency range. Contributions from normal modes to displacement, velocity, or acceleration of a set of selected evaluation points (grid-component combinations) are considered. The focus lies on identifying the major actual contributions from normal modes and the frequency range they are active in. The method is valid for all branches of mechanical engineering. With thorough knowledge of the dominant modal contributions to the physical motion response of relevant structure locations and the modal contributions' frequency history, the traditional design process can be substantially enhanced. It is worth noting that by using the presented procedure the dynamic analyst may find innovative redesigns which automatic structural optimizers are not able to find. Examples are given to demonstrate the application, the strength of the coupling between modes, the influence of base and force excitation on the modal contributions and, finally, some recommendations on how to reduce undesired structural responses.

10.
Time Warp is an optimistic synchronization protocol used for parallel discrete event simulation. While Time Warp has the potential to reduce the execution time of large simulations, it has been plagued by a variety of problems, namely: 1. Instability due to thrashing effects caused by echoing and cascading rollbacks. 2. Memory bottlenecks due to state saving and excessive optimism. 3. Inefficient scheduling algorithms for scheduling Time Warp processes on each processing node. These problems have inhibited the widespread use of Time Warp as a general-purpose synchronization algorithm. The general trend of researchers attempting to solve these problems has been to statically limit the optimism of Time Warp. Unfortunately, these attempts have achieved only limited success, because a static set of parameters may perform well for one simulation but not for another. This paper attacks the problem using adaptive mechanisms to control optimism, using an index of performance called useful work. This research presents solutions for the above-mentioned problems by: 1. Stabilizing Time Warp using adaptive bounded time windows. 2. Reducing memory usage and overall execution time by using an adaptive mechanism to vary the checkpoint interval. 3. Scheduling Time Warp processes with the useful work parameter to favor more productive processes. Using this new performance index called Useful Work, several modifications to Time Warp are implemented to stabilize and improve Time Warp. Thus, this new improved Time Warp synchronization mechanism, termed Parameterized Time Warp (PTW), provides an integrated adaptive solution to optimistic parallel discrete event simulation. Empirical work showing that PTW outperforms an equivalent Time Warp simulation executing under similar partitioning and load conditions is also presented.

11.
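谢昊飞, 郑鸣, 王平 《计算机工程》 (Computer Engineering) 2010,36(5):240-242
To address the random jitter and large errors found in traditional software designs of the Precision Time Protocol, an IP core design is proposed. The design uses digital logic circuits to capture precise timestamps and implements the synchronization algorithm, the crystal-oscillator compensation algorithm, and the state-transition control algorithm; fine adjustment of the oscillator's frequency-division ratio reduces the effect of oscillator frequency drift on synchronization accuracy and stability. Simulation results show that the design achieves high synchronization accuracy, reaching 10 ns.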
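The abstract does not spell out its synchronization algorithm, so the sketch below shows only the standard Precision Time Protocol delay request-response calculation that such timestamps typically feed. It assumes a symmetric path delay; the function name is hypothetical, and the paper's oscillator compensation is not modeled.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Standard PTP delay request-response arithmetic (symmetric path assumed).

    t1 -- master clock time when Sync was sent
    t2 -- slave clock time when Sync was received
    t3 -- slave clock time when Delay_Req was sent
    t4 -- master clock time when Delay_Req was received
    Returns (offset, delay), where offset is slave time minus master time.
    """
    master_to_slave = t2 - t1   # path delay plus offset
    slave_to_master = t4 - t3   # path delay minus offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay
```

For example, with timestamps in nanoseconds, a slave running 100 ns ahead over a 50 ns path yields `t1=1000, t2=1150, t3=1200, t4=1150`, from which the function recovers an offset of 100 and a delay of 50. Any jitter in capturing the four timestamps passes directly into the offset estimate, which is the motivation for timestamping in hardware.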

12.
One important problem in deterministic scheduling theory is to schedule a set of independent jobs on a set of parallel processors without any preemption. When the jobs have fixed due dates, the objective often is to minimize the maximum lateness. The problem is NP-complete [7]. In this paper a fast heuristic procedure is developed to solve this problem. It is applicable to both equal and unequal processors. The "average" behavior of the procedure is tested against a truncated branch-and-bound algorithm in a large-scale computational study consisting of about 10,000 different examples. The results show that the procedure is highly efficient.
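The paper's heuristic is not described in the abstract. To make the problem concrete, the sketch below uses a different, well-known baseline: earliest-due-date (EDD) list scheduling, placing each job on the processor that would finish it soonest. The function name and input layout are assumptions, and the result is a feasible schedule rather than a guaranteed optimum.

```python
def edd_list_schedule(jobs, speeds):
    """Greedy earliest-due-date list scheduling on uniform parallel processors.

    jobs   -- list of (work, due_date) pairs
    speeds -- speed factor of each processor (all 1.0 means equal processors)
    Returns (max_lateness, schedule), where schedule holds
    (job_index, processor_index, completion_time) in assignment order.
    """
    free_at = [0.0] * len(speeds)          # when each processor becomes idle
    schedule = []
    max_lateness = float("-inf")
    # Consider jobs in order of increasing due date (EDD).
    for j in sorted(range(len(jobs)), key=lambda i: jobs[i][1]):
        work, due = jobs[j]
        # Non-preemptive: place the whole job where it would finish earliest.
        p = min(range(len(speeds)), key=lambda q: free_at[q] + work / speeds[q])
        free_at[p] += work / speeds[p]
        schedule.append((j, p, free_at[p]))
        max_lateness = max(max_lateness, free_at[p] - due)
    return max_lateness, schedule
```

Lateness is completion time minus due date, so a negative maximum means every job finished early.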

13.
We consider a sequencing problem in which there are n jobs to be processed nonpreemptively on m nonidentical processors. The processing time of the j-th processor is exponentially distributed with rate $\mu_j$, where $\mu_1 \ge \mu_2 \ge \dots \ge \mu_m$. Job i incurs a holding cost at rate $c_i$ per unit time while still in the system, where $c_1 \ge c_2 \ge \dots \ge c_n$. We show that to minimize total expected holding costs (weighted flowtime), it is optimal to take the fastest (lowest indexed) available processor, say processor j, and assign job k to it if $k > \left(\sum_{i=1}^{j} \mu_i\right)/\mu_j - j \ge k - 1$. After each assignment the jobs are renumbered (so that job k+1 becomes job k, etc.), and the procedure is repeated with the next fastest available processor, etc. Note that the policy does not depend on the values of the holding costs $c_i$. This result is a generalization of the result of Agrawala et al. (1984) for minimizing expected flowtime, i.e., minimizing total holding cost when the holding costs of all the jobs are the same. We give a simpler proof of the more general result.

14.
15.
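Classification of imbalanced datasets is one of the hot research problems in machine learning. In recent years researchers have proposed many theories and algorithms to improve the performance of traditional classification techniques on imbalanced datasets; determining the threshold in a neural network with a threshold-decision criterion is one important approach among them. Commonly used threshold-decision criteria have shortcomings, such as being unable to maximise minority-class and majority-class accuracy at the same time, or favouring majority-class accuracy too strongly. A new threshold-decision criterion is therefore proposed under which minority-class and majority-class accuracy can both reach their best values, unaffected by the class ratio of the samples. With a classifier trained by combining a neural network and a genetic algorithm, the new criterion gives good results both as the threshold selection condition and as the evaluation criterion for the classifier.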

16.
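Design and implementation of a time synchronization algorithm for distributed systems based on frequency adjustment   Total citations: 1, self-citations: 0, citations by others: 1
赵斌, 贺鹏, 易娜 《计算机应用》 (Journal of Computer Applications) 2007,27(4):814-817
To reduce how often NTP time servers on the Internet are accessed, and so relieve their excessive resource load, a time synchronization algorithm based on frequency adjustment is proposed for distributed systems. Experiments show that, while maintaining synchronization accuracy, the algorithm performs better than traditional time synchronization algorithms built on phase adjustment.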

17.
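To solve the poor real-time performance and wasted system resources caused by thrashing when the LSF scheduling algorithm is used for real-time scheduling, a task importance coefficient is introduced into LSF. The cloud model is used to express the task importance coefficient and the slack quantitatively, and a two-dimensional cloud model, jointly determined by the cloud models of these two task parameters (importance coefficient and slack), sets a preemption threshold for each task: a ready task may preempt the current task only if its priority is higher than the current task's preemption threshold. Simulation results show that the cloud-model-optimised LSF algorithm not only eliminates thrashing effectively but also lets urgent and important tasks run first.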
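The effect of a preemption threshold on thrashing can be shown with a small sketch. It substitutes a fixed slack margin for the paper's cloud-model threshold, which the abstract does not define; the task layout, the `margin` parameter, and the function names are all assumptions for illustration.

```python
def slack(task, now):
    """Slack (laxity): time left until the deadline minus remaining work."""
    return task["deadline"] - now - task["remaining"]


def should_preempt(current, candidate, now, margin=0.0):
    """Preempt only if the candidate's slack beats the current task's by `margin`.

    margin=0.0 is plain least-slack-first. A positive margin plays the role of
    a preemption threshold (a fixed stand-in for the cloud-model threshold).
    """
    return slack(candidate, now) < slack(current, now) - margin
```

Under plain LSF, two tasks with nearly equal slack keep overtaking each other as the running task's slack stays constant while the waiting task's slack shrinks. Requiring the candidate to win by a margin removes those back-and-forth switches.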

18.
Elman neural networks provide dynamic mapping when processing nonlinear complex data. Because the Elman network inherits some features of the back-propagation neural network, it shares many of its defects: it easily falls into local minima, its learning rate is fixed, the number of hidden-layer neurons is uncertain, and so on. These defects affect processing accuracy. We therefore optimize the weights, thresholds, and number of hidden-layer neurons of Elman networks with a genetic algorithm, which improves the training speed and generalization ability of the networks and yields the optimal algorithm model. Instance analysis shows that the new algorithm is superior to the traditional model in convergence rate, prediction error, number of successful training runs, etc. This demonstrates the effectiveness of the new algorithm and suggests it deserves wider use.

19.
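The network pricing problem for non-cooperative users is studied. Applying the leader-follower (Stackelberg) strategy from game theory to pricing, the paper first analyses the conditions the price must satisfy to maximise the leader's revenue at Nash equilibrium; then, taking supply and demand in the network market into account, it treats the Nash equilibrium point as the supply-demand equilibrium point and determines the corresponding price; finally, a numerical example gives the users' rates and the network's revenue at Nash equilibrium. The results show that a reasonable price-control strategy encourages users to use network resources sensibly while bringing the operator optimal revenue.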

20.
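For the resource allocation problem of orthogonal frequency-division multiple access (OFDMA) systems in aeronautical communication environments, a joint time-frequency resource allocation algorithm based on particle swarm optimization (PSO) is proposed, with the objective of maximising the sum of user-node utilities under the constraint of limited channel resources. The algorithm encodes particle positions with discrete variables and constructs new probability-based update rules for particle velocity and position in the discrete space. Simulation results show that the proposed algorithm outperforms existing resource allocation algorithms in utility sum, fairness, and other respects.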
