1.

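Ye Ying, Liu Peilin 《计算机与现代化》2013,(4):48-52,56

The development of many-core architectures and the spread of shared data models mean that purely shared and purely private storage architectures are no longer adequate. Cooperative Caching, which combines the two, performs well on multi-core systems, but on many-core systems it keeps only a single copy of each evicted data block, which causes many long-distance on-chip memory accesses, increases on-chip communication, and hurts overall performance. To address this, the paper proposes an adaptively shared private storage architecture that allows multiple copies of an evicted block to be retained and adaptively controls how many are kept. Simulation results show that, compared with Cooperative Caching, the architecture reduces on-chip traffic by 12.8% on average and by up to 32.7%, and improves overall performance by 9.1%, demonstrating that it performs well in many-core, shared-data environments.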

2.

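SmartCache: Make Your Data Fly

《网管员世界》2014,(23):16-16

Background: With the rapid development of multi-core CPU technology, CPU processing power has far outstripped that of hard disks. The performance gap between front-end application systems and back-end storage systems keeps widening, and applications spend most of their time waiting for the storage system to respond, so however powerful the server's CPUs are, overall application performance remains low. The traditional remedy for the speed mismatch between the CPU and back-end storage is to add expensive DRAM, but as back-end storage capacity keeps growing, this approach is no longer effective.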

3.

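An Extension of Amdahl's Law to Hierarchical Chip Multiprocessors

Chen Shuming, Chen Shenggang, Yin Yaming 《计算机研究与发展》2012,49(1):83-92

Hierarchical chip multiprocessors build supernodes from multiple tightly coupled cores, support locality in memory access and on-chip communication well, and can effectively reduce the overhead of data communication among on-chip cores. Building on existing Amdahl cost/performance models for multi-core processors, the paper introduces on-chip data communication latency as a new element of the Amdahl task computation cost and constructs an extended Amdahl speedup model for hierarchical chip multiprocessors. Using this model, it studies how speedup relates to supernode configuration. Simulation and analysis show that good speedup requires a careful trade-off between the number of supernodes and the supernode size (the number of cores per supernode). For a given total core count, the supernode size that maximizes system performance usually lies at an intermediate value rather than at the maximum or minimum, and that value changes with system scale.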
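The trade-off described in this abstract can be illustrated with a toy calculation. The sketch below is not the paper's model: it takes the classic Amdahl speedup and adds a purely hypothetical communication term whose intra-supernode cost grows with supernode size and whose inter-supernode cost grows with the number of supernodes. The function names, the coefficients `a` and `b`, and the linear cost form are all assumptions made for illustration.

```python
def speedup(f, n_cores, k, a=0.0005, b=0.0005):
    """Amdahl speedup with a toy communication penalty.

    f        parallel fraction of the work (0..1)
    n_cores  total number of cores on the chip
    k        cores per supernode (assumed to divide n_cores)
    a, b     hypothetical cost coefficients, not taken from the paper
    """
    supernodes = n_cores / k
    # Assumed cost model: intra-supernode cost grows with k,
    # inter-supernode cost grows with the number of supernodes.
    comm = a * k + b * supernodes
    return 1.0 / ((1.0 - f) + f / n_cores + comm)


def best_supernode_size(f, n_cores, a=0.0005, b=0.0005):
    """Return the supernode size (a divisor of n_cores) with the highest speedup."""
    sizes = [k for k in range(1, n_cores + 1) if n_cores % k == 0]
    return max(sizes, key=lambda k: speedup(f, n_cores, k, a, b))


if __name__ == "__main__":
    for n in (64, 256):
        print(n, best_supernode_size(0.99, n))
```

Under these assumed costs the best supernode size falls at an intermediate value and shifts as the core count grows, which mirrors the qualitative finding reported above. With `a = b = 0` the function reduces to the ordinary Amdahl formula.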

4.

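A PCI-Bus-Based Multichannel Scaler (total citations: 2; self-citations: 0; citations by others: 2)

Han Hui, Xue Zhihua, Liu Songqiu, Li Pengyu, Zhao Jie 《计算机测量与控制》2005,13(5):491-493

To meet the needs of nuclear radiation measurement experiments, a multichannel scaler system based on the PCI bus was developed. On the hardware side, the paper focuses on the design of the multichannel scaler's large-scale programmable logic (PLD) and on PCI interface design in nuclear instrumentation systems. On the software side, it describes a graphical soft front panel for the multichannel scaler built with the LabVIEW graphical programming platform, which makes the instrument automated and intelligent. Experimental results show that the system is stable and reliable, performs considerably better, and is simple and intuitive to operate.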

5.

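An Embedded Real-Time Operating System Architecture Design for Heterogeneous Multi-core Processors (total citations: 3; self-citations: 1; citations by others: 2)

Jiang Jianchun, Wang Tongqing 《计算机科学》2011,38(6):298

Because the architecture of heterogeneous multi-core processors differs greatly from that of multiprocessor systems and homogeneous multi-core processors, the distributed operating system structure used for multiprocessor systems and the master-slave structure used for homogeneous multi-core systems cannot solve the real-time scheduling and efficiency problems of heterogeneous multi-core processors. After studying the characteristics and development trends of heterogeneous multi-core processors, the paper proposes a multi-master real-time operating system architecture suited to them. The architecture brings the multi-master mode of communication buses into the multi-core operating system and designs the operating system model with a symmetric structure and a component-based approach, so that every core can act as a master core and manage resources and tasks in real time. This improves system performance and removes the bottleneck of master-slave operating systems, in which the master core can no longer meet performance requirements as the number of cores grows. This single architectural model can be flexibly configured to suit processor cores with different structures and functional requirements, reducing the difficulty of operating system development.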

6.

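Modeling, Implementation and Optimization of LTE System-Level Simulation

He Jinwei, Ding Lu, Liu Zhimin 《计算机工程与应用》2012,48(22):104-108

Long Term Evolution (LTE) is the next-generation mobile communication system, and its overall performance has to be evaluated through system-level simulation. How an LTE system is designed, modeled and implemented directly affects the validity of the simulation platform, yet the more fully featured platforms available today are generally slow. To address this, the paper presents a modeling framework for an LTE system-level simulation platform and uses multi-core CPUs and OpenMP parallel computing to optimize the platform's most time-consuming modules, significantly improving simulation efficiency. The platform's performance is evaluated by comparing different scheduling algorithms. The platform meets 3GPP's system design requirements and lays a foundation for LTE-Advanced standardization work.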

7.

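A Trace-Driven Parallel Performance Simulation Method for Multi-core SMP Clusters

Weng Yufen, Xu Chuanfu, Che Yonggang, Fang Jianbin, Wang Zhenghua 《计算机工程与科学》2009,31(Z1)

Based on a hierarchical performance model for new multi-core SMP clusters, this paper implements Sim-MSC, a trace-driven parallel performance simulator for multi-core SMP clusters, on top of the BigSim parallel performance simulator. Tests with the jacobi3D program on a host platform consisting of an InfiniBand multi-core SMP cluster show that Sim-MSC can simulate the execution characteristics of MPI message-passing parallel applications on multi-core SMP clusters and accurately predict system and application performance.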

8.

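Kernel Function Selection for Support Vector Machine Models in Speaker Recognition

《计算机应用与软件》2013,(4)

Support vector machines (SVMs) are widely used in text-independent speaker identification systems, and the choice of kernel function affects recognition performance. Accordingly, comparative experiments with linear, polynomial and radial basis function kernels were carried out on the TIMIT corpus. The experiments show that the polynomial kernel gives the best recognition performance when the polynomial degree is 6, reaching a recognition rate of 82.88%.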
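For readers unfamiliar with the three kernels compared in this abstract, the following minimal sketch writes out their textbook definitions in plain Python. Only the polynomial degree of 6 comes from the abstract; the `gamma` and `coef0` parameters and their default values are assumptions, and nothing here reproduces the paper's features, training setup or results.

```python
import math


def linear_kernel(x, y):
    """K(x, y) = <x, y>"""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=6, gamma=1.0, coef0=1.0):
    """K(x, y) = (gamma * <x, y> + coef0) ** degree.

    degree=6 is the value the abstract reports as best;
    gamma and coef0 are assumed defaults, not from the paper.
    """
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    u, v = [1.0, 2.0], [3.0, 4.0]
    print(linear_kernel(u, v), polynomial_kernel(u, v), rbf_kernel(u, v))
```

In practice a library SVM would be given the kernel choice as a parameter. The functions above only show what each kernel computes for a pair of feature vectors.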

9.

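The Dream Continues: Meizu MX Quad-Core

《电脑迷》2012,(17):14

Continuing a classic: highlights of the Meizu MX such as the Micro SIM, built-in battery and fixed internal storage reappear on the MX Quad-Core. From playing down hardware specifications to putting performance first, Meizu has never stopped pursuing performance while still valuing overall balance and system optimization. The Meizu MX Quad-Core replaces the Exynos4210 processor with the more advanced Exynos4212. Both are based on the ARM Cortex-A9 architecture, but the Exynos4212 improves on clock frequency and process technology and delivers better performance and power consumption.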

10.

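A Data-Prefetching-Based Last-Level Cache Optimization Method for Multi-core Processors

Shan Shuchang, Hu Yu, Li Xiaowei 《计算机辅助设计与图形学学报》2012,24(9):1241-1248

Last-level cache performance has become a key factor in the overall performance of multi-core processors. Exploiting the similarity of the cores' memory access behavior when a multi-core processor runs parallel programs, the paper proposes a data prefetching method that lowers the miss rate. It first records each core's miss history, then analyzes that history to predict how last-level cache misses correlate across cores, and uses prefetching to supply data blocks to a core's last-level cache before the core incurs a read miss. Experimental results show that for 4-core and 16-core systems the method reduces the last-level cache miss rate by 9.8% and 18.4% and improves performance by 4.0% and 12.4%, respectively.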

11.

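Research on Several Issues of Resource Management on Multi-core Processor Platforms

Liu Yufang 《计算机科学》2012,39(106):441-443,463

In a multi-core processor architecture, multiple processor cores are integrated on one chip and share a variety of on-chip hardware resources. The paper introduces the development of multi-core processor technology, raises issues in resource management on multi-core platforms, and discusses in some depth several key problems in processor management and shared cache management.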

12.

A Multi-core Processor Synchronization Technique Based on Instruction Cache Invalidation

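Guo Jianjun, Dai Kui, Wang Zhiying 《计算机工程与应用》2009,45(4):1-3

In shared-memory multi-core processors, busy-waiting is commonly used to implement synchronization operations such as locks and barriers. These typical mechanisms tend to suffer from long synchronization latency and resource contention, so they scale poorly. They also require repeated memory accesses, which interfere with normal memory operations and increase the bandwidth demanded of the memory system. The paper proposes a synchronization technique based on instruction cache invalidation for multi-core processors with a synchronous data-triggered architecture. At a synchronization point, the instruction cache line about to be executed is invalidated, causing an instruction fetch miss and a fetch request to the L2 cache. A filtering mechanism in the L2 cache declines to serve fetch requests from cores that do not yet satisfy the synchronization condition, stalling those cores and thereby achieving synchronization. Tests show that the method has some advantages in both scalability and synchronization performance.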

13.

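A Multi-core Computation Model Based on Horizontal Locality

Yuan Liang, Zhang Yunquan 《计算机科学》2012,39(7):1-6

On-chip multi-core has become the way to extend Moore's law, and parallel algorithm design, programming models, compilers and runtime systems all need computation models for analysis. Existing multi-core models capture contention among threads for shared caches and other resources fairly accurately, but pay little attention to data sharing among threads. The paper introduces the concepts of horizontal locality of the cache shared among threads and of task sharing rate, and on this basis extends the serial memory-hierarchy model RAM(h) into MRAM(h), a multi-core parallel computation model that accounts for the task sharing rate.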

14.

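A Multi-core Parallel Synchronous Simulation Platform Based on the 动芯 Baseband Chip

Zhou Ming 《单片机与嵌入式系统应用》2017,17(2)

To verify and analyze physical-layer code, the paper proposes a multi-core simulation platform based on the 动芯 baseband chip. The platform uses multithreading, with shared memory providing inter-core communication and semaphores providing inter-core synchronization. Experimental results show that the platform correctly simulates parallel execution across the cores and verifies the correctness of the physical-layer code. The platform played a major role in the design and implementation of the 动芯 baseband chip.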
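The mechanism named in this abstract, threads standing in for cores, shared memory for communication and semaphores for synchronization, can be sketched generically. The example below is an assumption-laden illustration and not the platform's code: the function name `run_two_core_exchange`, the one-slot mailbox and the two-semaphore handshake are all hypothetical choices.

```python
import threading


def run_two_core_exchange(messages):
    """Simulate two cores as threads passing messages through shared memory.

    A one-slot mailbox (shared memory) carries the data; two semaphores
    keep the writer and reader in lockstep. Hypothetical design, not the
    simulation platform described in the abstract.
    """
    mailbox = {"value": None}               # memory shared by both "cores"
    slot_free = threading.Semaphore(1)      # writer may fill the slot
    slot_full = threading.Semaphore(0)      # reader may drain the slot
    received = []

    def core_a():
        for message in messages:
            slot_free.acquire()             # wait until the slot is empty
            mailbox["value"] = message
            slot_full.release()             # signal that data is ready

    def core_b():
        for _ in messages:
            slot_full.acquire()             # wait until data is ready
            received.append(mailbox["value"])
            slot_free.release()             # signal that the slot is empty

    threads = [threading.Thread(target=core_a), threading.Thread(target=core_b)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received


if __name__ == "__main__":
    print(run_two_core_exchange(["frame-0", "frame-1", "frame-2"]))
```

Because each write must be consumed before the next one starts, the receiving side sees the messages in order regardless of how the operating system schedules the two threads.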

15.

On modeling contention for shared caches in multi-core processors with techniques from ecology

Wesley Emeneker, Amy Apon 《Natural computing》2013,12(3):411-428

Multi-core x86_64 processors introduced an important change in architecture, a shared last level cache. Historically, each processor has had access to a large private cache that seamlessly and transparently (to end users) interfaced with main memory. Previously, processes or threads only had to compete for memory bandwidth, but now they are competing for actual space. Competition for space and environmental resources is a problem studied in other scientific domains. This paper introduces methods from ecology to model multi-core cache usage with the competitive Lotka–Volterra equations. A model is presented and validated for characterizing the interaction of cores through shared caching, and for predicting the degree to which different workloads will interfere with each other's execution from cache contention.
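As background for the equations named in this abstract, here is a minimal sketch of the standard competitive Lotka-Volterra system, dx_i/dt = r_i x_i (1 - sum_j alpha_ij x_j / K_i), integrated with a simple forward Euler step. Reading x_i as the share of cache held by workload i is an assumption made for illustration. The parameter values, the function name and the integration scheme are hypothetical and are not taken from the paper.

```python
def simulate_competition(x0, r, K, alpha, dt=0.01, steps=5000):
    """Forward-Euler integration of the competitive Lotka-Volterra equations.

    x0     initial populations (here: assumed cache shares per workload)
    r      growth rates
    K      carrying capacities
    alpha  competition matrix, with alpha[i][i] == 1
    All values are illustrative assumptions.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        dx = []
        for i in range(n):
            # Total competitive pressure felt by species i.
            pressure = sum(alpha[i][j] * x[j] for j in range(n))
            dx.append(r[i] * x[i] * (1.0 - pressure / K[i]))
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x


if __name__ == "__main__":
    final = simulate_competition(
        x0=[0.1, 0.1],
        r=[1.0, 1.0],
        K=[1.0, 1.0],
        alpha=[[1.0, 0.5], [0.5, 1.0]],
    )
    print(final)
```

For two symmetric competitors with cross-competition 0.5 and capacity 1, the coexistence equilibrium is K / (1 + 0.5) = 2/3 for each, which the simulation approaches from the small starting values.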

16.

Exploiting multi-core nodes in peer-to-peer grids

Jaehwan Lee, Pete Keleher, Alan Sussman 《Journal of Parallel and Distributed Computing》2014

While the majority of CPUs now sold contain multiple computing cores, current grid computing systems either ignore the multiplicity of cores, or treat them as distinct, independent machines. The latter approach ignores the resource contention present between cores in a single CPU, while the former approach fails to take advantage of significant computing power. We provide a decentralized resource management framework for exploiting multi-core nodes to run multi-threaded applications in peer-to-peer grids. We present two new load-balancing schemes that explicitly account for the resource sharing and contention of multiple cores, and propose a parameterized performance prediction model that can represent a continuum of resource sharing among cores of a CPU. We use extensive simulation to confirm that our two algorithms match jobs with computing nodes efficiently, and balance load during the lifetime of the computing jobs.

17.

Dynamic cloud resource management for efficient media applications in mobile computing environments

Gangyong Jia, Guangjie Han, Jinfang Jiang, Sammy Chan, Yuxin Liu 《Personal and Ubiquitous Computing》2018,22(3):561-573

Single-instruction-set architecture (Single-ISA) heterogeneous multi-core processors (HMP) are superior to symmetric multi-core processors in performance per watt. They are popular in many aspects of the Internet of Things, including mobile multimedia cloud computing platforms. A Single-ISA HMP integrates both fast out-of-order cores and slower, simpler cores, all of which share the same ISA. Quality of service (QoS) is most important for virtual machine (VM) resource management in multimedia mobile computing, particularly in Single-ISA heterogeneous multi-core cloud computing platforms. Therefore, in this paper, we propose a dynamic cloud resource management (DCRM) policy to improve the QoS in multimedia mobile computing. DCRM dynamically and optimally partitions shared resources according to service or application requirements. Moreover, DCRM incorporates resource-aware VM allocation to maximize the effectiveness of the heterogeneous multi-core cloud platform. The basic idea behind this performance improvement is to balance shared resource allocations against resource requirements. The experimental results show that DCRM performs better in both response time and QoS, indicating that DCRM is good at shared resource management in mobile media cloud computing.

18.

Proactive Use of Shared L3 Caches to Enhance Cache Communications in Multi-Core Processors

Sevin Fide, Stephen Jenks 《Computer Architecture Letters》2008,7(2):57-60

The software and hardware techniques to exploit the potential of multi-core processors are falling behind, even though the number of cores and cache levels per chip is increasing rapidly. There is no explicit communications support available, and hence inter-core communications depend on cache coherence protocols, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we present Software Controlled Eviction (SCE) to improve the performance of multithreaded applications running on multi-core processors by moving shared data to shared cache levels before it is demanded from remote private caches. Simulation results show that SCE offers significant performance improvement (8-28%) and reduces L3 cache misses by 88-98%.

19.

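A Dynamic Mapping Method for Heterogeneous Multi-core Systems under Thermal Safety Constraints

An Xin, Yang Haijiao, Li Jianhua, Ren Fuji 《计算机应用》2021,41(9):2631-2638

Heterogeneous multi-core platforms give system designers flexibility by integrating different types of processing cores, so that applications can dynamically choose among core types according to their needs and run efficiently. As semiconductor technology advances, the number of cores integrated on a single chip grows, giving modern multi-core processors higher power density. This raises chip temperature and ultimately degrades system performance. To fully exploit the performance advantages of heterogeneous multi-core systems, the paper proposes a dynamic mapping method that maximizes system performance while staying within the thermal safe power. The method determines the mapping from two kinds of heterogeneity. The first is core type: different types of cores have different characteristics and therefore suit different applications. The second is thermal susceptibility: different core positions on the chip differ in thermal susceptibility, and cores nearer the center receive more heat transferred from other cores and therefore run hotter. Accordingly, a neural-network-based performance predictor matches threads to core types, and the thermal safe power (TSP) model maps the matched threads to specific positions on the chip. Experimental results show that, compared with the common round-robin scheduling (RRS), the method raises the average number of instructions executed per clock cycle, that is, instructions per cycle (IPC), by about 53% while respecting the thermal safety constraint.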

20.

Architectural support for thread communications in multi-core processors

Sevin Varoglu, Stephen Jenks 《Parallel Computing》2011,37(1):26-41

In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction level parallelism (ILP), to improving multithreaded application performance by supporting thread level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communications among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communications support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communications depend on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communications support for multithreaded applications. Prepushing is a software controlled data forwarding technique that sends data to destination’s cache before it is needed, eliminating cache misses in the destination’s cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communications by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.