
1.

An analytical study of resource division and its impact on power and performance of multi-core processors

Saravanan Vijayalakshmi Alagan Anpalagan D. P. Kothari Isaac Woungang Mohammad S. Obaidat 《The Journal of supercomputing》2014,68(3):1265-1279

The study and development of chip multi-processors (CMPs) are of utmost importance for the creation of future technologies. Devising a theoretical analysis of the micro-architecture model for the power/performance of CMPs is still a challenge. This paper addresses this problem by (1) introducing an analytical model for quantitatively measuring the power and performance of a processor, (2) analyzing the effects of resource division on power consumption and performance when executing a given benchmark, and (3) predicting the optimum number of cores on which to run the benchmark. Our analytically derived results show that, to achieve power/performance gains, the optimum number of cores must be between 8 and 16.

2.

A queueing theoretic approach for performance evaluation of low-power multi-core embedded systems

Arslan Munir Ann Gordon-Ross Sanjay Ranka Farinaz Koushanfar 《Journal of Parallel and Distributed Computing》2014

With Moore’s law supplying billions of transistors on-chip, embedded systems are undergoing a transition from single-core to multi-core to exploit this high transistor density for high performance. However, the optimal layout of these multiple cores along with the memory subsystem (caches and main memory) to satisfy power, area, and stringent real-time constraints is a challenging design endeavor. The short time-to-market constraint of embedded systems exacerbates this design challenge and necessitates the architectural modeling of embedded systems to reduce the time-to-market by expediting the mapping of target applications to devices/architectures. In this paper, we present a queueing theoretic approach for modeling multi-core embedded systems that provides a quick and inexpensive performance evaluation, in terms of both time and resources, as compared to the development of multi-core simulators and running benchmarks on these simulators. We verify our queueing theoretic modeling approach by running SPLASH-2 benchmarks on the SuperESCalar simulator (SESC). Results reveal that our queueing theoretic model qualitatively evaluates multi-core architectures accurately with an average difference of 5.6% as compared to the architectures’ evaluations from the SESC simulator. Our modeling approach can be used for performance per watt and performance per unit area characterizations of multi-core embedded architectures, with varying numbers of processor cores and cache configurations, to provide a comparative analysis.
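As a rough illustration of why a queueing model is cheaper to evaluate than a simulator, the sketch below computes closed-form metrics for a single M/M/1 queue. This is an assumption made for illustration only: the abstract does not say which queueing network the authors use, and the function names and the arrival/service rates are hypothetical.

```python
def mm1_utilization(arrival_rate, service_rate):
    """Fraction of time the server is busy in an M/M/1 queue."""
    return arrival_rate / service_rate


def mm1_response_time(arrival_rate, service_rate):
    """Mean time a job spends in an M/M/1 queue (waiting plus service).

    Only valid when the queue is stable, i.e. arrival_rate < service_rate.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical rates: a core that serves 10 requests per ms and receives 8 per ms.
rho = mm1_utilization(8.0, 10.0)
latency = mm1_response_time(8.0, 10.0)
```

Evaluating such formulas takes microseconds, whereas a cycle-level simulation of the same configuration has to execute the benchmark itself.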

3.

Design and verification of QoS for a multi-core processor direct-connection interface

罗莉周宏伟周理潘国腾周海亮刘彬《计算机工程与科学》2021,43(4):620-627

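Directly interconnecting multi-core processors to build multi-socket parallel systems has long been the main way to increase the parallelism of high-performance computers. This paper studies the QoS design of the direct-connection interface of a multi-core processor, through which cache-coherence messages are transmitted across chips efficiently and reliably to realize an SMP system with shared main memory. The key techniques of the QoS design at each protocol layer of the interface are described in detail, a reusable verification platform is built on the UVM methodology, and simulation verifies the correctness of the QoS design...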

4.

Variable-size mosaics: A process-variation aware technique to increase the performance of tile-based, massive multi-core processors

Enric Musoll 《Computers & Electrical Engineering》2011,37(6):1193-1211


5.

A comparative simulation study on the power–performance of multi-core architecture

Vijayalakshmi Saravanan Alagan Anpalagan D. P. Kothari Isaac Woungang Mohammad S. Obaidat 《The Journal of supercomputing》2014,70(1):465-487

Nowadays, the multi-core processor is the main technology used in desktop PCs, laptop computers and mobile hardware platforms. As the number of cores on a chip keeps increasing, complexity grows and the impact on both the power and the performance of a processor becomes larger. In multi-processors, the number of cores and various parameters, such as issue-width, number of instructions and execution time, are key design factors to balance the amount of thread-level parallelism and instruction-level parallelism. In this paper, we perform a comprehensive simulation study that aims to find the optimum number of processor cores in desktop/laptop computing processor models with shallow pipeline depth. This paper also explores the trade-off between the number of cores and different parameters used in multi-processors in terms of power–performance gains and analyzes the impact of 3D stacking on the design of simultaneous multi-threading and chip multiprocessing. Our analysis shows that the optimum number of cores varies with different classes of workloads, namely: SPEC2000, SPEC2006 and MiBench. A simulation study is presented using architectures with shorter pipeline depth, showing that (1) the optimum number of cores for power–performance is 8, (2) the optimum number of threads is in the range [2, 4], and (3) beyond 32 cores, multi-core processors are no longer efficient in terms of performance benefits and overall power consumption.

6.

Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity

Zhi-xiang Chen Zhao-lin Li Shan Cao Fang Wang Jie Zhou 《浙江大学学报:C卷英文版》2015,16(12):1018-1033


7.

A neural network-based approach for the performance evaluation of branch prediction in instruction-level parallelism processors

Nain Sweety Chaudhary Prachi 《The Journal of supercomputing》2022,78(4):4960-4976

Branch prediction is essential for improving the performance of pipeline processors. As the number of pipeline stages in modern processors increases, an accurate...

8.

Parallel Light Speed Labeling: an efficient connected component algorithm for labeling and analysis on multi-core processors

Laurent Cabaret Lionel Lacassagne Daniel Etiemble 《Journal of Real-Time Image Processing》2018,15(1):173-196

In the last decade, many papers have been published to present sequential connected component labeling (CCL) algorithms. As modern processors are multi-core and trend toward many cores, the design of a CCL algorithm should address parallelism and multithreading. After a review of sequential CCL algorithms and a study of their variations, this paper presents the parallel version of the Light Speed Labeling (LSL) for connected component analysis (CCA) and compares it to our parallelized implementations of state-of-the-art sequential algorithms. We provide some benchmarks that help to figure out the intrinsic differences between these parallel algorithms. We show that, thanks to its run-based processing, the LSL is intrinsically more efficient and faster than all pixel-based algorithms. We also show that all the pixel-based algorithms are memory-bound on multi-socket machines and so are inefficient and do not scale, whereas LSL, thanks to its RLE compression, can scale on such high-end machines. On a 4 × 15-core machine, and for 8192 × 8192 images, LSL outperforms its best competitor by a factor of 10.8 and achieves a throughput of 42.4 gigapixels labeled per second.
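To make "run-based processing" concrete, here is a minimal sketch of the run-length encoding step only: each image row is reduced to a list of foreground runs, so later stages touch a few intervals instead of every pixel. This is not the LSL algorithm itself, and the function name `row_runs` and the half-open interval representation are assumptions chosen for illustration.

```python
def row_runs(row):
    """Return half-open [start, end) intervals of consecutive non-zero pixels."""
    runs = []
    start = None  # index where the current foreground run began, if any
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                      # a new run begins
        elif not pixel and start is not None:
            runs.append((start, x))        # the current run ends before x
            start = None
    if start is not None:
        runs.append((start, len(row)))     # run reaches the end of the row
    return runs


# A 10-pixel binary row is compressed to three runs.
example = row_runs([0, 1, 1, 0, 1, 0, 0, 1, 1, 1])
```

The amount of data handled after this step depends on the number of runs rather than the number of pixels, which is consistent with the abstract's point about memory-bound pixel-based algorithms.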

9.

Modelling and performance analysis of an adaptive state-transition approach for power saving in Bluetooth

《Simulation Modelling Practice and Theory》2013

In this paper, an approach is developed to improve the power efficiency of Bluetooth. The better efficiency is achieved by reducing the unnecessary polling operations in the Basic Rate/Enhanced Data Rate (BR/EDR) controllers. An analysis of the current low power modes in the Bluetooth BR/EDR controller indicates that their activation requires a critical and challenging parameter negotiation phase. These parameters have a wide range of choices and, as a result, the associated low power modes are typically ignored by Bluetooth application developers. The new approach is based upon multiple polling intervals. It is shown that three different polling intervals (small, medium and large) are sufficient for a broad range of data traffic scenarios. As the kernel idea, each controller runs a common algorithm to choose among the three polling intervals and adaptively switches the link state between the active data transfer state and idle. The state-transition rules are derived, and a system model is established based on the Hidden Markov Model (HMM), which is used to analyze and design the new Bluetooth link state-transition algorithm. The simulation and analysis demonstrate significant power saving and relatively low average end-to-end packet delay for this state-transition based approach, in comparison to the conventional polling system and the low power sniff mode. Moreover, the state-transition approach enables easier parameter setting that can be further optimized for a specific Bluetooth scenario.

10.

A SCOR based approach for measuring a benchmarkable supply chain performance

Batuhan Kocaoğlu Bahadır Gülsün Mehmet Tanyaş 《Journal of Intelligent Manufacturing》2013,24(1):113-132

Performance measurement can only help to identify the problems existing in the current supply chain; it does not help in exploring the root causes of these problems or in choosing corresponding actions to improve supply chain performance. The conflict between the top-down strategy decomposition and the bottom-up implementation process is serious. Therefore, to overcome these issues, it is necessary to link strategic objectives to operations, which could help managers, especially those operating at a strategic level, to better understand the operational mechanisms of supply chains. In this study, an integrated approach that employs the analytic hierarchy process (AHP) and the technique for order preference by similarity to ideal solution (TOPSIS) together is proposed for linking strategic objectives to operations. The supply chain operations reference (SCOR) model is used to model the linkage of the strategic objectives and operational metrics in a hierarchical way. The AHP is used to analyze this metric hierarchy and determine the weights of the metrics, and the TOPSIS method is used to normalize metric values that have different units so that they can be compared. The proposed approach is applied to a decision-making problem in a manufacturing company. Company managers found the application and results satisfactory and implementable in their decisions.
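The TOPSIS step described above can be sketched as follows. This is a generic textbook form of TOPSIS, not the authors' implementation; the weights are assumed to come from a separate AHP step, and the metric names and all numbers are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns); higher is better.

    benefit[j] is True when a larger value of criterion j is preferable.
    """
    n = len(weights)
    # Vector-normalize each column so metrics in different units become comparable.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_neg = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, anti)))
        scores.append(d_neg / (d_pos + d_neg))  # relative closeness to the ideal
    return scores


# Hypothetical metrics: fill rate (higher is better), cycle time in days (lower is better).
alternatives = [[0.95, 12.0], [0.90, 8.0], [0.85, 15.0]]
scores = topsis(alternatives, weights=[0.6, 0.4], benefit=[True, False])
```

In this toy data the third alternative is worst on both metrics, so it coincides with the anti-ideal point and receives a score of 0.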

11.

A framework for memory contention analysis in multi-core platforms

Dakshina Dasari Vincent Nelis Benny Akesson 《Real-Time Systems》2016,52(3):272-322


12.

A survey of methods for improving sparse directory performance in multi-core caches

吴健虢陈海燕刘胜邓让钰陈俊杰《计算机工程与科学》2019,41(3):385-392

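Constrained by power consumption, general-purpose microprocessors stopped pursuing higher clock frequencies more than a decade ago and turned to integrating more processor cores. Meanwhile, as transistor density keeps rising in line with Moore's law, the number of cores that can be integrated on a single chip has multiplied, and on-chip multi-core and many-core processors have become the mainstream of high-performance microprocessor development. Support for the shared-memory programming model in future thousand-core general-purpose many-core processors is an inevitable trend, but traditional cache-coherence directory structures suffer from high lookup latency, frequent directory-entry replacement, and limited scalability in hardware cost and power. The sparse directory strikes a trade-off between the hardware overhead of traditional directory structures and the efficiency of coherence maintenance, and is regarded as an energy-efficient, scalable structure for maintaining cache coherence in many-core processors. This paper surveys recent research and methods for improving sparse directory performance, analyzes them in terms of area, access latency, power, and implementation complexity, and summarizes the strengths and shortcomings of each method, providing a reference for the innovative design of shared-memory architectures in future high-performance many-core processors.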

13.

A case study in multi-core parallelism for the reliability evaluation of composite power systems

Robert C. Green II Vishakha Agrawal 《The Journal of supercomputing》2017,73(12):5125-5149

The probabilistic evaluation of composite power system reliability is an important but computationally intense task that requires the sampling/searching of a large search space. While multiple methods have been used for performing these computations, a remaining area of research is the impact that modern platforms for parallel computation may have on this computation. Studies have been performed in the past, but they have been primarily limited to cluster-based computing. In addition, the most recent works in this area have used outdated technology or been evaluated using smaller test systems. In the modern era, a wide variety of platforms are available for achieving parallelism in computation, including options like multi-core processors, clusters, and accelerators. Each of these platforms provides unique opportunities for accelerating computation and exploiting scalability. To fill this gap in the research, this study implements and evaluates two methods of parallel computation (batch parallelism and pipeline parallelism) using a multi-core architecture in a cloud computing environment on Amazon Web Services using up to 36 virtual compute cores. Further, the methodologies are contrasted and compared in terms of computation time, speedup, efficiency, and scalability. Results are collected using IEEE reliability test systems, and speedups upwards of 5x are demonstrated across multiple test systems.
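For readers unfamiliar with the metrics named above, speedup and parallel efficiency are conventionally defined as in the sketch below. The timings are hypothetical placeholders, not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-core run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """Speedup per core; 1.0 would mean perfect linear scaling."""
    return speedup(t_serial, t_parallel) / cores


# Hypothetical timings: 120 s on one core, 24 s on 8 cores.
s = speedup(120.0, 24.0)        # 5.0
e = efficiency(120.0, 24.0, 8)  # 0.625
```

Efficiency makes scalability comparisons fairer than raw speedup, because the same speedup obtained on more cores yields a lower efficiency.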

14.

Implementation and experimental study of an ultra-low thermal noise power supply system for testing

《微型机与应用》2014,(16)

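Based on an analysis of thermal noise sources and suppression techniques, external interference is isolated through the power supply scheme, components are screened, and the control circuit is optimized to achieve multi-level low-thermal-noise bias voltage output. Test results show that the RMS output noise can be kept within 0.6 μV, which meets the testing needs of a certain high-sensitivity electronic device.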

15.

A subjective risk analysis approach of container supply chains

Steve Bonsall 《国际自动化与计算杂志》2005,2(1):85-92

1 Introduction. Container supply chains (CSCs), with many complex physical and information flows, have contributed themselves to economic prosperity and also rendered themselves uniquely vulnerable by many risks. In the past decade, some specific events closely related to the risks include the Kobe earthquake which affected supply chains across the globe in 1995; the Asian economic crisis in 1997; the Y2K-related IT problems at the end of the 20th century; the fuel protest of September 20…

16.

Failure modes and analysis of power MOSFETs in switching power supplies (total citations: 2, self-citations: 0, citations by others: 2)

刘松张龙王飞刘瞻《电子技术应用》2013,39(3):64-66

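Drawing on the different failure morphologies of power MOSFETs, this paper discusses how power MOSFETs fail under overcurrent and overvoltage conditions and explains the causes of these failure morphologies. It also analyzes how failure morphologies differ between turn-off and turn-on, providing a basis for judging whether a failure occurred during turn-off or during turn-on. Circuit diagrams for overcurrent and overvoltage testing are given. The paper further analyzes the failure morphologies of power MOSFETs under slow turn-on in dynamic burn-in testing, slow turn-off in battery protection circuits, and prolonged operation in the linear region. Finally, with reference to practical applications, it discusses the mechanism and process of the mixed overcurrent-and-overvoltage failure mode that power MOSFETs commonly exhibit.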

17.

A compumetrical approach for analysis and clustering of computer system performance variables

Niv Ahituv Yoav Benjamini Magid Igbaria 《Computers & Operations Research》1988,15(6)

Various statistical models have been constructed for analyzing the workload variables of a computer system, but most of these models fail to analyze each variable separately and identify job groups by hardware consumption patterns. In this paper we propose a compumetrical approach to analyze the computer system performance variables and to cluster the jobs into homogeneous groups. It involves using univariable and multivariable analysis and graphical methods for analyzing the variables. This approach enables us to explore data thoroughly, to look for patterns and clusters, to confirm or disprove the expected hardware consumption, and to discover new phenomena.

18.

A soft multi-core architecture for edge detection and data analysis of microarray images

George Kornaros 《Journal of Systems Architecture》2010,56(1):48-62

As configurable processing advances, elements from the traditional approaches of both hardware and software development can be combined by incorporating customized, application-specific computational resources into the processor’s architecture, especially in the case of field-programmable-gate-array-based systems with soft-processors, so as to enhance the performance of embedded applications. This paper explores the use of several different microarchitectural alternatives to increase the performance of edge detection algorithms, which are of fundamental importance for the analysis of DNA microarray images. Optimized application-specific hardware modules are combined with efficient parallelized software in an embedded soft-core-based multi-processor. It is demonstrated that the performance of one common edge detection algorithm, namely Sobel, can be boosted remarkably. By exploiting the architectural extensions offered by the soft-processor, in conjunction with the execution of carefully selected application-specific instruction-set extensions on a custom-made accelerating co-processor connected to the processor core, we introduce a new approach that makes this methodology noticeably more efficient across various applications from the same domain, which are often similar in structure. With flexibility to update the processing algorithms, an improvement reaching one order of magnitude over all-software solutions could be obtained. In support of this flexibility, an effective adaptation of this approach is demonstrated which performs real-time analysis of extracted microarray data; the proposed reconfigurable multi-core prototype has been exploited with minor changes to achieve almost 5× speedup.
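For reference, the Sobel operator named above can be written as a plain software baseline, which is the kind of all-software loop that hardware extensions aim to accelerate. The sketch below is a generic pure-Python version, not the paper's implementation; the function name and the choice to leave border pixels at zero are assumptions.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image):
    """Return the gradient magnitude of a 2-D list of numbers.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    p = image[y + j - 1][x + i - 1]
                    gx += KX[j][i] * p
                    gy += KY[j][i] * p
            out[y][x] = math.hypot(gx, gy)
    return out


# A vertical step edge: left half dark, right half bright.
edge = [[0, 0, 10, 10] for _ in range(4)]
magnitude = sobel(edge)
```

Each output pixel is computed independently of the others, which is why the inner loops are natural candidates for custom instructions or a co-processor.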

19.

Loop analysis and design of a flyback switching power supply (total citations: 1, self-citations: 1, citations by others: 0)

岳中哲《电子技术应用》2012,38(6):61-64

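A flyback switching power supply is designed. The compensator parameters are calculated from theory, and experimental tuning shows that the calculated parameters stabilize the loop and are close to the optimized parameters.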

20.

A sliding-mode analysis method for the large-signal stability of PWM-controlled switching converters

周岩王柏林杨长业《控制与决策》2013,28(4):637-640

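This paper studies an important phenomenon in conventional pulse-width modulation (PWM) controlled switching converters: the output signal of the closed-loop regulator intersects the sawtooth comparison signal multiple times, which raises the switching frequency, prevents a constant control frequency from being obtained, and can even keep the system from delivering a stable output. Taking the design of common Buck and Boost switching converters as examples, the large-signal stability conditions of switching converters are studied on the basis of PWM quasi-sliding-mode control theory, and the resulting conclusions agree with the classical "ramp matching" theory. Simulation results verify the correctness of the proposed theory.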