Found 20 similar articles (search time: 15 ms)
1.
R. Govindarajan, Erik R. Altman, Guang R. Gao 《Design Automation for Embedded Systems》2002,6(3):243-275
Exploiting instruction-level parallelism (ILP) is extremely important for achieving high performance in application-specific instruction set processors (ASIPs) and embedded processors. Unlike conventional general-purpose processors, ASIPs and embedded processors typically run a single application and hence must be optimized extensively for it in order to extract maximum performance. Further, the low-power and low-cost requirements of ASIPs may demand reuse of pipeline stages, causing pipelines with complex structural hazards. In such architectures, exploiting higher ILP is a major challenge to the designer. Existing techniques deal with either scheduling hardware pipelines to obtain higher throughput or software pipelining (an instruction scheduling technique for iterative computation) for exploiting greater ILP. We integrate these techniques to co-schedule hardware and software pipelines to achieve greater instruction throughput. In this paper, we develop the underlying theory of Co-Scheduling, called the Modulo-Scheduled Pipeline (or MS-Pipeline) theory. More specifically, we establish the necessary and sufficient condition for achieving the maximum throughput in a given pipeline operating under modulo scheduling. Further, we establish a sufficient condition to achieve a specified throughput, based on which we also develop a methodology for designing hardware pipelines that achieve such a throughput. Finally, we present initial experimental results which help to establish the usefulness of MS-pipeline theory in software pipelining. As the proposed theory helps to analyze and improve the throughput of Modulo-Scheduled Pipelines (MS-pipelines), it is especially useful in designing ASIPs and embedded processors.
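To make the structural-hazard issue in this abstract concrete, the following is a minimal sketch of the standard modulo-conflict test for a single pipeline, not the MS-pipeline theory of the paper itself. It assumes a hypothetical reservation table given as a mapping from stage name to the cycles (relative to issue) in which one operation occupies that stage, and it assumes one operation is initiated every `ii` cycles. The names `has_modulo_conflict` and `min_conflict_free_ii` are illustrative only.

```python
from typing import Dict, List, Optional


def has_modulo_conflict(reservation: Dict[str, List[int]], ii: int) -> bool:
    """Return True if issuing one operation every `ii` cycles makes two
    different initiations claim the same stage in the same cycle."""
    for cycles in reservation.values():
        distinct = set(cycles)
        # Two usages c1 != c2 of one stage collide exactly when
        # c1 and c2 are congruent modulo the initiation interval.
        if len({c % ii for c in distinct}) < len(distinct):
            return True
    return False


def min_conflict_free_ii(reservation: Dict[str, List[int]],
                         max_ii: int = 64) -> Optional[int]:
    """Smallest initiation interval up to `max_ii` without a conflict."""
    for ii in range(1, max_ii + 1):
        if not has_modulo_conflict(reservation, ii):
            return ii
    return None


# Hypothetical pipeline: stage S1 is reused in cycles 0 and 2.
table = {"S1": [0, 2], "S2": [1]}
print(min_conflict_free_ii(table))
```

In this hypothetical table S1 is used twice per operation, so no interval below 2 is possible. An interval of 2 still collides because the two S1 usages are two cycles apart, so the smallest conflict-free interval is 3.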
2.
B. Neumann, T. von Sydow, H. Blume, T. G. Noll 《Journal of Signal Processing Systems》2008,53(1-2):129-143
This paper presents a novel architecture combining an application-specific instruction set processor (ASIP) core and an application-domain-specific embedded FPGA (eFPGA) used as a flexible accelerator for the ASIP. The eFPGA is based on a parametrisable architecture template optimised for arithmetic-oriented applications. It was designed as a physically optimised VLSI macro using a flexible design methodology that is also sketched in this paper. Quantitative comparisons of the eFPGA with a commercial standard FPGA show significant improvements in energy, area and timing delays. Simulations of the new ASIP-eFPGA architecture have been conducted using a model-based approach to evaluate its efficiency. The results show that power- and area-efficiencies similar to an FPGA can be achieved for the flexible ASIP-eFPGA while preserving the flexibility of a software-programmable processor.
3.
4.
《Solid-State Circuits, IEEE Journal of》1975,10(6):437-447
Describes a new method using emitter current crowding for performing accurate multiplication of analog signals using devices of special geometry but capable of fabrication with a standard bipolar process. A narrow region of current injection (a carrier domain) can be positioned on an emitter by one electrical input and controlled in magnitude by a second input. The resistive epi layer resolves this current into a differential output proportional to the product of the inputs. A key advantage of these multipliers is their low noise. The basic principle can be applied to many other nonlinear operations. A two-quadrant and a four-quadrant multiplier are described.
5.
《Electron Devices, IEEE Transactions on》1960,7(3):179-185
A design method for crossed-field guns based on a space-charge-flow solution in crossed fields is given. By using the method of analytic continuation in the complex plane, it is shown that it is possible to find the exact form of the electrodes required. The design results in a gun similar to the French "short gun", with the great advantage that the current emitted from the gun and the current density at the cathode can be predicted. It is also shown that by making certain approximations to the exact space-charge-flow solution, a new type of gun can be designed, a "long gun", which can have extremely high convergence. The theory for this latter gun is extremely simple and the electrode shapes can be given entirely in analytic form.
6.
The authors consider the class of conformal antennas consisting of bounded smooth closed curves in two dimensions and determine the surface field which maximizes the power radiated in an angular sector. The problem is cast as one of optimal control, with the control set consisting of the surface current, constrained to have energy bounded by some constant, and the cost functional taken to be the far-field power radiated in an angular sector. A constructive algorithm is presented for approximating both the optimal value of the cost functional and the surface current which produces this optimal value. Both TE and TM polarizations are considered.
7.
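Millimeter-wave filters and diplexers are studied using reflection group-delay theory combined with three-dimensional electromagnetic simulation software, and a diplexer required for millimeter-wave effect experiments is designed. Three-dimensional electromagnetic simulation gives a return loss greater than 20 dB at every port and an isolation greater than 60 dB between the input ports. The simulation results show that the design method not only yields a design quickly but also produces fairly accurate results. The diplexer will be used in millimeter-wave injection-effect experiments.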
8.
The objective of this paper is to present a theory, constraints, and a design method for nonlinear-phase halfband and Mth-band filters. Based on a time-domain property, the constraints and properties in the frequency domain are derived. They are a generalization of the well-known conditions for linear-phase Mth-band filters. Having found all necessary conditions, we present a design method based on an eigenfilter that minimizes the mean-squared error. The design method is also extended to the design of nonlinear-phase Mth-band filters with R-regularity, equiripple stopband attenuation, or impulse responses that have complex coefficients. Design examples of various Mth-band filters with different properties are presented, discussed, and compared with the linear-phase case.
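The abstract does not spell out the time-domain property it starts from. As an assumption, the sketch below uses the usual Mth-band (Nyquist) condition: the impulse response equals 1/M at a reference index and is zero at every other index that differs from it by a multiple of M. The helper names `is_mth_band` and `truncated_sinc`, and the truncated-sinc test filter, are illustrative and are not taken from the paper.

```python
import math
from typing import List


def is_mth_band(h: List[float], m: int, center: int, tol: float = 1e-9) -> bool:
    """Check the assumed Mth-band condition around index `center`:
    h[center] == 1/m and h[center + k*m] == 0 for every k != 0."""
    if abs(h[center] - 1.0 / m) > tol:
        return False
    for n, value in enumerate(h):
        if n != center and (n - center) % m == 0 and abs(value) > tol:
            return False
    return True


def truncated_sinc(m: int, half_len: int) -> List[float]:
    """Ideal Mth-band lowpass response sin(pi*n/m)/(pi*n), truncated to
    n = -half_len..half_len (index half_len is the reference tap)."""
    return [1.0 / m if n == 0 else math.sin(math.pi * n / m) / (math.pi * n)
            for n in range(-half_len, half_len + 1)]


h3 = truncated_sinc(3, 10)
print(is_mth_band(h3, 3, 10))
```

Truncating the sinc leaves the zero-valued taps untouched, so the check passes for any length. A nonlinear-phase design would simply place the reference index somewhere other than the middle of the response.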
9.
Jongsun Park, Muhammad K., Roy K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(2):244-253
Finite impulse response (FIR) filtering can be expressed as multiplications of vectors by scalars. We present high-speed designs for FIR filters based on a computation sharing multiplier which specifically targets computation reuse in vector-scalar products. The performance of the proposed implementation is compared with implementations based on carry-save and Wallace tree multipliers in 0.35-μm technology. We show that the sharing multiplier scheme improves speed by approximately 52% and 33% with respect to FIR filter implementations based on the carry-save multiplier and the Wallace tree multiplier, respectively. In addition, the sharing multiplier scheme has a smaller power-delay product than the other multiplier schemes. Using voltage scaling, the power consumption of the FIR filter based on the computation sharing multiplier can be reduced to 41% of that of the FIR filter based on the Wallace tree multiplier for the same frequency of operation.
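The vector-by-scalar view in this abstract can be illustrated in software. The sketch below is an assumption-laden illustration, not the multiplier architecture of the paper: it takes integer coefficients, splits each into 4-bit chunks, and builds every product from a small table of odd multiples of the input sample that is computed once per sample and shared by all coefficients. The names `shared_products` and `fir_transposed` are hypothetical.

```python
from typing import List


def shared_products(x: int, coeffs: List[int], chunk_bits: int = 4) -> List[int]:
    """Multiply every coefficient by the scalar x using one shared table
    of odd multiples of x plus shifts and adds."""
    # Shared work: 1x, 3x, 5x, ... up to (2**chunk_bits - 1) * x.
    table = {k: k * x for k in range(1, 1 << chunk_bits, 2)}
    mask = (1 << chunk_bits) - 1
    products = []
    for c in coeffs:
        sign = -1 if c < 0 else 1
        mag, shift, acc = abs(c), 0, 0
        while mag:
            chunk = mag & mask
            if chunk:
                # chunk = odd * 2**tz, so chunk * x = table[odd] << tz
                tz = (chunk & -chunk).bit_length() - 1
                acc += table[chunk >> tz] << (shift + tz)
            mag >>= chunk_bits
            shift += chunk_bits
        products.append(sign * acc)
    return products


def fir_transposed(samples: List[int], coeffs: List[int]) -> List[int]:
    """Transposed-form FIR filter: each input sample scales the whole
    coefficient vector, and the partial sums move along a delay line."""
    state = [0] * len(coeffs)
    out = []
    for x in samples:
        p = shared_products(x, coeffs)
        out.append(p[0] + state[0])
        for i in range(len(coeffs) - 1):
            state[i] = p[i + 1] + state[i + 1]
    return out


print(fir_transposed([1, 2, 3], [1, 2, 3]))
```

The table of odd multiples is the only place where the input sample is actually multiplied. Everything after that is shifting and adding, which is the kind of reuse the abstract refers to.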
10.
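A single laser-induced acoustic pulse has a narrow pulse width and a wide bandwidth, which makes it unsuitable for long-range underwater acoustic communication. To improve the spectral characteristics of the sound source in laser-acoustic communication and thereby improve communication performance, the laser-induced sound is modulated and a mathematical model of the modulated laser-acoustic symbol is established. Based on this model, a design method for a modulated laser-acoustic source is proposed. The method concentrates the spectral energy of the modulated symbol at the frequency of the modulation feature and reduces the spectral width of the signal, which favors long-range propagation of laser-acoustic signals in underwater acoustic channels. The modulation parameters of the modulated symbol are derived for the thermal-expansion and optical-breakdown mechanisms. Using experimentally obtained thermal-expansion and optical-breakdown laser-acoustic pulses, modulated symbols produced by this method are simulated and their spectral characteristics are analyzed. The results show that the source modulation method is correct and effective.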
11.
12.
A new iteration technique is proposed for calculating exact antiresonant values of cladding layer thicknesses in antiresonant reflecting optical waveguides (ARROWs). This technique enables the design of ARROW waveguides with core layers incorporating sophisticated multilayer systems, where the approximate analytical formulas cannot be applied.
13.
14.
To improve Network-on-Chip (NoC) performance, we propose a system-level bandwidth design method that customises the bandwidths of the NoC links. In detail, we first build a mathematical model to capture the relationship between NoC communication latency and NoC link bandwidth, and then develop a bandwidth allocation algorithm that automatically optimises the bandwidth of each NoC link. The experimental results show that our bandwidth-customising method improves NoC performance compared with the traditional uniform bandwidth allocation method. It also allows our NoC to reach the same communication performance as a uniform-bandwidth NoC while using fewer bandwidth resources, which helps to save NoC area and power.
15.
16.
In this correspondence, we propose a simple design method for nonuniform integer-decimated filter banks based on a uniform cosine-modulated filter bank. The resulting distortion and aliasing are comparable to the stopband attenuation of the prototype filter. Examples are given to demonstrate the proposed method.
17.
A graphic design method for matched low-noise amplifiers (total citations: 1; self-citations: 0; citations by others: 1)
A graphic design method for matched low-noise amplifiers is presented in which all necessary design information is given in the load plane. It is possible to work exclusively in the load plane, as the input-matching requirement makes the source admittance dependent on the load admittance. As a consequence of the bilinear transformation involved, all parameters may be represented by circles. Analytic equations giving the centers and the radii of the circles in the load plane are presented. Two kinds of amplifier configurations are considered: a single-stage amplifier with an input-match requirement and a two-stage cascade amplifier. The latter is required to have an output match and a noise-optimized second stage, in addition to an input match. For the single-stage case the noise figure, the power gain, the stability, and the input network are treated. In the cascade design, the total noise figure, the interstage network, and the available gain are treated as well. A design example for the case of lossless feedback is presented.
18.
19.
A novel ASM one-zero-hot design method based upon ternary flip-flops with binary construction as storage cells is presented.
20.
Plimmer S.A., David J.P.R., Ong D.S., Li K.F. 《Electron Devices, IEEE Transactions on》1999,46(4):769-775
A simple Monte Carlo model (SMC) using single effective parabolic valleys and accurately accounting for deadspace effects is presented for calculating the avalanche process. Very good agreement is achieved with a range of measured electron and hole multiplication results from GaAs p+-i-n+ structures with i-region thicknesses, ω, from 1 μm down to 0.025 μm, and with the excess noise factors down to 0.05 μm. While the results are insensitive to the precise values of the input parameters for structures with ω ≥ 0.2 μm, this is not the case in thinner structures where the deadspace represents a significant fraction of the device. For ω < 0.2 μm, the energy dependence of the ionization rate becomes increasingly important. The SMC model is tested against a full-band Monte Carlo model (FBMC) by comparing the mean distance between ionization events and the probability density functions, which are effectively the histograms of distances between ionization events, for equivalent material parameters. The good agreement between these suggests that the SMC, with a relatively small number of fitting parameters and much faster calculation times than the FBMC, is a useful tool for device simulation and for interpreting experimental results.