Similar literature
1.
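Building on existing reconfigurable processor designs, this paper proposes IRAP, an improved array-based dynamically reconfigurable processor. In IRAP, the array of processing elements is divided by quadrant into 4 regions, each containing configurable processing elements. During computation, different regions can be configured differently as needed, which increases configuration flexibility and improves the system's execution efficiency. The design also increases the system's data transfer bandwidth and optimizes the array interconnect for the butterfly algorithm commonly used in digital signal processing. Simulation results show that, in typical digital signal processing applications such as the FFT, IRAP outperforms the prototype it improves upon.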
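
The butterfly algorithm that the interconnect is optimized for can be illustrated with a radix-2 decimation-in-time FFT. This is a minimal software sketch only: the function name `fft_radix2` and the recursive formulation are assumptions made for illustration and say nothing about how IRAP maps butterflies onto its processing elements.

```python
import cmath


def fft_radix2(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])   # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # One butterfly: combine a pair of half-size results with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Each butterfly reads two values and writes two values, which is why the data exchange pattern between processing elements matters for an array implementation.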

2.
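To improve the performance of the LS MPP (Li-Shan MPP) system and incorporate it into a new type of embedded stream processor, this paper proposes a stream processing technique based on LS MPP. The technique builds on the LS MPP architecture, follows the conceptual model of an embedded stream processor, and targets the characteristics of image processing applications. It constructs a stream processing model by defining new stream data types and kernel functions, and it analyzes how stream scheduling can be implemented on the LS MPP-based embedded stream processor conceptual model, providing system software support for improving the overall performance of the LS MPP embedded stream processor.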

3.
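Most image processing operations based on local linear filter functions can be expressed as the convolution of the image data with a weight template. For an N×N image and an M×M template (M<N), a conventional implementation of convolution on a uniprocessor takes O(N²M²) time, so a data-parallel implementation is clearly called for. This paper discusses in some detail data-parallel implementations of convolution on two-dimensional processing element arrays, both with and without a limit on the number of local registers. It proposes a parallel convolution algorithm suited to one-dimensional processing element arrays with a limited number of local registers and analyzes the algorithm's complexity.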
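
To make the O(N²M²) uniprocessor cost concrete, here is a minimal sketch of the direct method in plain Python. The function name `convolve2d_direct`, the zero padding at the borders, and the tap counter are illustrative assumptions, not details from the paper. The template is applied without flipping, which matches convolution only for symmetric templates.

```python
def convolve2d_direct(image, template):
    """Direct template sum over an N x N image with an M x M template (M odd).

    Returns (output, taps), where taps counts every template position
    visited, so taps == N * N * M * M.
    """
    n = len(image)
    m = len(template)
    half = m // 2
    out = [[0] * n for _ in range(n)]
    taps = 0
    for r in range(n):
        for c in range(n):
            acc = 0
            for i in range(m):
                for j in range(m):
                    rr = r + i - half
                    cc = c + j - half
                    # Pixels outside the image are treated as zero.
                    if 0 <= rr < n and 0 <= cc < n:
                        acc += image[rr][cc] * template[i][j]
                    taps += 1
            out[r][c] = acc
    return out, taps
```

The four nested loops are independent across output pixels, which is the property a data-parallel processing element array can exploit.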

4.
Development of array processor system chips   Total citations: 2, self-citations: 1, citations by others: 1
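This paper studies the development of array processor system chips from four perspectives: the data-flow computing model, array chips for parallel computing, the mathematical techniques driven by evolving applications, and silicon chip manufacturing technology. It proposes an array processor system chip with a unified architecture, called the APU system chip for short.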

5.
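Gray-level histogram computation is an important analysis tool in image processing operations such as image segmentation and gray-level transformation. This paper first discusses the execution efficiency of histogram computation on a uniprocessor and data-parallel implementations of histogram computation on two-dimensional processing element arrays. Then, taking as its setting a one-dimensional processing element array in which each processing element has at least 256 registers, it proposes a new data-parallel implementation of histogram computation that needs only one data-parallel counting operation per pixel.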
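
The sketch below contrasts a sequential histogram with a software emulation of one plausible mapping onto a one-dimensional array. The mapping is an assumption, not the paper's algorithm: each hypothetical processing element (PE) owns one image column and keeps 256 private counters, standing in for the 256 registers, and the per-PE counts are summed at the end. The function names are also invented for illustration.

```python
def histogram_sequential(image, levels=256):
    """Uniprocessor baseline: one counter update per pixel."""
    hist = [0] * levels
    for row in image:
        for pixel in row:
            hist[pixel] += 1
    return hist


def histogram_pe_array(image, levels=256):
    """Emulates a 1-D PE array with one PE per image column (assumed mapping)."""
    width = len(image[0])
    # One private set of counters per PE, standing in for its local registers.
    local = [[0] * levels for _ in range(width)]
    for row in image:
        # Conceptually a single data-parallel step: every PE counts its own pixel.
        for pe, pixel in enumerate(row):
            local[pe][pixel] += 1
    # Final reduction across PEs to obtain the global histogram.
    return [sum(local[pe][g] for pe in range(width)) for g in range(levels)]
```

In this emulation the inner loop over `pe` would run simultaneously on real hardware, so the counting phase takes one parallel step per image row.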

6.
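To obtain the highest possible computing power from the parallel computing units, this paper studies in depth the memory system of an SIMD image processor. Exploiting the characteristics of image processing applications, the memory system schedules data streams using global information about data-stream accesses obtained at compile time. This effectively increases data access speed, meets the parallel computing units' demands on data access speed, and supports improved system performance of the SIMD image processor.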

7.
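As multi-core processor chips attract growing attention in embedded applications, improving application development productivity while gaining the benefits of parallel performance has become the core goal of research on mainstream multi-core parallel computing. This survey focuses on three key issues facing the embedded application domain. First, it compares current high-performance embedded computing with supercomputing and classifies embedded application domains. Second, it examines current chip multiprocessor architectures suitable for embedded use. Finally, it reviews the state of research on multi-core parallel programming approaches and summarizes open research problems for embedded multi-core parallelism.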

8.
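Continually increasing computing power is one of the technical routes to supporting the mathematical notion of the infinite. This paper introduces scalar computers, parallel computing models, array languages, and array computers. Powerful computing capability, however, runs into the problem of energy consumption. Wafer-level through-silicon via (TSV) technology is one way to reduce energy consumption, and the regularity of coarse-grained array computers makes them well suited to TSV technology.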

9.
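A video processing module is designed around the TMS320DM6446, a chip in TI's DaVinci family. The module digitizes the analog video signal with a high-speed video decoding circuit, and a DSP (digital signal processor) chip processes the data, enabling real-time processing and output. The hardware design of the module is described in detail, covering mainly the TVP5150 video decoder chip, the FPGA (field-programmable gate array), and the DSP, along with the video co-processing circuit and the network communication and interface circuits. The proposed system design meets the requirements of embedded video processing and can run as a minimal standalone video processing system.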

10.
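Fractal coding combined with the DCT can exploit the strengths of both techniques, providing a high compression ratio while preserving image quality. The computational demand of fractal coding is the main factor limiting its encoding and decoding speed and hence its real-time applicability. This paper analyzes the computation steps of a coding structure that combines fixed-block-size fractal coding with the DCT. Focusing on the fixed-block-size full-search fractal coding structure, which accounts for most of the computational cost, it studies data locality and computational parallelism and, using an array structure, proposes a parallel computing structure that is regular, easy to extend, and straightforward to implement. The structure can be mapped to a VLSI implementation, providing the real-time performance required by the encoding and decoding computation with an efficient hardware structure.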

11.
This paper provides a tutorial on the motivations, design, and applications of parallel processing applied to real-time video, illustrated by the experience gained in the implementation of the P3I machine. Its main purpose is to highlight the motivations for such a development, the basic implementation choices, the major difficulties encountered, and how they were solved. Through these studies we found that parallel processing is well suited to real-time video when programmable implementations are considered. There are many outcomes of the P3I project, ranging from architectural considerations to parallel algorithm optimizations and programming methodology. We want to emphasize three conclusions. First, programming a single architecture that combines different parallel paradigms is tractable, and this heterogeneity is cost-effective and efficient in terms of processing performance. Second, concerning the well-known debate about how to match parallel architectures and image processing “levels”, we conclude that the key is not to discuss Flynn's taxonomy (i.e., data versus task parallelism) but to consider how the parallelism grain evolves within a whole application. Third, we confirm that in the field of image processing, the efficiency of parallelism can only be gained if algorithm developers think “parallel”; this result seems obvious, but consider the trend of recent RISC processors, which embed more and more parallelism while claiming compatibility with existing sequential software.

12.
The authors present a general system design method intended to support parallelisation of complete image processing applications using MIMD processors. The approach is based upon a generic system-level parallel processor architecture, the 'pipeline processor farm' (PPF), and is applicable to any embedded application with continuous input/output. The design method is illustrated using applications from the fields of computer vision and image coding. The design model accommodates several commonly exploited parallel processing paradigms, maps conveniently to the software structure of most image processing algorithms, provides incrementally scalable performance, and enables upper-bound speedups to be easily estimated from profiling data generated by the original sequential implementation of the application. It is believed that the approach has significant application in parallel embedded systems design, in the development environment, and in simulation work for computationally intensive image coding algorithms.

13.
钟升 (Zhong Sheng), 《电子学报》 (Acta Electronica Sinica), 2009, 37(7): 1546-1553
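To meet the real-time processing requirements of gigapixel-scale frames, and to address the heavy computational load of the DCT and the insufficient parallelism of conventional processing, this paper proposes a data-parallel DCT implementation based on an SIMD PE array. Because the PE array itself can be tailored in size, the method can be applied in embedded systems with different parallelism requirements. The paper proposes a data-parallel operation scheme based on PE identifiers, which both solves the "PE autonomy" problem in local computation and eliminates the time overhead of data addressing. The scheme is regular and concise, satisfies SIMD's requirement for highly regular operations, and is in line with the direction in which parallel processing technology is developing.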
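
The abstract does not state which DCT definition or scaling it uses, so the sketch below assumes the standard orthonormal one-dimensional DCT-II. The function name `dct_ii` is hypothetical. The sketch is a sequential reference for the computation being parallelised, not the paper's PE-identifier scheme.

```python
import math


def dct_ii(x):
    """Orthonormal 1-D DCT-II of a sequence x (assumed definition)."""
    n = len(x)
    out = []
    for k in range(n):
        # Orthonormal scaling: the DC term uses sqrt(1/n), all others sqrt(2/n).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out
```

Each output coefficient depends only on the input block and its own index `k`, so the coefficients can in principle be computed independently. That independence is the kind of regularity an SIMD array relies on.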

14.
A formalism and an algorithm for configuring and sequencing parallel to massively parallel processors for the application of generalised spectral analysis transforms are presented. Successive partial rotations of a base-p hypercube, where p is an arbitrary integer, are shown to produce dynamic contention-free memory allocation in a generalised parallelism processor architecture. The approach is illustrated by factorisations involving the processing of matrices of transforms which are functions of four variables. Parallel operations are implemented as matrix multiplications. Each matrix, of dimension N × N, where N = p^n and n is an integer, has a structure that depends on a variable parameter k. The level of parallelism, in the form of M = p^m processors, can be chosen arbitrarily by varying m between zero and its maximum value of n - 1. The result is an equation describing the generalised parallelism factorisation as a function of the four variables n, p, k and m. Applications of the approach are shown in relation to complex matrix structures of image processing generalised spectral analysis transforms. The same approach can be applied to a much larger class of parallel and multiprocessing systems for digital signal processing applications.

15.
16.
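Embedded processors are increasingly common in all kinds of electronic devices thanks to their high processing performance, low power consumption, low cost, ease of tailoring, and small size. 3G communication, with its high transmission rate, wide coverage, and good communication quality, is suited to real-time transmission of large volumes of data. Combining the strengths of both, this paper designs a 3G-based embedded image transmission system. The system uses the TCP transport protocol and a client/server (C/S) service model, with image data relayed through a server. Experimental results show that images can be transmitted stably and that the system basically meets real-time transmission requirements.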

17.
In this paper, we present an optimized design method for a high-speed embedded image processing system using a 32-bit floating-point Digital Signal Processor (DSP) and a Complex Programmable Logic Device (CPLD). The DSP acts as the main processor of the system: it executes digital image processing algorithms and operates other devices such as the image sensor and the CPLD. The CPLD is used to acquire images and to provide complex logic control of the whole system. Some key technologies are introduced to enhance the performance of the system. In particular, using the DSP/BIOS tool to develop DSP applications makes the program run much more efficiently. As a result, the system provides an excellent computing platform not only for executing complex image processing algorithms, but also for other digital signal processing or multi-channel data collection tasks by choosing different sensors or Analog-to-Digital (A/D) converters.

18.
韩梅 (Han Mei), 《信息技术》 (Information Technology), 2011, (6): 155-157
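This paper introduces a data recorder based on embedded processing technology. The recorder is flexibly configurable: through board expansion and combination, it can meet specific requirements on technical specifications such as maximum storage capacity and maximum recording speed. It supports both magnetic and solid-state storage media and suits different application environments. Preliminary tests on a prototype verify the feasibility of the design.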
