Similar literature
 20 similar documents found (search time: 562 ms)
1.
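A parallel BCH encoder/decoder was designed in hardware for NAND Flash applications. Written in a register-transfer-level hardware description language, the design implements the BCH encoding and decoding algorithms on an FPGA using LFSR circuits, syndrome computation, key-equation solving, and the Chien search algorithm. Compared with a conventional serial implementation, the parallel implementation increases codec speed. An embedded verification platform based on SoPC technology was also built. Under the control of a Nios processor it verifies the BCH codec algorithms quickly and efficiently, and it offers a configurable test environment, high test-vector coverage, and an intelligent test flow.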
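As an illustration of the LFSR-based encoding step mentioned in the abstract above, here is a minimal bit-serial sketch in Python. It is not the paper's design: the code parameters are an assumption (the textbook BCH(15,7) code with t = 2 and generator g(x) = x^8 + x^7 + x^6 + x^4 + 1, far smaller than a NAND Flash code), and the names `poly_mod` and `encode` are hypothetical. A parallel hardware encoder would process several bits per clock, whereas this loop handles one bit per iteration.

```python
# Assumed example code: BCH(15,7), t = 2.
# g(x) = x^8 + x^7 + x^6 + x^4 + 1 = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1)
G = 0b111010001
N, K = 15, 7
R = N - K  # number of parity bits (degree of g)


def poly_mod(value: int, nbits: int) -> int:
    """Remainder of a GF(2) polynomial divided by g(x).

    The polynomial is given as the low `nbits` bits of `value`, highest
    degree first. The loop mirrors a serial LFSR: shift in one bit, then
    XOR with g(x) whenever the register reaches degree R.
    """
    reg = 0
    for i in range(nbits - 1, -1, -1):
        reg = (reg << 1) | ((value >> i) & 1)
        if reg >> R:
            reg ^= G
    return reg


def encode(msg: int) -> int:
    """Systematic encoding: codeword = msg * x^R + (msg * x^R mod g)."""
    shifted = msg << R
    return shifted | poly_mod(shifted, N)
```

Every valid codeword leaves a zero remainder when divided by g(x), so `poly_mod(encode(m), N)` is 0 for any 7-bit message `m`. A nonzero remainder signals an error, which is the starting point for the syndrome computation stage.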

2.
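As computer technology advances, the real-time requirements on multimedia technology keep rising, especially for the time efficiency of video encoding and decoding. At the same time, the spread of multi-core CPUs and related technologies has exposed the performance shortfall of existing non-parallel programs, so parallelizing traditional programs has become urgent. Taking the widely used video codec algorithm H.263 as an example, this paper analyzes the performance of the traditional serial codec in a concrete video conferencing system, locates the runtime bottlenecks with the Intel Parallel Studio analysis tools, and then parallelizes the codec with Intel Threading Building Blocks, achieving good results.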

3.
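A scheme is proposed that combines a single-tone detection algorithm for HINOC probe frames with BCH encoding and decoding, in order to improve the resistance of HINOC systems to single-tone interference. The paper first introduces the general method and implementation flow of single-tone detection, then outlines the basic principles of BCH encoding and decoding, and finally details the scheme and implementation flow for applying the combined single-tone detection and BCH codec algorithms to a HINOC system. Simulation results show that the scheme markedly improves the system's resistance to single-tone interference and has considerable practical value.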

4.
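Research and implementation of OpenMP-based AVS parallel encoding algorithms   Total citations: 1 (self-citations: 1, citations by others: 0)
To increase the encoding speed of AVS, the new-generation audio and video coding standard, OpenMP was used on a multi-core processor platform to study and implement GOP-level, slice-level, and frame-level parallel encoding algorithms for AVS, as well as a frame-level algorithm based on a task-queue model. Tests on CIF-format video sequences reached a maximum speedup of 3.82x on a quad-core platform. In addition, the task-queue-based frame-level algorithm overcomes the low speedup of the plain frame-level algorithm while keeping image quality unchanged. The experimental results show that OpenMP is a simple and effective parallel programming tool, and that every OpenMP-based AVS parallel encoding algorithm encodes significantly faster than the original serial algorithm.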

5.
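王杰 (Wang Jie), 沈海斌 (Shen Haibin). 《计算机工程》 (Computer Engineering), 2010, 36(16): 222-225
A parallel BCH encoder/decoder for NAND Flash controllers is proposed. Pipelining and group-prefetch decoding are introduced in the decoding stage to raise BCH decoding efficiency. Experimental results show that, for a 2 KB page read from NAND Flash, the codec needs only 565 cycles of decoding time to correct 8 bits of random errors, one quarter of the time required by page-prefetch decoding.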

6.
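As video coding standards evolve, the amount of data the algorithms process grows sharply. Multi-core parallel processing speeds up computation, but it also makes the memory architecture the performance bottleneck of the whole codec system. Based on three characteristics of video codec algorithms (locality of memory access, frequent data exchange between algorithms, and large volumes of temporary data within an algorithm that are never exchanged), a multi-level distributed memory architecture consisting of a private storage layer and a shared storage layer was designed and implemented. The design was tested on a Xilinx Virtex-6 xc6vlx550T development board. Experimental results show that the architecture stays simple and scalable while providing up to 9.73 GB/s of memory bandwidth, which meets the data-access needs of video codec algorithms.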

7.
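Exploiting the characteristics of parallel BCH decoders, constant-coefficient multiplication over a finite field is implemented with XOR gates, which reduces hardware complexity. Part of the error-locator polynomial is computed first. The remaining part is then obtained through logic operations based on affine polynomials and Gray code theory, which reduces the resources the system occupies. Timing simulation was carried out in the field-programmable gate array (FPGA) development software ISE 10.1, verifying the time and space efficiency of the algorithm.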

8.
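Because energy in wireless sensor networks is limited, an algorithm combining decision and checking is proposed. It uses BCH codes and CRC codes to achieve multi-bit error correction, handling single-bit errors and multi-bit errors separately. The energy consumption of the error-correction algorithm is analyzed and compared with that of an ARQ scheme and a BCH error-correction scheme. Simulation results show that the multi-bit method effectively improves the bit error rate and the frame error rate, and that the algorithm is more energy-efficient when the bit error rate exceeds 1.3E-3.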

9.
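To resolve the tension between growing video-processing demands on portable devices and the relatively high computational complexity of the H.264 codec, an H.264 parallelization scheme based on an on-chip multi-core architecture is proposed to achieve real-time encoding. Using an FPGA as the verification platform and co-optimizing the hardware architecture and the software algorithm, slice-based H.264 parallel encoding was implemented on an MPSoC with a single-bus dual-core architecture. Experimental results show that multi-core H.264 parallel encoding achieves good speedup in an embedded environment.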

10.
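Based on an in-depth study of the BCH error-correcting coding scheme used in BeiDou navigation messages, a design for a BCH decoder based on parallel data processing is proposed. The design uses an FPGA to process the BCH-coded message in parallel and decodes it within a single clock cycle, improving the decoding efficiency of the BCH decoding module. Modelsim simulation results and analysis are given for each system module, verifying the feasibility of the design. The design offers a useful reference for improving the baseband data-processing performance of receivers.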
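One common way to decode a single-error-correcting BCH code in one step is a precomputed syndrome lookup table, and the Python sketch below illustrates that idea. It is not the paper's FPGA design. The abstract does not state the code parameters, so the sketch assumes BCH(15,11,1) with generator g(x) = x^4 + x + 1, the code described in the public BeiDou B1I interface documents. The names `syndrome`, `encode`, `decode`, and `ERROR_MASK` are hypothetical.

```python
# Assumed code parameters: BCH(15,11,1), g(x) = x^4 + x + 1.
G = 0b10011
N, K = 15, 11
R = N - K  # 4 parity bits


def syndrome(word: int) -> int:
    """Remainder of the 15-bit word (as a GF(2) polynomial) modulo g(x)."""
    reg = 0
    for i in range(N - 1, -1, -1):
        reg = (reg << 1) | ((word >> i) & 1)
        if reg >> R:
            reg ^= G
    return reg


def encode(data: int) -> int:
    """Systematic encoding of 11 data bits into a 15-bit codeword."""
    shifted = data << R
    return shifted | syndrome(shifted)


# Each single-bit error position yields a distinct nonzero syndrome because
# g(x) is primitive, so the 15 entries cover every nonzero 4-bit syndrome.
ERROR_MASK = {syndrome(1 << i): 1 << i for i in range(N)}


def decode(word: int) -> int:
    """Correct at most one bit error and return the 15-bit codeword."""
    s = syndrome(word)
    return word if s == 0 else word ^ ERROR_MASK[s]
```

In hardware, the syndrome of all 15 bits can be formed with a fixed XOR network and the table becomes combinational logic, which is one plausible route to a single-cycle decode. The serial loop here only models the arithmetic.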

11.
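FPGA-based real-time digital image magnification   Total citations: 10 (self-citations: 1, citations by others: 9)
This paper designs and implements bilinear-interpolation magnification on an FPGA, with the parallel computation structure of the algorithm optimized for hardware implementation. Experiments show that the interpolation structure is simple and its computational error small, with precision comparable to a software implementation, and that it supports real-time video image magnification at variable scale factors.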
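For reference, bilinear interpolation itself can be sketched in a few lines of Python. This is a plain software version, not the paper's hardware structure. The function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions made for the example, since the abstract does not say which mapping the FPGA design uses.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2D grayscale image (list of rows) with bilinear interpolation.

    Assumes corner-aligned mapping: the four corner pixels of the output
    coincide with the four corner pixels of the input.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        # Source row coordinate, its two neighbouring rows, and the weight.
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Interpolate horizontally on both rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

Each output pixel depends only on four input pixels and two weights, so all output pixels can be computed independently. That independence is what makes the algorithm a natural fit for a parallel hardware pipeline.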

12.
Iterative detection and decoding in communication systems with multiple transmitter and receiver antennas suffer from a significant increase in computational cost and energy consumption. The application of specific high-performance computing techniques to signal processing in communication systems is now receiving considerable attention. In this paper, we present an accelerated and efficient iterative receiver, which has been implemented following two strategies. First, we reduce the computational cost using parallelized algorithms executed on a graphics processing unit. Second, our receiver allows selection between two types of detectors with different complexity and performance. The selection can be made to meet a given compromise between bit error rate and power consumption.

13.
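The texture evolution system is an algorithm for synthesizing near-regular textures. Its main feature is that, building on evolutionary theory, it defines behaviors that mitigate the cumulative error caused by texture patches being unchangeable once stitched. A new texture synthesis scheme based on co-evolution is proposed. Newly defined methods for selecting and arranging individuals make it applicable to textures with periods in arbitrary directions. Removing migration and pre-building a fitness table greatly reduce redundant fitness computation, and the new evolutionary process parallelizes better. Experimental results show that the proposed scheme is both more general and significantly more efficient.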

14.
An empirical study is presented that examines the potential to parallelize general-purpose software systems. The study is conducted on 13 open source systems comprising over 14 MLOC. Each for-loop is statically analyzed to determine whether it can be parallelized. A for-loop that can be parallelized is termed a free-loop. Free-loops can be easily parallelized using tools such as OpenMP. For the loops that cannot be parallelized, the various inhibitors to parallelization are determined and tabulated. The data shows that the most prevalent inhibitor by far is functions called within for-loops that have side effects. This single inhibitor poses the greatest challenge in adapting and re-engineering systems to better utilize modern multi-core architectures. This fact is somewhat contradictory to the literature, which is primarily focused on the removal of data dependencies within loops. Results of this paper also show that function calls via function pointers and virtual methods have very little impact on the for-loop parallelization process. Historical data over a 10-year period of inhibitor counts for the set of systems studied is also presented. It shows that there is little change in the potential for parallelization of loops over time.
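The distinction the abstract above draws between a free-loop and a loop inhibited by side effects can be illustrated with a small Python sketch. The study analyzes for-loops with OpenMP in mind. Python's `concurrent.futures` is swapped in here purely to show the structure, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x: int) -> int:
    """Pure function: reads and writes no shared state."""
    return x * x


def serial(values):
    """A "free-loop": each iteration depends only on its own input."""
    return [square(v) for v in values]


def parallel(values, workers: int = 4):
    """Same loop with iterations handed to a thread pool.

    Executor.map returns results in input order, so the output matches
    the serial version regardless of execution order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


call_log = []


def square_and_log(x: int) -> int:
    """Side effect: appends to shared state, so the content of `call_log`
    depends on the order in which iterations run. A loop calling this
    function is not a free-loop in the study's sense."""
    call_log.append(x)
    return x * x
```

In standard CPython, threads do not speed up CPU-bound work like this. The sketch only shows why a side-effect-free loop body can be reordered safely while one that mutates shared state cannot.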

15.
Flood modelling often involves prediction of the inundated extent over large spatial and temporal scales. As the dimensionality of the system and the complexity of the problems increase, the need to obtain quick solutions becomes a priority. However, for large-scale problems or situations where fine-resolution data is required, it is often not possible or practical to run the model on a single computer in a reasonable timeframe. This paper presents the development and testing of a parallelized 2D diffusion-based flood inundation model (FloodMap-Parallel) which enables large-scale simulations to be run on distributed multi-processors. The model has been applied to three locations in the UK with different flow and topographical boundary conditions. The accuracy of the parallelized model and its computational efficiency have been tested. The predictions obtained from the parallelized model match those obtained from the serialized simulations. The computational performance of the model has been investigated in relation to the granularity of the domain decomposition, the total number of cells and the domain decomposition configuration pattern. Results show that the parallelized model is more effective with simulations of low granularity and a large number of cells. The large communication overhead associated with the potential load-imbalance between sub-domains is a major bottleneck in utilizing this approach with higher domain granularity.

16.
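杨良怀 (Yang Lianghuai), 项俊腱 (Xiang Junjian), 徐卫 (Xu Wei), 范玉雷 (Fan Yulei). 《计算机科学》 (Computer Science), 2018, 45(3): 171-177, 212
For large data streams with a time dimension, an efficient time-window-oriented method for building in-memory B+ trees with bulk loading is proposed, based on a two-level B+ tree index structure. The method partitions each time window into slices and speeds up construction by separating out operations that can run in parallel: sorting runs in parallel with receiving the data stream, and building the B+ tree skeleton runs in parallel with sorting. Sort-based bulk loading and an optimized construction order avoid unnecessary locking and synchronization overhead between threads and effectively improve construction efficiency. The proposed B+ tree construction method, multiple micro-batch sorts with a single bulk load (MBSortSBLoad), builds quickly and sustains a high maximum stream rate. Experiments verify the effectiveness of the method.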

17.
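This paper studies the implementation of a dynamic pattern recognition algorithm on a GPU parallel computing platform. With the development of GPGPU (general-purpose computing on graphics processing units) hardware, GPU-based massively parallel computing can effectively handle the huge computational load of dynamic pattern recognition. The paper introduces the algorithm, analyzes its heavy computational load, decomposes the compute-intensive parts for parallel execution, and removes the execution dependencies present in the original algorithm, finally obtaining a parallel implementation on a specific GPU platform, Jacket. Case studies show that, compared with the original serial CPU program, the parallel program running on the GPU achieves a clear speedup and therefore has good engineering value.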

18.
This paper studies Dantzig–Wolfe decomposition for real-time optimization of process systems with a decentralized structure. The idea is to improve computational efficiency and transparency of a solution. The contribution lies in the application of the Dantzig–Wolfe method which allows us to efficiently decompose an optimization problem into parts. Moreover, we show how the algorithm can be parallelized for even higher efficiency. The nonlinear system is modeled by piecewise linear models with the added benefit that error bounds can be computed. In this context alternative parameterizations are discussed. The properties of the method are studied by applying it to a model of a complex petroleum field with severe production optimization challenges due to rate-dependent gas-coning wells. The model resembles the Troll west oil rim, a huge gas and oil field on the Norwegian continental shelf. Finally, the paper discusses workflows in production optimization as a means to explain how the proposed methodology can be applied in practice.

19.
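Optimized parallel EDPF real-time scheduling algorithm for multiprocessors   Total citations: 2 (self-citations: 0, citations by others: 2)
Task scheduling in real-time multiprocessor systems has always been an important topic. Because such systems must guarantee both task deadlines and effectiveness, a parallel EDPF (Earliest Deadline and Processing Time First) optimized scheduling algorithm is proposed. The algorithm applies to parallelizable tasks and, in addition to the deadlines and resource requirements of the task set, takes running time into account, which reduces the number of scheduling backtracks and improves effectiveness. Extensive simulations analyze the effect of several key parameters on the scheduling success rate, and comparison shows that the algorithm clearly outperforms the Myopic algorithm.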

20.
Although over a thousand scientific papers address the topic of load forecasting every year, only a few are dedicated to finding a general framework for load forecasting that improves the performance without depending on the unique characteristics of a certain task, such as geographical location. Meta-learning, a powerful approach for algorithm selection, has so far been demonstrated only on univariate time-series forecasting. Multivariate time-series forecasting is known to have better performance in load forecasting. In this paper we propose a meta-learning system for multivariate time-series forecasting as a general framework for load forecasting model selection. We show that a meta-learning system built on 65 load forecasting tasks returns lower forecasting error than 10 well-known forecasting algorithms on 4 load forecasting tasks for a recurrent real-life simulation. We introduce new metafeatures of fickleness, traversity, granularity and highest ACF. The meta-learning framework is parallelized, component-based and easily extendable.
