期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Kenneth J. Supowit Eric A. Slutz 《Computer aided design》1984,16(1):45-50

相似文献

2.

Genetic algorithms for data-driven web question answering

Figueroa AG Neumann G 《Evolutionary computation》2008,16(1):89-125

We present an evolutionary approach for the computation of exact answers to natural languages (NL) questions. Answers are extracted directly from the N-best snippets, which have been identified by a standard Web search engine using NL questions. The core idea of our evolutionary approach to Web question answering is to search for those substrings in the snippets whose contexts are most similar to contexts of already known answers. This context model together with the words mentioned in the NL question are used to evaluate the fitness of answer candidates, which are actually randomly selected substrings from randomly selected sentences of the snippets. New answer candidates are then created by applying specialized operators for crossover and mutation, which either stretch and shrink the substring of an answer candidate or transpose the span to new sentences. Since we have no predefined notion of patterns, our context alignment methods are very dynamic and strictly data-driven. We assessed our system with seven different datasets of question/answer pairs. The results show that this approach is promising, especially when it deals with specific questions. 相似文献

3.

A systolic array for cyclic-by-rows Jacobi algorithms

《Journal of Parallel and Distributed Computing》1987,4(3):334-340

The major concern of this paper is the systolic realization of Jacobi algorithms with a cyclic-by-rows iteration scheme. This aim can be achieved by construction of an algorithm which is well suited for parallel processing and is essentially equivalent to the above-mentioned type of algorithm. Moreover, a large variety of applications for Jacobi algorithms is presented as well as a comparison to other parallel schemes for the same problem. Finally, a systolic array is derived which requires (n + 1)²/4 processing cells and has a time complexity of O(n) for each sweep. 相似文献

4.

The function processor: A data-driven processor array for irregular computations

Jesper Vasell Jonas Vasell 《Future Generation Computer Systems》1992,8(4):321-335

相似文献

5.

VLSI algorithms and architecture for relational operations

M. A. Bonuccelli E. Lodi F. Luccio P. Maestrini L. Pagli 《Calcolo》1985,22(1):63-90

The use of a special purpose VLSI chip for relational operations is proposed The chip is structured like a tree with processors at the nodes, called TOP (Tree of Processors). Each node is capable of storing a data element and of performing elementary operations on elements. A table ofn tuples ofk elements each (e. g., a relation defined as in data base theory) is stored inn subtrees of at leastk nodes each, at the lowest levels of TOP. The upper portion of TOP is used for routing and bookkeeping purposes. A number of elementary operations are defined for the nodes, and high level operations on tables are performed as combinations of the former ones. In particular, some operations for data input/output and update are discussed, and the basic operations of UNION, DIFFERENCE, PROJECTION, PRODUCT, SELECTION, and JOIN, defined in relational algebra, are studied for TOP realization. Even the most complex operations are executed inO (kn) steps, that is the size of data. This result is optimal in our system, where we assume that data are transmitted to TOP's through channels of constan bandwidth. Dedicated to Professor S. Faedo on his 70th birthday This research has been partially supported by Ministero della Pubblica Istruzione of Italy. 相似文献

6.

Evolutionary algorithms for VLSI multi-objective netlist partitioning

《Engineering Applications of Artificial Intelligence》2006,19(3):257-268

The problem of partitioning appears in several areas ranging from VLSI, parallel programming to molecular biology. The interest in finding an optimal partition, especially in VLSI, has been a hot issue in recent years. In VLSI circuit partitioning, the problem of obtaining a minimum cut is of prime importance. With current trends, partitioning with multiple objectives which includes power, delay and area, in addition to minimum cut is in vogue. In this paper, we engineer three iterative heuristics for the optimization of VLSI netlist bi-partitioning. These heuristics are based on Genetic Algorithms (GAs), Tabu Search (TS) and Simulated Evolution (SimE). Fuzzy rules are incorporated in order to handle the multi-objective cost function. For SimE, fuzzy goodness functions are designed for delay and power, and proved efficient. A series of experiments are performed to evaluate the efficiency of the algorithms. ISCAS-85/89 benchmark circuits are used and experimental results are reported and analyzed to compare the performance of GA, TS and SimE.Further, we compared the results of the iterative heuristics with a modified FM algorithm, named PowerFM, which targets power optimization. PowerFM performs better in terms of power dissipation for smaller circuits. For larger sized circuits, SimE outperforms PowerFM in terms of all the three objectives, delay, number of nets cut, and power dissipation. 相似文献

7.

A processor-time-minimal systolic array for cubical mesh algorithms

Cappello P. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(1):4-13

Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm's DAG first is illustrated using a triangular shaped 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an n×n×n directed mesh are investigated. This cubical directed mesh is fundamental; it represents the standard algorithm for computing matrix product as well as many other algorithms. Completion of the cubical mesh required 3n-2 steps. It is shown that the number of processing elements needed to achieve this time bound is at least [3n^2/4]. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly [3n ^2/4] processing elements it is processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh 相似文献

8.

A methodology for VLSI implementation of Cellular Automata algorithms using VHDL

《Advances in Engineering Software》2001,32(3):189-202

相似文献

9.

Efficient algorithms for array redistribution

Thakur R. Choudhary A. Ramanujam J. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(6):587-594

Dynamic redistribution of arrays is required very often in programs on distributed presents efficient algorithms for redistribution between different cyclic(k) distributions, as defined in High Performance Fortran. We first propose special optimized algorithms for a cyclic(x) to cyclic(y) redistribution when x is a multiple of y, or y is a multiple of x. We then propose two algorithms, called the GCD method and the LCM method, for the general cyclic(x) to cyclic(y) redistribution when there is no particular relation between x and y. We have implemented these algorithms on the Intel Touchstone Delta, and find that they perform well for different array sizes and number of processors 相似文献

10.

A VLSI systolic architecture for solving DBT-transformed fuzzy clustering problems of arbitrary size

R. Doallo E. L. Zapata 《Parallel Computing》1990,13(3):321-335

This article presents a VLSI systolic architecture for solving fuzzy C-means clustering problems of arbitrary size with an array of fixed size. The array design assumes that the data fed in have been subjected to the DBT transformation (dense-to-band matrix transformation by triangular block partitioning). The architectural configuration of the various subarrays involved are presented together with the internal structure of their processing elements, and the performance is analysed. The modularity and regularity of the design make it highly suitable for VLSI implementation. 相似文献

11.

Databases and cell-selection algorithms for VLSI cell libraries 总被引：1，自引：0，他引：1

Foo S.Y. Takefuji Y. 《Computer》1990,23(2):18-30

The issues that must be addressed before commercial database management systems can be used to manage VLSI CAD data are defined. A survey is presented of approaches addressing four of the defined issues: design hierarchies and multilevel representations, design alternatives and version control, common interface between cell libraries and efficient cell selection based on given design constraints. A frame-based model is considered as a case study of the special-purpose design database management system approach. This framework for capturing design data is based on semantic networks. It is well suited for application-specific ICs, yet general enough for other CAD/CAM environments. Benchmark results for the selection algorithms that run on top of the frame-based database system are presented 相似文献

12.

DNA and quantum based algorithms for VLSI circuits testing

Amardeep?Singh Email author Lalit?M.?Bharadwaj Singh?Harpreet 《Natural computing》2005,4(1):53-72

Testing of VLSI circuits is still a NP hard problem. Existing conventional methods are unable to achieve the required breakthrough in terms of complexity, time and cost. This paper deals with testing the VLSI circuits using natural computing methods. Two prototypical algorithms named as DATPG and QATPG are developed utilizing the properties of DNA computing and Quantum computing, respectively. The effectiveness of these algorithms in terms of result quality, CPU requirements, fault detection and number of iterations is experimentally compared with some of existing classical approaches like exhaustive search and Genetic algorithms, etc. The algorithms developed are so efficient that they require only N (where N is the total number of vectors) iterations to find the desired test vector whereas in classical computing, it takes N/2 iterations. The extendibility of new approach enables users to easily find out the test vector from VLSI circuits and can be adept for testing the VLSI chips. 相似文献

13.

Parallelism detection and transformation techniques useful for VLSI algorithms

J.A.B. Fortes D.I. Moldovan 《Journal of Parallel and Distributed Computing》1985,2(3):277-301

相似文献

14.

An evaluation framework for input variable selection algorithms for environmental data-driven models

《Environmental Modelling & Software》2014

Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously. 相似文献

15.

Fast algorithms and VLSI architecture design for HEVC intra-mode decision

Xiaofeng Huang Huizhu Jia Binbin Cai Chuang Zhu Jie Liu Mingyuan Yang Don Xie Wen Gao 《Journal of Real-Time Image Processing》2016,12(2):285-302

The emerging intra-coding tools of High Efficiency Video Coding (HEVC) standard can achieve up to 36 % bit-rate reduction compared to H.264/AVC, but with significant complexity increase. The design challenges, such as data dependency and computational complexity, make it difficult to implement a hardware encoder for real-time applications. In this paper, firstly, the data dependency in HEVC intra-mode decision is fully analyzed, which is cost by the reconstruction loop, the Most Probable Mode, the context adaption during Context-based Adaptive Binary Arithmetic Coding based rate estimation, and the Chroma derived mode. Then, several fast algorithms are proposed to remove the data dependency and to reduce the computational complexity, which include source signal based Rough Mode Decision, coarse to fine rough mode search, Prediction Mode Interlaced RDO mode decision, parallelized context adaption and Chroma-free Coding Unit (CU)/Prediction Unit (PU) decision. Finally, the parallelized VLSI architecture with CU reordering and Chroma reordering scheduling is proposed to improve the throughput. The experimental results demonstrate that the proposed intra-mode decision achieves 41.6 % complexity reduction with 4.3 % Bjontegaard Delta Rate (BDR) increase on average compared to the reference software, HM-13.0. The intra-mode decision scheme is implemented with 1571.7K gate count in 55 nm CMOS technology. The implementation results show that our design can achieve 1080p@60fps real time processing at 294 MHz operation frequency. 相似文献

16.

A programmable array processor architecture for flexible approximate string matching algorithms

Panagiotis D. Michailidis Konstantinos G. Margaritis 《Journal of Parallel and Distributed Computing》2007

Approximate string matching problem is a common and often repeated task in information retrieval and bioinformatics. This paper proposes a generic design of a programmable array processor architecture for a wide variety of approximate string matching algorithms to gain high performance at low cost. Further, we describe the architecture of the array and the architecture of the cell in detail in order to efficiently implement for both the preprocessing and searching phases of most string matching algorithms. Further, the architecture performs approximate string matching for complex patterns that contain don’t care, complement and classes symbols. We also simulate and evaluate the proposed architecture on a field programmable gate array (FPGA) device using the JHDL tool for synthesis and the Xilinx Foundation tools for mapping, placement, and routing. Finally, our programmable implementation achieves about 8–340 times faster execution than a desktop computer with a Pentium 4 3.5 GHz for all algorithms when the length of the pattern is 1024. 相似文献

17.

基于分布式任意阵列的宽带信源定位方法研究

《电子技术应用》2016,(1)

为提高宽带信源的定位精度,在已知信源个数情况下,提出一种新的基于任意阵列的宽带信源定位方法。首先,针对宽带信号的非平稳特性,可将宽带信号利用短时傅里叶变换在频域表示,利用群延迟函数(Group Delay)实现宽带信号的高精度波达方向(DOA)估计;最后根据质心收缩算法和互功率谱相位法以收缩区域的方式进行信源定位。对提出的算法进行了仿真分析,仿真结果表明本文算法DOA估计精度较高,且误差较小,信源定位精度较高。相对于现有的定位方法,本文算法计算量小,精度更高,更具实用性。相似文献

18.

Efficient algorithms for block-cyclic array redistribution betweenprocessor sets

Neungsoo Park Prasanna V.K. Raghavendra C.S. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(12):1217-1240

Run-time array redistribution is necessary to enhance the performance of parallel programs on distributed memory supercomputers. In this paper, we present an efficient algorithm for array redistribution from cyclic(x) on P processors to cyclic(Kx) on Q processors. The algorithm reduces the overall time for communication by considering the data transfer, communication schedule, and index computation costs. The proposed algorithm is based on a generalized circulant matrix formalism. Our algorithm generates a schedule that minimizes the number of communication steps and eliminates node contention in each communication step. The network bandwidth is fully utilized by ensuring that equal-sized messages are transferred in each communication step. Furthermore, the time to compute the schedule and the index sets is significantly smaller. It takes O(max(P, Q)) time and is less than 1 percent of the data transfer time. In comparison, the schedule computation time using the state-of-the-art scheme (which is based on the bipartite matching scheme) is 10 to 50 percent of the data transfer time for similar problem sizes. Therefore, our proposed algorithm is suitable for run-time array redistribution. To evaluate the performance of our scheme, we have implemented the algorithm using C and MPI on an IBM SP2. Results show that our algorithm performs better than the previous algorithms with respect to the total redistribution time, which includes the time for data transfer, schedule, and index computation 相似文献

19.

A data-driven paradigm for mapping problems

《Parallel Computing》2015

We present a new data-driven paradigm for solving mapping problems on parallel computers. This paradigm targets at mapping data modules, instead of task modules, onto multiple processing cores. By dependency analysis of data modules, we devise a data movement matrix to reduce the need of manipulating task program modules at the expenses of handling data modules. To visualize and quantify the complex maneuver, we adopt the parallel activities trace graphs introduced earlier. To demonstrate the procedure and algorithmic values of our paradigm, we test it on the Strassen matrix multiplication and Cholesky matrix inversion algorithms. Mapping tasks has been more widely studied while mapping data is a new approach that appears to be more efficient for data-intensive applications that are becoming prevalent for today's parallel computers with millions of cores. 相似文献

20.

Embedding Binary Tree in VLSI/WSI Processor Array

下载免费PDF全文

Chung-Han CHEN 《计算机科学技术学报》1996,11(3):326-336

Many reconfiguration schemes for fault-tolerant binary tree architectures have been proposed in the literature^[1-6].The VLSI layouts of most previous studies are based on the classical H-tree layout,resulting in low area utilization and likely an unnecessarily high maufacturing cost simply due to the waste of a significant portion of silicon area. In this paper,we present an area-efficient approach to the reconfigurable binary tree architecture.Area utilization and interconnection complexity of our design compare favorably with the other known approaches.In the reliability analysis,we take into account the fact that accepted chips(after fabrication)are with different degrees of redundancy initially,so as to obtain results which better reflect real situations. 相似文献