Similar Documents
20 similar documents found (search time: 15 ms)
1.
We introduce a new composite adaptive Algebraic Multigrid (composite \(\alpha \) AMG) method to solve systems of linear equations without a priori knowledge or assumptions on the characteristics of near-null components of the AMG preconditioned problem, referred to as algebraic smoothness. Our version of \(\alpha \) AMG is a composite solver built through a bootstrap strategy aimed at obtaining a desired convergence rate. The coarsening process employed to build each new solver component relies on compatible relaxation and on a pairwise aggregation scheme based on weighted matching in a graph, an approach successfully exploited in reordering algorithms for sparse direct methods to enhance diagonal dominance. The proposed compatible matching process replaces the commonly used characterization of strength of connection in both the coarse space selection and the interpolation scheme. The goal is to design a method leading to scalable AMG for a wide class of problems that go beyond the standard elliptic Partial Differential Equations (PDEs). In the present work, we introduce the method and demonstrate its potential when applied to symmetric positive definite linear systems arising from finite element discretization of highly anisotropic elliptic PDEs on structured and unstructured meshes. We also report on some preliminary tests for 2D and 3D elasticity problems as well as on problems from the University of Florida Sparse Matrix Collection.

2.
Applied Soft Computing, 2008, 8(1): 350–370
Differential Evolution (DE) has gathered a reputation for being a powerful yet simple global optimiser, continually outperforming many existing stochastic and direct-search global optimisation techniques. It is, however, well established that DE is particularly sensitive to its control parameters, most notably the mutation weighting factor F. This sensitivity is further studied here, and a simple randomised self-adaptive scheme is proposed for the DE mutation weighting factor F. The performance of this algorithm is studied on several benchmark problems and applied to a difficult control systems design case study.
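The randomised weighting factor idea can be sketched in a few lines. The per-mutation uniform draw of F below is an illustrative stand-in for a randomised self-adaptive scheme, not the paper's exact rule; the function names, F range and CR value are assumptions:

```python
import random

def de_step(pop, fitness, f_lo=0.4, f_hi=1.0, cr=0.9):
    """One DE/rand/1/bin generation with a randomised mutation factor F.

    Instead of a fixed F, a fresh value is drawn uniformly from
    [f_lo, f_hi] for every mutation (illustrative randomised scheme).
    """
    dim = len(pop[0])
    new_pop = []
    for i, x in enumerate(pop):
        # three distinct members, none equal to the target
        a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
        f = random.uniform(f_lo, f_hi)            # randomised weighting factor
        mutant = [a[d] + f * (b[d] - c[d]) for d in range(dim)]
        j_rand = random.randrange(dim)            # at least one gene from the mutant
        trial = [mutant[d] if (random.random() < cr or d == j_rand) else x[d]
                 for d in range(dim)]
        new_pop.append(trial if fitness(trial) <= fitness(x) else x)
    return new_pop

# Minimise the 2-D sphere function as a toy benchmark.
random.seed(1)
sphere = lambda v: sum(t * t for t in v)
pop = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(20)]
for _ in range(100):
    pop = de_step(pop, sphere)
best = min(pop, key=sphere)
```

After 100 generations the population collapses onto the optimum at the origin.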

3.
In A Subset of Concurrent Prolog and Its Interpreter (1983), E. Y. Shapiro introduces the language Concurrent Prolog. In his presentation, the problem of guaranteeing bounded waiting during a merge operation is used as a programming example. Solutions are proposed for binary and n-ary merges. The solutions are, however, completely dependent on specific operational characteristics of a Concurrent Prolog machine or interpreter. This paper presents an alternative approach in which the property of bounded waiting is incorporated into the semantics of the programs, demonstrable given only the computational model of the language. The solution strategy is to utilize the familiar systems programming techniques of block-on-input and busy-wait. This approach requires that the language be augmented with a metalogical predicate analogous to the var(_) predicate of Sequential Prolog. The resultant programs are interesting and illustrative examples of Concurrent Prolog as a programming language.

4.
This paper presents a Differential Evolution algorithm with self-adaptive trial vector generation strategy and control parameters (SspDE) for global numerical optimization over continuous space. In the SspDE algorithm, each target individual has an associated strategy list (SL), a mutation scaling factor F list (FL), and a crossover rate CR list (CRL). During the evolution, a trial individual is generated by using a strategy, F, and CR taken from the lists associated with the target vector. If the obtained trial individual is better than the target vector, the used strategy, F, and CR will enter a winning strategy list (wSL), a winning F list (wFL), and a winning CR list (wCRL), respectively. After a given number of iterations, the FL, CRL or SL will be refilled, with high probability, by selecting elements from wFL, wCRL and wSL, or otherwise with randomly generated values. In this way, both the trial vector generation strategy and its associated parameters can be gradually self-adapted to match different phases of evolution by learning from previous successful experience. Extensive computational simulations and comparisons are carried out on a set of 19 benchmark problems from the literature. The computational results show that overall the SspDE algorithm performs better than state-of-the-art differential evolution variants.
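The list-refill mechanism described above can be sketched generically. The pool size, the 0.9 refill probability and the class name are illustrative assumptions, not the paper's exact settings:

```python
import random

class WinningList:
    """Parameter pool refilled from previously successful values.

    Sketch of an SspDE-style FL/wFL pair: values that produced winning
    trials are recorded, and on refill each slot is replaced, with high
    probability, by a recorded winner, otherwise by a fresh random value.
    """
    def __init__(self, size, sampler):
        self.sampler = sampler
        self.pool = [sampler() for _ in range(size)]
        self.winners = []

    def draw(self):
        """Pick a parameter value for the next trial generation."""
        return random.choice(self.pool)

    def record_win(self, value):
        """Remember a value whose trial beat its target vector."""
        self.winners.append(value)

    def refill(self, p_winner=0.9):
        """Rebuild the pool, biased toward past winners."""
        if self.winners:
            self.pool = [random.choice(self.winners)
                         if random.random() < p_winner else self.sampler()
                         for _ in self.pool]
        self.winners = []

random.seed(0)
fl = WinningList(10, lambda: random.uniform(0.1, 1.0))
fl.record_win(0.5)   # pretend a trial generated with F = 0.5 won
fl.refill()
```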

5.
Differential Evolution (DE) is arguably one of the most powerful stochastic real-parameter optimization algorithms of current interest. DE operates through computational steps similar to those employed by a standard Evolutionary Algorithm (EA). However, unlike traditional EAs, the DE variants perturb the current-generation population members with the scaled differences of randomly selected and distinct population members. Therefore, no separate probability distribution has to be used, which makes the scheme self-organizing in this respect. The Scale Factor (F) and Crossover Rate (Cr) are two very important control parameters of DE, since the former regulates the step size taken while mutating a population member and the latter controls the number of search variables inherited by an offspring from its parent during recombination. This article describes a very simple yet effective adaptation technique for tuning both F and Cr, on the run, without any user intervention. The adaptation strategy is based on the objective function values of the individuals in the DE population. Comparison with the best-known and expensive variants of DE over fourteen well-known numerical benchmarks and one real-life engineering problem reflects the superiority of the proposed parameter tuning scheme in terms of accuracy, convergence speed, and robustness.
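One common way to base F on objective values is a linear map over the population's fitness range: better individuals get a small F for local refinement, worse ones a large F for exploration. This is a hedged illustration of the general idea, not the article's exact rule; all bounds are assumptions:

```python
def adapt_f(f_val, f_best, f_worst, f_min=0.4, f_max=0.9):
    """Map an individual's objective value (minimisation) to a scale factor F.

    Linear interpolation between f_min (for the best individual) and
    f_max (for the worst). Illustrative fitness-based adaptation only.
    """
    if f_worst == f_best:                 # degenerate population: use midpoint
        return (f_min + f_max) / 2
    frac = (f_val - f_best) / (f_worst - f_best)
    return f_min + (f_max - f_min) * frac
```

For a population whose objective values span [1, 3], the best member mutates with F = 0.4, the worst with F = 0.9, and a median member with F = 0.65.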

6.
7.
CARMEL-2 is a high performance VLSI uniprocessor, tuned for Flat Concurrent Prolog (FCP). CARMEL-2 shows almost 5-fold speedup over its predecessor, CARMEL-1, and it achieves 2,400 KLIPS executing append. This high execution rate was gained as a result of an optimized design, based on an extensive architecture-oriented execution analysis of FCP and the lessons learned with CARMEL-1. CARMEL-2 is a RISC processor in its character and performance. The instruction set includes only 29 carefully selected instructions. The 10 special instructions, the prudent implementation and pipeline scheme, as well as sophisticated mechanisms such as intelligent dereference, distinguish CARMEL-2 as a RISC processor for FCP.

8.
Energy-centric DVFS controlling method for multi-core platforms   (total citations: 1; self-citations: 0; citations by others: 1)
Dynamic voltage and frequency scaling (DVFS) is a well-known and effective technique for reducing energy consumption in modern processors. However, accurately predicting the effect of frequency scaling on system performance is a challenging problem in real environments. In this paper, we propose a realistic DVFS performance prediction method and a practical DVFS control policy (eDVFS) that aims to minimize total energy consumption in multi-core platforms. We also present power consumption estimation models for the CPU and DRAM by exploiting a hardware energy monitoring unit. We implemented eDVFS in Linux, and our evaluation results show that eDVFS can save a substantial amount of energy compared with the Linux "ondemand" CPU governor in diverse environments.
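A toy model shows why the energy-minimising frequency is neither the lowest nor the highest: running slower cuts dynamic power but stretches runtime, so static energy grows, while memory-bound time does not speed up with frequency. All constants and the cubic dynamic-power assumption below are illustrative, not the paper's fitted estimation models:

```python
def energy_at(freq, t_cpu, t_mem, p_static=2.0, k=0.5):
    """Estimated energy (J) of a task run at `freq` (GHz).

    Assumptions (illustrative only): CPU-bound time scales as 1/freq,
    memory-bound time is frequency-insensitive, and dynamic power grows
    roughly cubically with frequency (f * V^2 with V proportional to f).
    """
    runtime = t_cpu / freq + t_mem
    p_dyn = k * freq ** 3
    return (p_static + p_dyn) * runtime

def pick_frequency(freqs, t_cpu, t_mem):
    """eDVFS-style choice: the available frequency with minimal predicted energy."""
    return min(freqs, key=lambda f: energy_at(f, t_cpu, t_mem))

# A task with 10 s of CPU-bound work (at 1 GHz) and 5 s of memory-bound work.
best_f = pick_frequency([0.8, 1.2, 1.6, 2.0, 2.4], t_cpu=10.0, t_mem=5.0)
```

Under these constants the sweet spot is an intermediate frequency: 0.8 GHz wastes static energy on a long runtime, while 2.0 GHz and above pay heavily in dynamic power.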

9.
This paper presents LS-VisionDraughts: an efficient unsupervised evolutionary learning system for Checkers whose contribution is to automate the process of selecting an appropriate representation for the board states – by means of Evolutionary Computation – while keeping a deep look-ahead (search depth) at the moment of choosing a move. It corresponds to a Multi Layer Perceptron Neural Network player whose weights are updated through an evaluation function that is automatically adjusted by means of Temporal Difference methods. A Genetic Algorithm automatically chooses a concise and efficient set of functions describing various scenarios associated with Checkers – called features – to represent the board states in the input layer of the Neural Network. This means that each individual of the Genetic Algorithm is a candidate set of features associated with a distinct Multi Layer Perceptron Neural Network. The output of the Neural Network is a real number (prediction) that indicates to what extent the input state is favorable to the agent's performance. In LS-VisionDraughts, a particular version of the Alpha-Beta search algorithm, called fail-soft Alpha-Beta, combined with Transposition Tables, Iterative Deepening and tree ordering, uses this prediction value to choose the best move for the current board state. The best individual is chosen by means of numerous tournaments involving these selfsame Neural Networks. The architecture of LS-VisionDraughts is inspired by the agent NeuroDraughts. However, the former system enhances the performance of the latter by automating the selection of the features through Evolutionary Computation and by replacing its Minimax search algorithm with the improved search strategy summarized above. This procedure allows for a 95 % reduction in the search runtime. Further, it remarkably increases the search tree depth.
The results obtained from evaluative tournaments confirm the advances of LS-VisionDraughts compared to its opponents. It is however important to point out that LS-VisionDraughts learns practically without human supervision, contrary to the current automatic world champion Chinook, which was built in a strongly supervised manner.
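The fail-soft Alpha-Beta search mentioned above can be sketched on a toy game tree. The nested-list encoding and integer leaves are illustrative only; LS-VisionDraughts evaluates Checkers positions with its neural network instead:

```python
def fail_soft_alphabeta(node, alpha, beta, maximizing=True):
    """Fail-soft Alpha-Beta on a game tree given as nested lists (leaves are ints).

    'Fail-soft' means the returned value may lie outside [alpha, beta],
    which gives tighter bounds for transposition-table reuse than the
    fail-hard variant that clamps to the window.
    """
    if not isinstance(node, list):
        return node                               # leaf: static evaluation
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, fail_soft_alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                             # cutoff; best may exceed beta
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, fail_soft_alphabeta(child, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

# Two-ply example: MAX chooses among three MIN nodes.
tree = [[3, 5], [6, 9], [1, 2]]
value = fail_soft_alphabeta(tree, float("-inf"), float("inf"))
```

On this tree the third subtree is cut off after its first leaf, yet the root value 6 is exact.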

10.
A new searching approach, Selected Jumping Searching (SJS), in which the length of every searching step from one node to another along a path is much longer than one, is proposed in this paper. In addition to all the problems which can be solved by GPS or MPS, SJS can also solve problems such as the N Queens problem and the N Puzzle problem, on which GPS and MPS fail when N is large and whose computational complexity is exponential under the general searching approach or MPS. The searching algorithms of SJS, algorithms C, C0 (or C0') and C*, whose computational complexities are only polynomial and linear respectively, are also proposed in this paper. Finally, experimental results are given for the Five Hundred Queens problem, for more than two thousand queens, and for the N Puzzle problem (where N is more than one thousand). Obtaining the first few solutions of the Fifty Queens problem, or building the Macro Table of MPS for the 35²−1 Puzzle problem, would take 10^25 years even on a 10^15-ops supercomputer using the general searching approach. But using the proposed approach and algorithms (whose computational complexities are O(N) and O(N^(3/2)) respectively), 4000 solutions of the Five Hundred Queens problem were obtained in about 227 minutes on an HP 9000/835, and the average time to solve the 35²−1 Puzzle problem from an arbitrary problem state is less than one minute on an HP 9000/300. SJS is a searching approach obtained as a mapping from the Macro Transformation approach.
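SJS itself is not reproduced here, but the classic min-conflicts repair heuristic, in which each step jumps a conflicted queen directly to a minimum-conflict row instead of exploring square by square, illustrates why long-step heuristic search makes large N-Queens instances tractable:

```python
import random

def min_conflicts_queens(n, max_steps=200000, seed=1):
    """Solve N-Queens by iterative repair (min-conflicts heuristic).

    rows[c] holds the row of the queen in column c, so column clashes are
    impossible by construction. Each step picks a conflicted queen and
    'jumps' it straight to a row with the fewest conflicts. Shown only to
    illustrate long-step searching; this is not the paper's SJS algorithm.
    """
    rng = random.Random(seed)
    rows = [rng.randrange(n) for _ in range(n)]

    def conflicts(col, row):
        return sum(1 for c in range(n) if c != col and
                   (rows[c] == row or abs(rows[c] - row) == abs(c - col)))

    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(c, rows[c]) > 0]
        if not conflicted:
            return rows                            # every queen is safe
        col = rng.choice(conflicted)
        counts = [conflicts(col, r) for r in range(n)]
        m = min(counts)
        rows[col] = rng.choice([r for r in range(n) if counts[r] == m])
    return None                                    # gave up within the step budget

solution = min_conflicts_queens(50)
```

Plain backtracking degrades badly as N grows, while iterative repair typically needs only on the order of N jumps regardless of board size.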

11.
This paper proposes a distributed address configuration scheme for a Mobile Ad hoc Network (MANET). The architecture for a MANET, the algorithm of constructing a MANET, and the algorithm for calculating a unique address space for the assignment are proposed. In the architecture, a common node has a unique address space for the assignment, and an isolated node can acquire a unique address from a neighboring common node without performing duplicate address detection. In this way, the address configuration task is distributed around common nodes. In this scheme, the control packets used for address configuration are exchanged within a one-hop scope, so the scalability is enhanced. This paper also presents an address recovery algorithm that can effectively retrieve the address resources released by failed nodes and the MANET merging/partitioning algorithm that can ensure a node’s address uniqueness. This paper compares the performance parameters of the proposed scheme and the existing schemes, including strong duplicate address detection and prime dynamic host configuration protocol, and the comparative results show that the address configuration cost of the proposed scheme is lower and the delay is shorter.
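One simple way a common node can hand a disjoint assignable address space to a joining node without any duplicate address detection is interval halving. This sketch is illustrative of the general idea and is not necessarily the paper's exact allocation rule:

```python
def split_space(space):
    """Halve a free address block [lo, hi).

    The assigning common node keeps one half and hands the other to the
    newcomer, so the two assignable spaces are disjoint by construction
    and no duplicate address detection is needed (illustrative scheme).
    """
    lo, hi = space
    if hi - lo < 2:
        raise ValueError("address space exhausted")
    mid = (lo + hi) // 2
    return (lo, mid), (mid, hi)   # (keeper's half, newcomer's half)

# A common node owning [0, 256) admits an isolated node.
keep, give = split_space((0, 256))
```

Repeated splits let the configuration task spread hop by hop: each newly configured node becomes a common node that can itself admit neighbors from its own half.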

12.
13.
Multi-core processors and clusters of multi-core processors are ubiquitous. They provide scalable performance yet introduce complex and low-level programming models for shared and distributed memory programming. Thus, fully exploiting the potential of shared and distributed memory parallelization can be a tedious and error-prone task: programmers must take care of low-level threading and communication (e.g. message passing) details. In order to assist programmers in developing efficient and reliable parallel applications, Algorithmic Skeletons have been proposed. They encapsulate well-defined, frequently recurring parallel and distributed programming patterns, thus shielding programmers from low-level aspects of parallel and distributed programming. In this paper we present the design and implementation of the well-known Farm skeleton. In order to address the hybrid architecture of multi-core clusters, we present a two-tier implementation built on top of MPI and OpenMP. On the basis of three benchmark applications, including a simple ray tracer, an interacting particle system, and an application for calculating the Mandelbrot set, we illustrate the advantages of both skeletal programming in general and this two-tier approach in particular.
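The inner, shared-memory tier of a Farm skeleton can be sketched with a worker pool: a set of workers consumes independent tasks and the results are gathered in task order. In the paper's two-tier design the outer tier additionally distributes task blocks over MPI processes, which is omitted here; the Mandelbrot task and all sizes are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

def mandelbrot_row(args):
    """Compute iteration counts for one row of the Mandelbrot set (one task)."""
    y, width, height, max_iter = args
    row = []
    for x in range(width):
        c = complex(-2.0 + 3.0 * x / width, -1.5 + 3.0 * y / height)
        z, n = 0j, 0
        while abs(z) <= 2 and n < max_iter:
            z = z * z + c
            n += 1
        row.append(n)
    return row

def farm(worker, tasks, n_workers=4):
    """Farm skeleton, shared-memory tier only: a pool of workers consumes
    independent tasks; results come back in task order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, tasks))

# Render a tiny 32x32 Mandelbrot image, one row per task.
image = farm(mandelbrot_row, [(y, 32, 32, 50) for y in range(32)])
```

The application code only supplies the task function; threading, scheduling and result collection stay inside the skeleton, which is exactly the separation the paper advocates.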

14.
Many real-world problems may be expressed as nonlinear constrained optimization problems (CNOP). For this kind of problem, the set of constraints specifies the feasible solution space. In recent decades, several algorithms have been proposed and developed for tackling CNOP. In this paper, we present an extension of the Musical Composition Method (MMC) for solving constrained optimization problems. MMC was proposed by Mora et al. (Artif Intell Rev 1–15, doi:10.1007/s10462-011-9309-8, 2012a) and is based on a social creativity system used to compose music. We evaluated and analyzed the performance of MMC on 12 CNOP benchmark cases. The experimental results demonstrate that MMC significantly improves on the global performance of the other tested metaheuristics on some benchmark functions.

15.
The Andorra model is a parallel execution model of logic programs which exploits the dependent and-parallelism and or-parallelism inherent in logic programming. We present a flat subset of a language based on the Andorra model, henceforth called Andorra Prolog, that is intended to subsume both Prolog and the committed-choice languages. Flat Andorra, in addition to don't-know and don't-care nondeterminism, supports control of or-parallel split, synchronisation on variables, and selection of clauses. We show the operational semantics of the language and its applicability in the domain of committed-choice languages. As an example of the expressiveness of the language, we describe a method for communication between objects by time-stamped messages, which is suitable for expressing distributed discrete event simulation applications. This method depends critically on the ability to express don't-know nondeterminism and thus cannot easily be expressed in a committed-choice language.

16.
Process migration provides many benefits for parallel environments, including dynamic load balancing, data access locality and fault tolerance. This paper describes an in-memory application-level checkpoint-based migration solution for MPI codes that uses the Hierarchical Data Format 5 (HDF5) to write the checkpoint files. The main features of the proposed solution are: transparency for the user, achieved through the use of CPPC (ComPiler for Portable Checkpointing); portability, as the application-level approach makes the solution suitable for any MPI implementation and operating system, while the HDF5 file format enables restart on different architectures; and high performance, by saving the checkpoint files to memory instead of disk through the use of HDF5 in-memory files. Experimental results show that the in-memory approach significantly reduces the I/O cost of the migration process.
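The save/restore round trip of an in-memory checkpoint can be illustrated without HDF5. The sketch below uses pickle and a byte buffer purely to show the idea; the actual solution writes HDF5 in-memory ("core" driver) files so the checkpoint stays portable across architectures:

```python
import io
import pickle

def checkpoint_in_memory(state):
    """Serialise application state to an in-memory buffer instead of disk.

    Stand-in for an application-level checkpointer: the returned bytes
    could be shipped over the network to the migration target.
    """
    buf = io.BytesIO()
    pickle.dump(state, buf)
    return buf.getvalue()

def restart(blob):
    """Rebuild the application state on the destination process."""
    return pickle.load(io.BytesIO(blob))

# A toy solver state: iteration counter plus a chunk of the data grid.
state = {"iteration": 42, "grid": [0.0, 1.5, 3.0]}
blob = checkpoint_in_memory(state)
restored = restart(blob)
```

Keeping the checkpoint in memory removes the disk write from the critical path of migration, which is where the reported I/O savings come from.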

17.
In the design phase of business and IT system development, it is desirable to predict the properties of the system-to-be. A number of formalisms to assess qualities such as performance, reliability and security have therefore previously been proposed. However, existing prediction systems do not allow the modeler to express uncertainty with respect to the design of the considered system. Yet, in contemporary business, the high rate of change in the environment leads to uncertainties about present and future characteristics of the system, so significant that ignoring them becomes problematic. In this paper, we propose a formalism, the Predictive, Probabilistic Architecture Modeling Framework (P2AMF), capable of advanced and probabilistically sound reasoning about business and IT architecture models, given in the form of Unified Modeling Language class and object diagrams. The proposed formalism is based on the Object Constraint Language (OCL). To OCL, P2AMF adds a probabilistic inference mechanism. The paper introduces P2AMF, describes its use for system property prediction and assessment and proposes an algorithm for probabilistic inference.
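Probabilistic inference over an architecture model commonly reduces to Monte Carlo evaluation: uncertain attributes are sampled, an OCL-like constraint is evaluated per sample, and its satisfaction probability is estimated. The two-server scenario and the availability numbers below are invented for illustration and are not from the paper:

```python
import random

def predict_availability(n_samples=10000, seed=0):
    """Monte Carlo sketch of P2AMF-style inference.

    Two servers with (assumed) availabilities 0.9 and 0.8 back one
    service; the constraint 'at least one server is up' is evaluated on
    each sampled world and its probability estimated from the hit rate.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        server_a_up = rng.random() < 0.9   # uncertain attribute sample
        server_b_up = rng.random() < 0.8   # uncertain attribute sample
        if server_a_up or server_b_up:     # the OCL-like constraint
            hits += 1
    return hits / n_samples

p = predict_availability()
```

Analytically the answer is 1 − 0.1 × 0.2 = 0.98, and the estimate converges to it as the sample count grows.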

18.
With the rapid growth of video surveillance applications, the storage energy consumption of video surveillance has become more noticeable, but existing energy-saving methods for massive storage systems mostly concentrate on data centers dominated by random accesses. Video surveillance storage has an inherent access pattern and requires a special energy-saving approach to save more energy. An energy-efficient data layout for video surveillance, Semi-RAID, is proposed. It adopts a partial-parallelism strategy, which partitions disk data into different groups and implements parallel accesses within each group. Grouping makes it possible to keep only some disks working while the rest stay idle, and inner-group parallelism provides the performance guarantee. In addition, a greedy strategy for address allocation is adopted to effectively prolong the idle periods of the disks, and dedicated cache strategies are used to filter out the small amount of random accesses. The energy-saving efficiency of Semi-RAID is verified by a simulated video surveillance system consisting of 32 cameras with D1 resolution. The experiments show that Semi-RAID can save 45 % more energy than Hibernator, 80 % more than PARAID, 33 % more than MAID, and 79 % more than eRAID-5, while providing single-disk fault tolerance and meeting performance requirements such as throughput.

19.
This paper presents a novel solution based on Extreme Learning Machine (ELM) for multiclass XML document classification. ELM is a generalized Single-hidden Layer Feedforward Network (SLFN) with extremely fast learning capacity. An improved vector model, DSVM (Distribution based Structured Vector Model), is proposed to represent XML documents with more structural information and more precise semantic information. The XML document classifiers are built on PV-ELM (Probability based Voting ELM) with a postprocessing method, ε-RCC (ε-Revoting of Confusing Classes), to refine the voting results. To evaluate the overall performance of this solution, a series of experiments is conducted on two real datasets of online news feeds. The experimental results show that DSVM represents XML documents more effectively, and that PV-ELM with ε-RCC achieves higher accuracy than the original ELM algorithm for multiclass classification.
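A basic ELM (a random, fixed hidden layer plus output weights fitted by least squares) can be written in a few lines of NumPy. This sketches the generic ELM algorithm only, not the paper's PV-ELM voting variant or the DSVM document representation; the hidden-layer size and the XOR toy data are assumptions:

```python
import numpy as np

def elm_train(X, y, n_hidden=40, seed=0):
    """Train a basic Extreme Learning Machine.

    Input weights W and biases b are random and never updated; only the
    output weights beta are learned, via the Moore-Penrose pseudo-inverse.
    This one-shot least-squares fit is the source of ELM's fast training.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights (fixed)
    b = rng.standard_normal(n_hidden)                # random biases (fixed)
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ y                     # learned output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# XOR: a tiny non-linearly-separable problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])
W, b, beta = elm_train(X, y)
pred = elm_predict(X, W, b, beta)
```

With 40 hidden units and only 4 samples the least-squares fit interpolates the targets almost exactly.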

20.
Clustering analysis is one of the most commonly used data processing algorithms. For over half a century, K-means has remained the most popular clustering algorithm because of its simplicity. Recently, as data volumes continue to rise, some researchers have turned to MapReduce for high performance. However, MapReduce is unsuitable for iterative algorithms owing to repeated job restarts, big-data re-reading and shuffling. In this paper, we address the problem of processing large-scale data using the K-means clustering algorithm and propose a novel processing model in MapReduce to eliminate the iteration dependence and obtain high performance. We analyze and implement our idea. Extensive experiments on our cluster demonstrate that the proposed methods are efficient, robust and scalable.
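One K-means iteration phrased as map (assign each point to its nearest centroid), shuffle (group by centroid key) and reduce (average each group) looks like this. In a classic MapReduce deployment every iteration is a fresh job with its own data read and shuffle, which is exactly the overhead the proposed model eliminates; this in-process sketch is illustrative only:

```python
from collections import defaultdict

def kmeans_iteration(points, centroids):
    """One K-means step in map/shuffle/reduce form.

    map:    emit (index of nearest centroid, point)
    shuffle: group emitted pairs by key
    reduce: average each group into the new centroid
    """
    def nearest(p):
        return min(range(len(centroids)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(p, centroids[i])))

    groups = defaultdict(list)
    for p in points:                       # map + shuffle
        groups[nearest(p)].append(p)

    new_centroids = list(centroids)
    for i, members in groups.items():      # reduce: per-key aggregation
        dim = len(members[0])
        new_centroids[i] = tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(dim))
    return new_centroids

# Two obvious clusters around (0, 0.5) and (10, 10.5).
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
cents = kmeans_iteration(pts, [(0.0, 0.5), (9.0, 9.0)])
```

Iterating the step to convergence would normally require chaining jobs; a single-pass model instead keeps the assignments and partial sums alive between rounds.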
