
1.

An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications

Naga Vydyanathan Sriram Krishnamoorthy Gerald M. Sabin Umit V. Catalyurek Tahsin Kurc Ponnuswamy Sadayappan Joel H. Saltz 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(8):1158-1172

Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.

2.

Optimizing latency and throughput of application workflows on clusters

Naga Vydyanathan Umit Catalyurek Tahsin Kurc Ponnuswamy Sadayappan Joel Saltz 《Parallel Computing》2011,37(10-11):694-712

Scheduling, in many application domains, involves optimization of multiple performance metrics. For example, application workflows with real-time constraints have strict throughput requirements and also desire a low latency or response time. In this paper, we present a novel algorithm for the scheduling of workflows that act on a stream of input data. Our algorithm focuses on the two performance metrics, latency and throughput, and minimizes the latency of workflows while satisfying strict throughput requirements. We also describe steps to use the above approach to solve the problem of meeting latency requirements while maximizing throughput. We leverage pipelined, task and data parallelism in a coordinated manner to meet these objectives and investigate the benefit of task duplication in alleviating communication overheads in the pipelined schedule for different workflow characteristics. The proposed algorithm is designed for a realistic bounded multi-port communication model, where each processor can simultaneously communicate with at most k distinct processors. Experimental evaluation using synthetic benchmarks as well as those derived from real applications shows that our algorithm consistently produces lower latency schedules that meet throughput requirements, even when previously proposed schemes fail.

3.

Large-scale biomedical image analysis in grid environments

Vijay S Kumar Benjamin Rutt Tahsin Kurc Umit V Catalyurek Tony C Pan Sunny Chow Stephan Lamont Maryann Martone Joel H Saltz 《IEEE transactions on information technology in biomedicine》2008,12(2):154-161

This paper presents the application of a component-based Grid middleware system for processing extremely large images obtained from digital microscopy devices. We have developed parallel, out-of-core techniques for different classes of data processing operations employed on images from confocal microscopy scanners. These techniques are combined into a data preprocessing and analysis pipeline using the component-based middleware system. The experimental results show that: 1) our implementation achieves good performance and can handle very large datasets on high-performance Grid nodes, consisting of computation and/or storage clusters and 2) it can take advantage of Grid nodes connected over high-bandwidth wide-area networks by combining task and data parallelism.

4.

Parallel four‐dimensional Haralick texture analysis for disk‐resident image datasets

Brent Woods Bradley Clymer Johannes Heverhagen Michael Knopp Joel Saltz Tahsin Kurc 《Concurrency and Computation》2007,19(1):65-87

Texture analysis is one possible method of detecting features in biomedical images. During texture analysis, texture‐related information is found by examining local variations in image brightness. Four‐dimensional (4D) Haralick texture analysis is a method that extracts local variations along space and time dimensions and represents them as a collection of 14 statistical parameters. However, application of the 4D Haralick method on large time‐dependent image datasets is hindered by data retrieval, computation, and memory requirements. This paper describes a parallel implementation using a distributed component‐based framework of 4D Haralick texture analysis on PC clusters. The experimental performance results show that good performance can be achieved for this application via combined use of task‐ and data‐parallelism. In addition, we show that our 4D texture analysis implementation can be used to classify imaged tissues. Copyright © 2006 John Wiley & Sons, Ltd.

5.

LONG‐TERM VEGETATION DYNAMICS AFTER HIGH‐DENSITY SEEDLING ESTABLISHMENT: IMPLICATIONS FOR RIPARIAN RESTORATION AND MANAGEMENT

D. P. Bunting S. Kurc M. Grabau 《River Research and Applications》2013,29(9):1119-1130

Human disturbances have contributed to the deterioration of many western US rivers in the past century. Cottonwood‐willow communities, present historically along the Colorado River, protect watersheds and provide wildlife habitat, but are now among the most threatened forests. As a result, restoration efforts have increased to re‐establish and maintain cottonwood‐willow stands. While successful establishment has been observed using multiple strategies with varying investments, few projects are evaluated to quantify efficacy and determine long‐term sustainability. We monitored a seeded cottonwood‐willow site over a five‐year period beginning in 2007, with particular interest in how density affected vegetation diversity and stand structure over time. Fremont cottonwood (Populus fremontii) and volunteer tamarisk (Tamarix ramosissima) were the only abundant riparian trees at the site after one year. P. fremontii, compared to T. ramosissima, had higher growth rates, lower mortality, and dominated overstory and total cover each year. Vegetation diversity decreased from 2007–2009, but was similar from 2009–2011 as a result of decreased herbaceous and increased shrub species richness. Diversity was highest in the lowest density class (1–12 stems/m²), but similar among all other classes (13–24, 25–42, 43+). High initial woody species densities resulted in single‐stemmed trees with depressed terminal and radial growths. Allometry, relating height to DBH at different densities, could prove to be an important tool for long‐term restoration management and studying habitat suitability. Understanding long‐term trends at densely‐planted or seeded sites can benefit restoration managers who aim to establish specific community structure and vegetation diversity for wildlife habitat. Copyright © 2012 John Wiley & Sons, Ltd.

6.

Visualization of large data sets with the Active Data Repository (cited by: 1 total, 0 self-citations, 1 by others)

Kurc T. Catalyurek U. Chialin Chang Sussman A. Saltz J. 《Computer Graphics and Applications, IEEE》2001,21(4):24-33

We implement ray-casting-based volume rendering and isosurface rendering methods using the Active Data Repository (ADR) for visualizing out-of-core data sets. We have developed the ADR object-oriented framework to provide support for applications that employ range queries with user-defined mapping and aggregation operations on large-scale multidimensional data. ADR targets distributed-memory parallel machines with one or more disks attached to each node. It is designed as a set of modular services implemented in C++, which can be customized for application-specific processing. The ADR runtime system supports common operations such as memory management, data retrieval, and scheduling of processing across a parallel machine.

7.

Optimizing the execution of multiple data analysis queries on parallel and distributed environments

Andrade H. Kurc T. Sussman A. Saltz J. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(6):520-532

We investigate techniques for efficiently executing multiquery workloads from data and computation-intensive applications in parallel and/or distributed computing environments. In this context, we describe a database optimization framework that supports data and computation reuse, query scheduling, and active semantic caching to speed up the evaluation of multiquery workloads. Its most striking feature is the ability to optimize the execution of queries in the presence of application-specific constructs by employing a customizable data and computation reuse model. Furthermore, we discuss how the proposed optimization model is flexible enough to work efficiently irrespective of the parallel/distributed environment underneath. In order to evaluate the proposed optimization techniques, we present experimental evidence using real data analysis applications. For this purpose, a common implementation for the queries under study was provided according to the database optimization framework and deployed on top of three distinct experimental configurations: a shared memory multiprocessor, a cluster of workstations, and a distributed computational Grid-like environment.

8.

Experimental Study of Inconel 718 Surface Treatment by Edge Robotic Deburring with Force Control

A. Burghardt D. Szybicki K. Kurc M. Muszyńska J. Mucha 《Strength of Materials》2017,49(4):594-604

9.

The virtual microscope (cited by: 1 total, 0 self-citations, 1 by others)

Catalyurek U. Beynon M.D. Chialin Chang Kurc T. Sussman A. Saltz J. 《IEEE transactions on information technology in biomedicine》2003,7(4):230-248

We present the design and implementation of the virtual microscope, a software system employing a client/server architecture to provide a realistic emulation of a high power light microscope. The system provides a form of completely digital telepathology, allowing simultaneous access to archived digital slide images by multiple clients. The main problem the system targets is storing and processing the extremely large quantities of data required to represent a collection of slides. The virtual microscope client software runs on the end user's PC or workstation, while database software for storing, retrieving and processing the microscope image data runs on a parallel computer or on a set of workstations at one or more potentially remote sites. We have designed and implemented two versions of the data server software. One implementation is a customization of a database system framework that is optimized for a tightly coupled parallel machine with attached local disks. The second implementation is component-based, and has been designed to accommodate access to and processing of data in a distributed, heterogeneous environment. We also have developed caching client software, implemented in Java, to achieve good response time and portability across different computer platforms. The performance results presented show that the Virtual Microscope system scales well, so that many clients can be adequately serviced by an appropriately configured data server.

10.

Constrained mirror placement on the Internet (cited by: 2 total, 0 self-citations, 2 by others)

Cronin E. Jamin S. Cheng Jin Kurc A.R. Raz D. Shavitt Y. 《Selected Areas in Communications, IEEE Journal on》2002,20(7):1369-1382

Web content providers and content distribution network (CDN) operators often set up mirrors of popular content to improve performance. Due to the scale and decentralized administration of the Internet, companies have a limited number of sites (relative to the size of the Internet) where they can place mirrors. We formalize the mirror placement problem as a case of constrained mirror placement, where mirrors can only be placed on a preselected set of candidates. We study performance improvement in terms of client round-trip time (RTT) and server load when clients are clustered by the autonomous systems (AS) in which they reside. Our results show that, regardless of the mirror placement algorithm used, increasing the number of mirror sites (under the constraint) is effective in reducing client-to-server RTT and server load only within a surprisingly small range of values. In this range, we show that greedy placement performs the best.
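
The greedy placement mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical table `rtt[client][site]` of round-trip times in milliseconds, a column named `origin` for the original server, and total client RTT as the only objective (server load is ignored). At each step it adds the candidate site that lowers the total RTT the most and stops early when no candidate helps.

```python
from typing import Dict, List


def greedy_mirror_placement(
    rtt: Dict[str, Dict[str, float]],
    candidates: List[str],
    k: int,
    origin: str = "origin",
) -> List[str]:
    """Choose up to k mirrors from the preselected candidates, greedily.

    rtt[client][site] is an assumed RTT table; every client row must
    contain the origin server and every candidate site.
    """
    # Before any mirror exists, each client is served by the origin.
    best = {client: row[origin] for client, row in rtt.items()}
    chosen: List[str] = []
    remaining = list(candidates)

    for _ in range(k):
        pick, pick_total = None, sum(best.values())
        for site in remaining:
            # Total RTT if this site were added and each client used its nearest server.
            total = sum(min(best[c], rtt[c][site]) for c in rtt)
            if total < pick_total:
                pick, pick_total = site, total
        if pick is None:
            break  # no remaining candidate reduces the total RTT
        chosen.append(pick)
        remaining.remove(pick)
        for c in rtt:
            best[c] = min(best[c], rtt[c][pick])
    return chosen


# Made-up RTTs for three client clusters (for example, one per AS).
rtt_ms = {
    "as1": {"origin": 100, "a": 10, "b": 90},
    "as2": {"origin": 100, "a": 95, "b": 20},
    "as3": {"origin": 100, "a": 30, "b": 80},
}
print(greedy_mirror_placement(rtt_ms, ["a", "b"], k=2))  # ['a', 'b']
```

Restricting the search to `candidates` is what makes the placement constrained; swapping the objective for a load-aware cost would only change the two `sum(...)` expressions.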