Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
In edge computing scenarios, offloading some pending tasks to edge servers can reduce the load on mobile devices, improve mobile application performance, and lower device overhead. Delay-sensitive tasks are only meaningful if they complete before their deadlines. However, edge server resources are often limited: when a server simultaneously receives data transmission and processing tasks from multiple devices, tasks may queue for long periods, causing some to miss their deadlines and fail, so the performance goals of multiple devices cannot all be met. To address this, we optimize the task scheduling order on the edge server on top of computation offloading. On one hand, delay-aware task scheduling is modeled as a long-term optimization problem, and an online learning method based on combinatorial multi-armed bandits dynamically adjusts the server's scheduling order. On the other hand, because different execution orders change how much offloading improves performance, they affect the effectiveness of offloading decisions; to make the offloading policy more robust, a deep Q-learning method with perturbed rewards decides where each task executes. Simulation results show that the proposed strategy balances the objectives of multiple users while reducing overall system overhead.
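The learning loop behind the scheduling component can be sketched minimally. Below, a toy UCB1 selector chooses among whole ready-queue policies (e.g., FIFO, earliest-deadline-first, shortest-job-first) using the deadline-hit rate as the reward; the abstract describes a combinatorial bandit over task orderings, so this is only an illustration of the bandit mechanism, with all names ours:

```python
import math

class UCBPolicySelector:
    """Toy UCB1 bandit choosing among candidate scheduling policies.

    Illustrative only: the paper applies a *combinatorial* multi-armed
    bandit over task orderings; here each arm is a whole policy.
    """

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.t = 0

    def select(self):
        self.t += 1
        for i, c in enumerate(self.counts):
            if c == 0:
                return i  # play every arm once before using UCB scores
        scores = [v + math.sqrt(2 * math.log(self.t) / c)
                  for v, c in zip(self.values, self.counts)]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # reward: e.g. the fraction of tasks that met their deadlines
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Over repeated rounds the selector concentrates plays on the policy with the best observed deadline-hit rate while still occasionally exploring the others.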

2.
There has been a recent increase of interest in heterogeneous computing systems, due partly to the fact that a single parallel architecture may not be adequate for exploiting all of a program's available parallelism. In some cases, heterogeneous systems have been shown to produce higher performance for lower cost than a single large machine. However, there has been only limited work on developing techniques and frameworks for partitioning and scheduling applications across the components of a heterogeneous system. In this paper we propose a general model for describing and evaluating heterogeneous systems that considers the degree of uniformity in the processing elements and the communication channels as a measure of the heterogeneity in the system. We also propose a class of dynamic scheduling algorithms for a heterogeneous computing system interconnected with an arbitrary communication network. These algorithms execute a novel optimization technique to dynamically compute schedules based on the potentially non-uniform computation and communication costs on the processors of a heterogeneous system. A unique aspect of these algorithms is that they easily adapt to different task granularities, to dynamically varying processor and system loads, and to systems with varying degrees of heterogeneity. Our simulations are designed to facilitate the evaluation of different scheduling algorithms under varying degrees of heterogeneity. The results show that our algorithms outperform existing scheduling techniques.
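The core of scheduling under non-uniform computation and communication costs can be illustrated with a greedy earliest-completion-time mapper. This is a generic sketch of the idea, not the paper's algorithm; the per-processor speed and transfer-cost model is our simplification:

```python
def schedule_ect(tasks, proc_speed, comm_cost):
    """Greedy earliest-completion-time mapping onto heterogeneous processors.

    tasks: list of (name, work) pairs; proc_speed[p] is processor p's
    relative speed; comm_cost[p] is the data-transfer cost of shipping a
    task's input to p. Each task is placed where it finishes soonest.
    """
    ready = [0.0] * len(proc_speed)        # when each processor frees up
    placement = {}
    for name, work in tasks:
        best_p, best_finish = None, float("inf")
        for p, speed in enumerate(proc_speed):
            finish = ready[p] + comm_cost[p] + work / speed
            if finish < best_finish:
                best_p, best_finish = p, finish
        ready[best_p] = best_finish
        placement[name] = (best_p, best_finish)
    return placement
```

With two tasks of equal work on a slow and a fast processor, the mapper sends the first task to the fast processor and then, once that processor is busy, routes the second to the slow one when that finishes it no later.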

3.
We consider a cluster-based multimedia Web server that dynamically generates video units to satisfy the bit rate and bandwidth requirements of a variety of clients. The media server partitions the job into several tasks and schedules them on the backend computing nodes for processing. For stream-based applications, the main design criteria of the scheduling are to minimize the total processing time and maintain the order of media units for each outgoing stream. In this paper, we first design, implement, and evaluate three scheduling algorithms, first fit (FF), stream-based mapping (SM), and adaptive load sharing (ALS), for multimedia transcoding in a cluster environment. We determined that, due to the variability of the individual jobs/tasks, it is necessary to predict the CPU load for each multimedia task and schedule accordingly. We therefore propose an online prediction algorithm that can dynamically predict the processing time per individual task (media unit). We then propose two new load scheduling algorithms, namely, prediction-based least load first (P-LLF) and prediction-based adaptive partitioning (P-AP), which use prediction to improve performance. The performance of the system is evaluated in terms of system throughput, out-of-order rate of outgoing media streams, and load balancing overhead through real measurements using a cluster of computers. The performance of the new load balancing algorithms is compared with all other load balancing schemes to show that P-AP greatly reduces the delay jitter and achieves high throughput for a variety of workloads in a heterogeneous cluster. It strikes a good balance between the throughput and output order of the processed media units.
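The prediction-plus-dispatch idea can be sketched as follows: a per-stream-type processing-time predictor (here a simple exponentially weighted moving average) feeds a least-load-first dispatcher. This is only an illustration of the P-LLF concept; the class and method names are ours, not the paper's:

```python
class PredictiveLLF:
    """Sketch of prediction-based least-load-first dispatch.

    Per-stream-type unit cost is predicted with an EWMA; each media
    unit goes to the node with the smallest predicted outstanding load.
    """

    def __init__(self, n_nodes, alpha=0.5):
        self.load = [0.0] * n_nodes   # predicted outstanding work per node
        self.pred = {}                # stream type -> predicted unit cost
        self.alpha = alpha            # EWMA smoothing factor

    def dispatch(self, stream_type):
        cost = self.pred.get(stream_type, 1.0)   # default guess for new types
        node = self.load.index(min(self.load))
        self.load[node] += cost
        return node, cost

    def complete(self, node, stream_type, charged, actual):
        self.load[node] -= charged               # remove the provisional charge
        old = self.pred.get(stream_type, actual)
        self.pred[stream_type] = (1 - self.alpha) * old + self.alpha * actual
```

Each completion refines the cost estimate for that stream type, so later units of the same type are charged more realistically when choosing the least-loaded node.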

4.
Handling a tertiary storage device, such as an optical disk library, in the framework of a disk-based stream service model requires a sophisticated streaming model for the server, and it should consider the device-specific performance characteristics of tertiary storage. This paper discusses the design and implementation of a video server which uses tertiary storage as a source of media archiving. We have carefully designed the streaming mechanism for a server whose key functionalities include stream scheduling, disk caching and admission control. The stream scheduling model incorporates the tertiary media staging into a disk-based scheduling process, and also enhances the utilization of tertiary device bandwidth. The disk caching mechanism manages the limited capacity of the hard disk efficiently to guarantee the availability of media segments on the hard disk. The admission controller provides an adequate mechanism which decides upon the admission of a new request based on the current resource availability of the server. The proposed system has been implemented on a general-purpose operating system and it is fully operational. The design principles of the server are validated with real experiments, and the performance characteristics are analyzed. The results guide us on how servers with tertiary storage should be deployed effectively in a real environment.

5.
Video servers are essential in video‐on‐demand and other multimedia applications. In this paper, we present our high‐performance clustered CBR video server, Odyssey. Odyssey is a cluster of PCs connected by switched Ethernet. It provides efficient support for normal play and interactive browsing functions such as fast‐forward and fast‐backward. We designed a set of algorithms for scheduling, synchronization and admission control, which results in a high utilization of resources. Odyssey is able to deliver a large number of video streams. Copyright © 2003 John Wiley & Sons, Ltd.

6.
Resource management and job scheduling are essential in today's cloud computing world. Due to task scheduling and users' diverse, large-scale requests, co-located VM instances can degrade the performance of leased VM instances, and this workload further leads to resource rivalry across co-located VMs. Numerous strategies have been presented to address these problems; however, they fail to take the asynchronous nature of the cloud environment into account. To address this issue, a novel "CTA using DLFC-NN model" is proposed. The approach combines coalition theory and the DLFC-NN technique, using IRT-OPTICS for task-size clustering, digital metrology based on ionized information (DMBII) for defect detection in virtual machines (VMs), and the dynamic levy-flight hamster optimization algorithm to optimize the clusters' processing time. Because current scheduling systems rely on a number of presumptions or oversimplifications, online task scheduling is limited; a unique coalition theory is therefore applied to schedule activities efficiently. In addition, the DLFC-NN model reduces resource consumption and span time while remaining highly accurate and energy-efficient on both online and offline jobs. Whereas earlier approaches only decreased the makespan of task scheduling when optimizing the clusters' overall execution time, the DLFC-NN model solves the computation problem by using a fully weighted bipartite graph and a pseudo method to determine the fitness of the least makespan time. Compared with existing techniques, the enhanced methodology reduces scheduling cost and minimizes job completion times across different task counts.

7.
A Real-Time Computing Model for the Host Controller of an Airborne DSP System
The host controller in a modern airborne digital signal processing (DSP) system is a computing environment with strict real-time requirements, responsible for critical functions such as real-time storage, display, and control. Targeting the specific characteristics of airborne DSP systems and their concrete real-time computing needs, this paper proposes a real-time computing model based on real-time Linux technology. By combining with real-time support at the operating-system level, the model implements a complete runtime environment that includes real-time scheduling of concurrent multitasking, a cooperation mechanism for hard and soft real-time tasks, and a real-time event-driven mechanism. Compared with existing schemes based on time-sharing operating systems, its real-time behavior is more reliable and its utilization of computing resources is higher; compared with commercial real-time operating systems restricted by strict licensing, application development is more flexible and software costs are lower.

8.
Replication of information across a server cluster provides a promising way to support popular Web sites. However, a Web‐server cluster requires some mechanism for the scheduling of requests to the most available server. One common approach is to use the cluster Domain Name System (DNS) as a centralized dispatcher. The main problem is that WWW address caching mechanisms (although reducing network traffic) only let this DNS dispatcher control a very small fraction of the requests reaching the Web‐server cluster. The non‐uniformity of the load from different client domains, and the high variability of real Web workload introduce additional degrees of complexity to the load balancing issue. These characteristics make existing scheduling algorithms for traditional distributed systems not applicable to control the load of Web‐server clusters and motivate the research on entirely new DNS policies that require some system state information. We analyze various DNS dispatching policies under realistic situations where state information needs to be estimated with low computation and communication overhead so as to be applicable to a Web cluster architecture. In a model of realistic scenarios for the Web cluster, a large set of simulation experiments shows that, by incorporating the proposed state estimators into the dispatching policies, the effectiveness of the DNS scheduling algorithms can improve substantially, in particular if compared to the results of DNS algorithms not using adequate state information. This revised version was published online in August 2006 with corrections to the Cover Date.
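The state-estimation problem the abstract describes can be made concrete with a small sketch: because address caching hides most requests from the DNS, each resolution is assumed to bring a burst of follow-up requests from that client domain, and the dispatcher tracks that estimated hidden load per server. This is an illustration of the idea only; the class, parameter names, and the simple decay model are ours:

```python
class DnsDispatcher:
    """Sketch of DNS dispatching with estimated hidden load.

    Each answered resolution charges the chosen server an estimated
    burst of `hidden_weight` follow-up requests (scaled by the client
    domain's weight); estimates are periodically aged as caches expire.
    """

    def __init__(self, servers, hidden_weight=10):
        self.est = {s: 0.0 for s in servers}   # server -> estimated load
        self.hidden_weight = hidden_weight

    def resolve(self, domain_weight=1.0):
        # answer with the server whose estimated load is lowest
        server = min(self.est, key=self.est.get)
        self.est[server] += self.hidden_weight * domain_weight
        return server

    def decay(self, factor=0.5):
        # age the estimates as previously cached bursts expire
        for s in self.est:
            self.est[s] *= factor
```

Without the hidden-load charge, the dispatcher would treat every resolution as a single request and badly underestimate the load a heavy client domain sends to the server it was told about.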

9.
In this paper, we study partitioning functions for stream processing systems that employ stateful data parallelism to improve application throughput. In particular, we develop partitioning functions that are effective under workloads where the domain of the partitioning key is large and its value distribution is skewed. We define various desirable properties for partitioning functions, ranging from balance properties (memory, processing, and communication balance) to structural properties (compactness and fast lookup) and adaptation properties (fast computation and minimal migration). We introduce a partitioning function structure that is compact and develop several associated heuristic construction techniques that exhibit good balance and low migration cost under skewed workloads. We provide experimental results that compare our partitioning functions to more traditional approaches such as uniform and consistent hashing, under different workload and application characteristics, and show superior performance.
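A compact partitioning structure of the general kind discussed here can be sketched as a hash function plus an explicit override table for heavy-hitter keys: the bulk of the (large) key domain is hashed, while the few skewed keys get individually chosen placements. This is our illustration of the concept, not the paper's construction:

```python
import hashlib

class SkewAwarePartitioner:
    """Sketch of a compact, skew-aware partitioning function.

    Cold keys are hashed to a partition; hot keys carry an explicit
    override entry, so they can be placed (and migrated) individually
    without rehashing the rest of the domain.
    """

    def __init__(self, n_parts):
        self.n = n_parts
        self.overrides = {}          # heavy-hitter key -> pinned partition

    def _hash(self, key):
        digest = hashlib.md5(key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.n

    def lookup(self, key):
        return self.overrides.get(key, self._hash(key))

    def pin(self, key, part):
        """Move a skewed key to an explicitly chosen partition."""
        self.overrides[key] = part
```

The structure stays compact because its size is proportional to the number of hot keys, not to the key domain, and migrating a hot key touches only its override entry.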

10.
Among the web application server resources, the most critical for their performance are those that are held exclusively by a service request for the duration of its execution (or some significant part of it). Such exclusively held server resources become performance bottleneck points, with failures to obtain such a resource constituting a major portion of request rejections under server overload conditions. In this paper, we propose a methodology that computes the optimal pool sizes for two such critical resources: web server threads and database connections. Our methodology uses information about incoming request flow and about fine‐grained server resource utilization by service requests of different types, obtained through offline and online request profiling. In our methodology, we advocate (and show its benefits) the use of a database connection pooling mechanism that caches database connections for the duration of a service request execution (so‐called request‐wide database connection caching). We evaluate our methodology by testing it on the TPC‐W web application. Our method is able to accurately compute the optimal number of server threads and database connections, and the value of sustainable request throughput computed by the method always lies within a 5% margin of the actual value determined experimentally. Copyright © 2010 John Wiley & Sons, Ltd.
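The relationship underlying pool sizing for exclusively held resources is Little's law, L = λW: the average number of resources in use equals the arrival rate times the time each request holds the resource. The sketch below shows only that back-of-the-envelope relationship with a safety headroom factor, not the paper's optimization procedure; the headroom value is an assumption of ours:

```python
import math

def pool_size(arrival_rate, hold_time, headroom=1.2):
    """Back-of-the-envelope pool sizing via Little's law (L = lambda * W).

    arrival_rate: requests per second.
    hold_time:    seconds each request exclusively holds the resource
                  (a server thread or a database connection).
    headroom:     safety factor above the steady-state average.
    """
    return math.ceil(arrival_rate * hold_time * headroom)
```

For example, a service seeing 8 requests per second, each holding a database connection for a quarter of a second, needs on the order of 2 connections on average, so a small pool with headroom suffices.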

11.
With the large-scale deployment of online payment services, system operators need more convenient server resource management mechanisms to meet system scaling requirements. This paper proposes a lightweight, elastic resource management framework based on container technology, together with a performance diagnosis method for application services and a method for dynamically adjusting cluster size. Experiments on typical data processing scenarios of an online payment service validate the effectiveness of the approach.

12.
Efficiency of batch processing is becoming increasingly important for many modern commercial service centers, e.g., clusters and cloud computing datacenters. However, periodic resource contention has become the major performance obstacle for concurrently running applications on mainstream CMP servers. I/O contention is one such obstacle, and it can seriously impede both the co-running performance of batch jobs and the system throughput. In this paper, a dynamic I/O-aware scheduling algorithm is proposed to lower the impact of I/O contention and to enhance co-running performance in batch processing. We set up our environment on an 8-socket, 64-core server in the Dawning Linux Cluster. Fifteen workloads ranging from 8 jobs to 256 jobs are evaluated. Our experimental results show significant improvements in workload throughput, ranging from 7% to 431%, along with noticeable improvements in workload slowdown and the average runtime of each job. These results show that a well-tuned dynamic I/O-aware scheduler is beneficial for batch-mode services. It can also enhance resource utilization via throughput improvement on modern service platforms.
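One simple way to see what I/O-aware co-scheduling means in practice: rank jobs by I/O intensity and spread the heavy ones across co-running groups so they do not contend in the same group. This is a minimal sketch of the general idea, not the paper's algorithm; the intensity metric and grouping rule are our assumptions:

```python
def io_aware_batches(jobs, slots):
    """Spread I/O-heavy jobs across co-running groups (illustrative).

    jobs:  list of (name, io_intensity) with io_intensity in [0, 1].
    slots: number of jobs that co-run in each group.
    Returns a list of groups; heavy jobs are round-robined first so no
    group accumulates several of them.
    """
    ordered = sorted(jobs, key=lambda j: j[1], reverse=True)
    n_groups = (len(jobs) + slots - 1) // slots
    groups = [[] for _ in range(n_groups)]
    for i, job in enumerate(ordered):
        groups[i % n_groups].append(job)   # deal the heaviest jobs apart
    return groups
```

With two I/O-heavy and two CPU-bound jobs in groups of two, each group gets exactly one I/O-heavy job, which is the co-run pattern an I/O-aware scheduler aims for.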

13.
Server-based networks have well-documented performance limitations. These limitations motivate a major goal of Intel's embedded transport acceleration (ETA) project: the ability to deliver high-performance server communication and I/O over standard Ethernet and transmission control protocol/Internet protocol (TCP/IP) networks. By developing this capability, Intel hopes to take advantage of the large knowledge base and ubiquity of these standard technologies. With the advent of 10 gigabit Ethernet, these standards promise to provide the bandwidth required by the most demanding server applications. We use the term packet processing engine (PPE) as a generic term for the computing and memory resources necessary for communication-centric processing. Such PPEs have certain desirable attributes, including scalability, extensibility, and programmability, and the ETA project focuses on developing PPEs with these attributes. General-purpose processors, such as the Intel Xeon in our prototype, are extensible and programmable by definition. Our results show that software partitioning can significantly increase the overall communication performance of a standard multiprocessor server. Specifically, partitioning the packet processing onto a dedicated set of compute resources allows for optimizations that are otherwise impossible when time sharing the same compute resources with the operating system and applications.

14.
When simulating large virtual environments (VEs), contention for limited resources such as the CPU, rendering pipeline, or network bandwidth frequently degrades the system's performance. Whenever such competition occurs and not all elements that require the resource can be serviced, an approximation must be made to avoid compromising interactive performance. We propose an enhancement to the round-robin approach called Priority Round Robin scheduling (PRR). This algorithm enforces priorities while retaining the output sensitivity and starvation-free behavior of round-robin scheduling. Priorities are set by a user-defined error metric (such as visual error), which the algorithm attempts to minimize. This permits not only scheduling the entities competing for a resource (such as updates competing for the network) but also bridging the gap between filtering and scheduling techniques. We evaluate our algorithm in a client-server system.
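The priority-with-starvation-freedom combination can be sketched compactly: within each round, every entity is eligible exactly once (so nothing starves), but entities are served in descending order of their current error metric, so if the round is cut short by a resource limit, the highest-error entities were handled first. This is our illustration of the PRR idea, with names and the budget parameter invented for the sketch:

```python
class PriorityRoundRobin:
    """Sketch of priority round-robin (PRR) scheduling.

    entities: dict mapping entity name -> current error metric (e.g.
    visual error). Each round serves every entity at most once, highest
    error first; a budget models the resource running out mid-round.
    """

    def __init__(self, entities):
        self.entities = entities

    def next_round(self, budget=None):
        # starvation-free: the full round covers every entity exactly once
        order = sorted(self.entities, key=self.entities.get, reverse=True)
        return order if budget is None else order[:budget]
```

Plain round-robin would serve entities in a fixed cyclic order regardless of error; PRR keeps the once-per-round guarantee but spends whatever budget exists on the entities that currently matter most.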

15.
As market demand for mass customization grows, the seru production system (SPS) has emerged and become a focus of research and application. This paper studies online parallel scheduling of serus with resource conflicts: serus, constructed on demand in response to dynamic orders, must be arranged over a limited number of positions, and their construction order and times must be decided so as to minimize total weighted completion time. Starting from the average delayed shortest weighted processing time (AD-SWPT) algorithm, whose competitive ratio is not constant, we introduce a tuning parameter to obtain an online parallel scheduling algorithm with a constant competitive ratio for the conflict-free case. We then add a conflict-handling mechanism to obtain αAD-I (α-average delayed shortest weighted processing time-improved), an online parallel scheduling algorithm for serus with resource conflicts; for special instances, an instance-reduction argument shows that its competitive ratio equals that of the conflict-free case. Finally, experiments demonstrate the algorithm's advantages on both special and general instances under fluctuating market conditions.
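The delayed-WSPT family of rules the abstract builds on can be illustrated in a few lines: a job released at r with processing time p only becomes eligible at r + α·p (the deliberate delay that makes the competitive ratio controllable), and eligible jobs are served in weighted-shortest-processing-time order. The exact eligibility rule here is our simplification, not the paper's αAD-I algorithm:

```python
def ad_swpt_order(jobs, now, alpha=1.0):
    """Illustrative delayed-WSPT ordering.

    jobs:  list of (name, release, proc_time, weight).
    alpha: tuning parameter; a job is eligible once
           now >= release + alpha * proc_time.
    Eligible jobs are ordered by weight / proc_time, descending.
    """
    eligible = [j for j in jobs if now >= j[1] + alpha * j[2]]
    return sorted(eligible, key=lambda j: j[3] / j[2], reverse=True)
```

Delaying a job in proportion to its own length prevents the online rule from committing a machine to a long job just before a short, heavily weighted one arrives.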

16.
Due to inefficient resource adjustment, current P2P file sharing systems cannot balance supply and demand for resources. As a result, the uploading bandwidth of system nodes cannot be utilized efficiently and overall system QoS is degraded. In this paper, an adaptive resource scheduling mechanism called the Push mechanism is proposed, in which "proactive" strategies handle the supply-demand imbalance of a resource. Specifically, the system first forecasts which resource will become insufficient, then pre-increases the number of uploaders for that resource so that system performance improves. Practical experiments on Tencent's download platform show that the proposed mechanism increases download rates, saves traffic on the server, and optimizes system performance.

18.
In Infrastructure-as-a-Service (IaaS) cloud computing, computational resources are provided to remote users in the form of leases. A cloud user can request multiple cloud services simultaneously, in which case parallel processing in the cloud system can improve performance. Applying parallel processing in cloud computing requires a mechanism to allocate resources and schedule the execution order of tasks; furthermore, a resource optimization mechanism with preemptable task execution can increase the utilization of clouds. In this paper, we propose two online dynamic resource allocation algorithms for the IaaS cloud system with preemptable tasks. Our algorithms adjust the resource allocation dynamically based on updated information about actual task executions. Experimental results show that our algorithms can significantly improve performance when resource contention is fierce.
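The preemption mechanic at the heart of such allocators can be sketched with a priority heap: when capacity is full, a newly arriving task evicts the lowest-priority running task if it outranks it, and the victim is handed back for re-queuing. This is an illustration of preemptable lease allocation in general, not the paper's two algorithms; all names are ours:

```python
import heapq

class PreemptiveAllocator:
    """Sketch of preemptable task admission under fixed capacity.

    running is a min-heap keyed on priority, so the lowest-priority
    running task is always at the top and is the preemption victim.
    submit() returns None on plain admission, the evicted task's name
    on preemption, or the new task's name if it must wait.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.running = []               # min-heap of (priority, name)

    def submit(self, name, priority):
        if len(self.running) < self.capacity:
            heapq.heappush(self.running, (priority, name))
            return None                 # admitted, nobody evicted
        if self.running[0][0] < priority:
            victim = heapq.heapreplace(self.running, (priority, name))
            return victim[1]            # preempted task, to be re-queued
        return name                     # no slot outranked: wait
```

Returning the victim to the caller keeps the allocator itself stateless about retries, which is convenient when the re-queuing policy (delay, migration, rejection) is decided elsewhere.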

19.
With the emergence and development of mobile edge computing (MEC) and wireless power transmission (WPT), more and more computation tasks are offloaded to MEC servers for processing, while WPT supplies power to terminal devices, alleviating their limited computing capability and high energy consumption. Since offloaded tasks and data often carry...

20.
One of the major issues that must be addressed in distributed memory multiprocessor (DMM) systems is the program task partitioning and scheduling problem, i.e., mapping an application program's precedence-related task threads onto the processing elements of a DMM system. The optimal task partitioning and scheduling problem, with the goal of minimizing program execution time and interprocessor communication overhead, is known to be NP-complete. The paper addresses the design, development and performance evaluation of a novel static task partitioning and scheduling method called linear clustering with task duplication (LCTD). LCTD employs the linear (sequential) execution of tasks and task duplication heuristics to minimize computation and interprocessor communication delays in DMMs. The superiority of the proposed LCTD algorithm is demonstrated through simulation studies and comparison against several existing static scheduling schemes, such as heavy node first (HNF) and linear clustering. We show that the proposed method obtains an average of 33% improvement in program execution time and 21% improvement in processor utilization compared to the linear clustering and HNF methods.
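The payoff of task duplication can be shown with the smallest possible case: a parent-child chain whose child runs on a second processor. Without duplication the child waits for the parent's data to arrive over the network; with duplication the parent is re-executed locally and the message is avoided. This tiny sketch conveys the LCTD intuition only, not the algorithm itself:

```python
def chain_makespan(p_parent, p_child, comm, duplicate):
    """Makespan of a two-task parent->child chain across two processors.

    p_parent, p_child: computation costs; comm: cost of shipping the
    parent's output to the child's processor. Duplicating the parent
    trades the communication delay for a local re-execution.
    """
    if duplicate:
        return p_parent + p_child          # parent recomputed locally
    return p_parent + comm + p_child       # child waits for the transfer
```

When the communication cost exceeds the parent's computation cost (as with fine-grained tasks on slow interconnects), duplication shortens the schedule; otherwise it wastes cycles, which is why LCTD applies it heuristically rather than everywhere.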


Copyright © Beijing Qinyun Technology Development Co., Ltd.    京ICP备09084417号-23
