Laboratoire de l'informatique du parallélisme Jarry, Aubin Casanova, Henri Berman, Francine
Abstract :
(eng) Scheduling the tasks of a distributed application has been an active field of research for several decades. The classic scheduling problem is to find a assignment of application tasks onto a set of distributed resources in a way that minimizes the overall application execution time. This problem can be formulated for different classes of applications and computing environments. It has been shown that most non-trivial instances of the scheduling problem are NP-complete. As a result, many research works propose different solutions including heuristic-based algorithms, guided random searches, and genetic algorithms. However, it is difficult to determine which solutions are practical for real scenarios. Indeed, most available results ignore many characteristic of large computing environments such as dynamically changing resource performance and availability, complex network topologies and network contention, data storage issues, or inaccuracy of resource performance estimates. In this paper, we present DAGSim, a simulator specifically designed to evaluate scheduling algorithms for application structured as Directed Acyclic dependency Graphs (DAGs). This simulator is built on top of Simgrid~\cite{simgrid}, a toolkit for the fast and accurate simulation of distributed applications with realistic assumptions concerning the computing environment. We describe the implementation of DAGSim and how it was used to implement two well-known scheduling algorithms (DCP~\cite{DCP} and DLS~\cite{DLS}). After giving preliminary results with these two algorithms, we give implementation recommendations for further improvement of Simgrid and DAGSim.