{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T07:12:17Z","timestamp":1761808337333,"version":"3.41.2"},"reference-count":40,"publisher":"Wiley","issue":"15","license":[{"start":{"date-parts":[[2020,1,9]],"date-time":"2020-01-09T00:00:00Z","timestamp":1578528000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Swiss Platform for Advanced Scientific Computing","award":["169123"],"award-info":[{"award-number":["169123"]}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2020,8,10]]},"abstract":"<jats:title>Summary<\/jats:title><jats:p>Many scientific applications consist of large and computationally intensive loops. Dynamic loop self\u2010scheduling (DLS) techniques are used to parallelize and to balance the load of such applications during execution. Load imbalance arises from variations in the loop iteration (or tasks) execution times, caused by problem, algorithmic, or systemic characteristics. Variations in systemic characteristics are referred to as perturbations. Our hypothesis is that <jats:italic>no single DLS technique can achieve the absolute best performance under various perturbations on heterogeneous high\u2010performance computing (HPC) systems<\/jats:italic>. Therefore, the selection of the most efficient DLS technique is critical to achieve the best application performance. The goal of this work is to solve the algorithm selection problem for the scheduling of computationally intensive applications under perturbations. Existing work only considers perturbations caused by variations in the delivered computational speed of the HPC systems. However, perturbations in available network bandwidth or latency are inevitable on production HPC systems. A simulation\u2010assisted scheduling algorithm selection (SimAS) approach is introduced herein as a novel control\u2010theoretic\u2010inspired approach to select DLS techniques dynamically that improve the performance of applications executing on heterogeneous HPC systems under perturbations. The present work examines the performance of seven applications on a heterogeneous HPC system under all the above system perturbations. SimAS is evaluated using native and simulative experiments. The performance results confirm the original hypothesis that motivates this work. The experimental evaluation shows that the SimAS\u2010based DLS selection identifies the most efficient technique and improves application performance in most cases.<\/jats:p>","DOI":"10.1002\/cpe.5648","type":"journal-article","created":{"date-parts":[[2020,1,9]],"date-time":"2020-01-09T09:59:43Z","timestamp":1578563983000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["SimAS: A simulation\u2010assisted approach for the scheduling algorithm selection under perturbations"],"prefix":"10.1002","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8465-0398","authenticated-orcid":false,"given":"Ali","family":"Mohammed","sequence":"first","affiliation":[{"name":"Department of Mathematics and Computer Science University of Basel  Basel Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2773-4499","authenticated-orcid":false,"given":"Florina M.","family":"Ciorba","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science University of Basel  Basel Switzerland"}]}],"member":"311","published-online":{"date-parts":[[2020,1,9]]},"reference":[{"key":"e_1_2_8_2_1","doi-asserted-by":"crossref","unstructured":"AshrafRA EngelmannC.Analyzing the impact of system reliability events on applications in the Titan supercomputer. Paper presented at: 2018 IEEE\/ACM 8th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS);2018;Dallas TX.","DOI":"10.1109\/FTXS.2018.00008"},{"volume-title":"The Landscape of Parallel Computing Research: A View From Berkeley","year":"2006","author":"Asanovic K","key":"e_1_2_8_3_1"},{"key":"e_1_2_8_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.1985.231547"},{"key":"e_1_2_8_5_1","first-page":"437","volume-title":"Scalable Computing and Communications: Theory and Practice","author":"Banicescu I","year":"2013"},{"key":"e_1_2_8_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.1987.5009495"},{"key":"e_1_2_8_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.205655"},{"key":"e_1_2_8_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/135226.135232"},{"key":"e_1_2_8_9_1","doi-asserted-by":"crossref","unstructured":"HummelSF SchmidtJ UmaRN WeinJ.Load\u2010sharing in heterogeneous systems via weighted factoring. In: Proceedings of the Annual ACM Symposium on Parallel Algorithms and Architectures;1996;Padua Italy.","DOI":"10.1145\/237502.237576"},{"key":"e_1_2_8_10_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1023588520138"},{"key":"e_1_2_8_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-007-0148-y"},{"key":"e_1_2_8_12_1","unstructured":"BanicescuI LiuZ.Adaptive factoring: a dynamic scheduling method tuned to the rate of weight changes. In: Proceedings of the High Performance Computing Symposium;2000;Washington DC."},{"key":"e_1_2_8_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0065-2458(08)60520-3"},{"key":"e_1_2_8_14_1","doi-asserted-by":"crossref","unstructured":"SukhijaN MaloneB SrivastavaS BanicescuI CiorbaFM.Portfolio\u2010based selection of robust dynamic loop scheduling algorithms using machine learning. In: Proceedings of the 28th IEEE International Parallel and Distributed Processing Symposium Workshops;2014;Phoenix AZ.","DOI":"10.1109\/IPDPSW.2014.183"},{"key":"e_1_2_8_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2014.06.008"},{"key":"e_1_2_8_16_1","doi-asserted-by":"crossref","unstructured":"BoulmierA BanicescuI CiorbaFM AbdennadherN.An autonomic approach for the selection of robust dynamic loop scheduling techniques. In: Proceedings of the 16th International Symposium on Parallel and Distributed Computing;2017;Innsbruck Austria.","DOI":"10.1109\/ISPDC.2017.9"},{"key":"e_1_2_8_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2009.84"},{"key":"e_1_2_8_18_1","unstructured":"MohammedA CiorbaFM.A study of the performance of scientific applications with dynamic loop scheduling under perturbations. Paper presented at: Poster at 2018 Platform for Advanced Scientific Computing Conference (PASC);2018;Basel Switzerland."},{"key":"e_1_2_8_19_1","doi-asserted-by":"crossref","unstructured":"EleliemyA MohammedA CiorbaFM.Efficient generation of parallel spin\u2010images using dynamic loop scheduling. In: Proceedings of the 19th IEEE International Conference on High Performance Computing and Communications Workshops;2017;Bangkok Thailand.","DOI":"10.1109\/HPCCWS.2017.00012"},{"key":"e_1_2_8_20_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1749-6632.1980.tb29690.x"},{"volume-title":"Euro\u2010Par 2018: Parallel Processing Workshops: Euro\u2010Par 2018 International Workshops, Turin, Italy, August 27\u201028, 2018, Revised Selected Papers","year":"2018","author":"Mohammed A","key":"e_1_2_8_21_1"},{"issue":"3","key":"e_1_2_8_22_1","first-page":"249","article-title":"A tool for a two\u2010level dynamic load balancing strategy in scientific applications","volume":"8","author":"Cari\u00f1o RL","year":"2007","journal-title":"Scalable Comput Pract Exp"},{"volume-title":"Proceedings of the 1986 International Conference on Parallel Processing: August 19\u201022, 1986","year":"1986","author":"Tang P","key":"e_1_2_8_23_1"},{"key":"e_1_2_8_24_1","doi-asserted-by":"crossref","unstructured":"VelhoP LegrandA.Accuracy study and improvement of network simulation in the SimGrid framework. In: Proceedings of the 2nd International Conference on Simulation Tools and Techniques;2009;Rome Italy.","DOI":"10.4108\/ICST.SIMUTOOLS2009.5592"},{"key":"e_1_2_8_25_1","doi-asserted-by":"crossref","unstructured":"MohammedA EleliemyA CiorbaFM.Performance reproduction and prediction of selected dynamic loop scheduling experiments. In: Proceedings of the 2018 International Conference on High Performance Computing and Simulation;2018;Orl\u00e9ans France.","DOI":"10.1109\/HPCS.2018.00071"},{"key":"e_1_2_8_26_1","doi-asserted-by":"crossref","unstructured":"MohammedA EleliemyA CiorbaFM KasielkeF BanicescuI.Experimental verification and analysis of dynamic loop scheduling in scientific applications. In: Proceedings of the 17th International Symposium on Parallel and Distributed Computing;2018;Geneva Switzerland.","DOI":"10.1109\/ISPDC2018.2018.00028"},{"key":"e_1_2_8_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2004.24"},{"key":"e_1_2_8_28_1","unstructured":"YangY CasanovaH.RUMR: Robust scheduling for divisible workloads. In: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing;2003;Seattle WA."},{"key":"e_1_2_8_29_1","doi-asserted-by":"crossref","unstructured":"SukhijaN BanicescuI SrivastavaS CiorbaFM.Evaluating the flexibility of dynamic loop scheduling on heterogeneous systems in the presence of fluctuating load using SimGrid. In: Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum;2013;Cambridge MA.","DOI":"10.1109\/IPDPSW.2013.132"},{"key":"e_1_2_8_30_1","unstructured":"ZhangY VossM RogersES.Runtime empirical selection of loop schedulers on hyperthreaded SMPs. In: Proceedings of the 19th International Parallel and Distributed Processing Symposium;2005;Denver CO."},{"key":"e_1_2_8_31_1","doi-asserted-by":"crossref","unstructured":"MenonH ChandrasekarK KaleLV.POSTER: Automated load balancer selection based on application characteristics. In: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming;2017;Austin TX.","DOI":"10.1145\/3018743.3019033"},{"key":"e_1_2_8_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/37.845037"},{"key":"e_1_2_8_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2019.10.007"},{"key":"e_1_2_8_34_1","first-page":"10","volume-title":"The importance and need for system monitoring and analysis in HPC operations and research","author":"Ciorba FM","year":"2017"},{"key":"e_1_2_8_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4939-2092-1_5"},{"volume-title":"Design and Analysis of Experiments","year":"2017","author":"Montgomery DC","key":"e_1_2_8_36_1"},{"key":"e_1_2_8_37_1","unstructured":"MohammedA CiorbaFM.SimAS: A simulation\u2010assisted approach for the scheduling algorithm selection under perturbations.2019.http:\/\/arxiv.org\/abs\/1912.02050"},{"key":"e_1_2_8_38_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3933"},{"key":"e_1_2_8_39_1","doi-asserted-by":"publisher","DOI":"10.1177\/109434200001400303"},{"key":"e_1_2_8_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2016.2599513"},{"key":"e_1_2_8_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-34356-9_26"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.5648","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.5648","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/cpe.5648","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.5648","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T09:22:41Z","timestamp":1693992161000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.5648"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,9]]},"references-count":40,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2020,8,10]]}},"alternative-id":["10.1002\/cpe.5648"],"URL":"https:\/\/doi.org\/10.1002\/cpe.5648","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2020,1,9]]},"assertion":[{"value":"2018-11-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-12-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-01-09","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e5648"}}