CMS workflow execution using intelligent job scheduling and data access strategies

Metson, Simon; Hernandez, José M.; Hasham, Khawar; Delgado Peris, Antonio; Anjum, Ashiq; Evans, Dave; Gowdy, Stephen; Huedo, Eduardo; Hufnagel, Dirk; Van Lingen, Frank; McClatchey, Richard

Authors

Simon Metson

José M. Hernandez

Khawar Hasham

Antonio Delgado Peris

Ashiq Anjum

Dave Evans

Stephen Gowdy

Eduardo Huedo

Dirk Hufnagel

Frank Van Lingen

Richard McClatchey



Abstract

Complex scientific workflows can process large amounts of data using thousands of tasks. The turnaround times of these workflows are often affected by various latencies such as the resource discovery, scheduling and data access latencies for the individual workflow processes or actors. Minimizing these latencies will improve the overall execution time of a workflow and thus lead to a more efficient and robust processing environment. In this paper, we propose a pilot job concept that has intelligent data reuse and job execution strategies to minimize the scheduling, queuing, execution and data access latencies. The results have shown that significant improvements in the overall turnaround time of a workflow can be achieved with this approach. The proposed approach has been evaluated, first using the CMS Tier0 data processing workflow, and then simulating the workflows to evaluate its effectiveness in a controlled environment. © 2011 IEEE.

Citation

Metson, S., Hernandez, J. M., Hasham, K., Delgado Peris, A., Anjum, A., Evans, D., …McClatchey, R. (2011). CMS workflow execution using intelligent job scheduling and data access strategies. IEEE Transactions on Nuclear Science, 58(3 PART 3), 1221-1232. https://doi.org/10.1109/TNS.2011.2146276

Journal Article Type: Article
Publication Date: Jun 1, 2011
Journal: IEEE Transactions on Nuclear Science
Print ISSN: 0018-9499
Publisher: Institute of Electrical and Electronics Engineers
Peer Reviewed: Yes
Volume: 58
Issue: 3 PART 3
Pages: 1221-1232
DOI: https://doi.org/10.1109/TNS.2011.2146276
Keywords: data cache, grid, latency, pilot jobs, workflow
Public URL: https://uwe-repository.worktribe.com/output/961950
Publisher URL: http://dx.doi.org/10.1109/TNS.2011.2146276
Additional Information: © 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.