About Crawling Scheduling Problems.

Date
2017-09
Authors
Pechnikov, A.A
Chernobrovkin, D.I
Nwohiri, A.M
Publisher
CEUR Workshop Proceedings
Abstract
This paper investigates the task of scheduling jobs across several servers in a software system similar to an Enterprise Desktop Grid. A distinguishing feature of this system is its specific area of application: it collects outbound hyperlinks for a given set of websites. The target set is scanned continuously, at regular time intervals. Data obtained from previous scans are used to construct the next scanning task in order to improve efficiency (here, to shorten the scanning time). A mathematical model for minimizing the scanning time in batch mode is constructed, and an approximate algorithm for solving the model is proposed. A series of experiments is carried out in a real software system. The experimental results make it possible to compare the proposed batch mode with the well-known round-robin mode, revealing the advantages and disadvantages of the batch mode.
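The contrast between the two scheduling modes named in the abstract can be illustrated with a minimal sketch. It is not the paper's model or its approximate algorithm: the batch side uses the standard longest-processing-time-first (LPT) heuristic as a stand-in, and the function names, the per-site scan times, and the server count are all hypothetical.

```python
import heapq


def round_robin_makespan(scan_times, n_servers):
    """Assign sites to servers in arrival order, cycling through the servers."""
    loads = [0] * n_servers
    for i, t in enumerate(scan_times):
        loads[i % n_servers] += t
    return max(loads)


def batch_lpt_makespan(scan_times, n_servers):
    """Assumed batch heuristic (LPT), not the paper's algorithm:
    sort sites by their previously observed scan time, longest first,
    and give each one to the currently least-loaded server."""
    loads = [0] * n_servers  # a list of equal values is already a valid min-heap
    for t in sorted(scan_times, reverse=True):
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + t)
    return max(loads)


# Hypothetical scan times (minutes) recorded during a previous scan.
previous_scan_minutes = [90, 10, 80, 10, 10, 10]

print(round_robin_makespan(previous_scan_minutes, 2))  # 180
print(batch_lpt_makespan(previous_scan_minutes, 2))    # 110
```

With these made-up numbers, round robin happens to place both long sites on the same server, while the batch assignment uses the earlier measurements to balance the load. The sketch also hints at one limitation of any batch approach: it relies on previous scan times being a good predictor of the next scan.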
Description
Conference Papers
Keywords
Enterprise Desktop Grid, Web Crawling, Round-Robin Scheduling, Batch Job Scheduling, Research Subject Categories::TECHNOLOGY::Information technology::Computer science
Citation
Pechnikov, A.A., Chernobrovkin, D.I., and Nwohiri, A.M. (2017). About Crawling Scheduling Problems. Proceedings of the Third International Conference BOINC:FAST, Russia, pp. 49-55.