Abstract: Several approaches address the low efficiency with which modern supercomputers are used. One of them is based on constant monitoring of the supercomputer job flow in order to promptly detect inefficient programs. The execution dynamics of such programs usually differ from the “normal” behavior of common programs; however, it is very difficult to establish exact criteria for determining abnormal behavior. Machine learning methods are therefore used in this study to detect abnormal jobs. This paper deals with an important aspect of working with machine learning methods, namely data preparation. The proposed solution was evaluated on the Lomonosov-2 supercomputer.
Selecting optimal input data is one of the key steps in transferring the methods suggested in the paper to other supercomputers. The analysis described in the article served as a starting point for developing a methodology for applying the overall solution to other supercomputers, which is also described in this paper.