WO2024155206A1 - Method and system for detecting anomalous interaction between nodes in a computer network

Info

Publication number: WO2024155206A1
Application number: PCT/RU2023/000390
Authority: WO
Inventors: Кирилл Евгеньевич ВЫШЕГОРОДЦЕВ; Александр Михайлович Кузьмин; Иван Григорьевич НАГОРНОВ; Дмитрий Владимирович СМИРНОВ
Original assignee: Публичное Акционерное Общество "Сбербанк России"
Priority date: 2023-01-16
Filing date: 2023-12-22
Publication date: 2024-07-25

Abstract

The invention relates to monitoring the operation of computer network nodes. The present method includes the steps of: a) obtaining data about the exchange of messages between computer network nodes; b) generating a graph on the basis of the obtained data; c) determining, with the aid of said graph, the values of the shortest paths between all of the computer network nodes; d) generating a vector representation for each computer network node on the basis of the shortest path values; e) reducing the dimension of the vector representations; f) clustering the computer network nodes on the basis of the vector representations obtained in step e); g) generating a first model of interaction between the computer network nodes, in which the nodes are grouped according to the clustering performed; h) generating a second model of interaction between the computer network nodes by iteratively repeating steps a) - g); i) comparing the models of interaction between the computer network nodes and identifying, in the process, the computer network nodes in the different groups; and j) generating a signal representing the identifiers of the computer network nodes found during the comparison performed in step i) to exhibit anomalous interaction. This increases the efficiency and speed of detecting anomalous interaction between computer network nodes.

Description

METHOD AND SYSTEM FOR DETECTING ANOMALOUS INTERACTION BETWEEN NODES OF AN INFORMATION AND COMPUTING NETWORK

TECHNICAL FIELD

[0001] The claimed solution relates to the field of computer technology, in particular to solutions for monitoring the operation of the nodes of an information and computing network (ICN) and detecting anomalous interaction between the nodes.

BACKGROUND ART

[0002] A method for analyzing ICN nodes to assess possible anomalies in the operation of an ICN is known from the prior art (US 20200021607 A1, 16.01.2020). That solution discloses a security platform for detecting anomalies and threats in a computer network environment. The security platform is based on big data and uses machine learning to perform security analytics. It performs user and entity behavior analytics (UEBA) to detect security-related anomalies and threats, regardless of whether such anomalies or threats were previously known. The paths and modes of real-time anomaly and threat detection are analyzed, after which the network security risks of the nodes are assessed.

[0003] The drawback of the known approach is its limited effectiveness in detecting anomalous operation of ICN nodes: it does not analyze how the nodes interact with one another and does not use that information to build behavior models of the ICN nodes.

SUMMARY OF THE INVENTION

[0004] The claimed invention addresses the technical problem of providing a more effective approach to analyzing anomalous interaction between nodes in an ICN.

[0005] The technical result is an increase in the efficiency and speed of detecting anomalous interaction between ICN nodes, achieved by continuously monitoring the number of messages exchanged between nodes within a given cluster.

[0006] The claimed technical result is achieved by a method for detecting anomalous interaction of nodes of an information and computing network (ICN), comprising the steps of:
a) obtaining data on the exchange of messages between the ICN nodes, the data containing at least information about the number of messages transmitted between said nodes;
b) generating a graph from the obtained data, in which the vertices are the identifiers of the ICN nodes and the edges represent the fact of message exchange between nodes, each edge having a weight that characterizes the intensity of message exchange between the corresponding nodes;
c) determining, using the generated graph, the values of the shortest paths between all ICN nodes;
d) generating a vector representation for each ICN node from the obtained shortest path values;
e) reducing the dimension of the obtained vector representations;
f) clustering the ICN nodes on the basis of the vector representations obtained in step e);
g) generating a first model of interaction of the ICN nodes, in which the nodes are assigned to groups in accordance with the clustering performed;
h) generating a second model of interaction of the ICN nodes by iteratively repeating steps a) - g);
i) comparing the first and second models of interaction of the ICN nodes and identifying the ICN nodes that are located in different groups; and
j) generating a signal characterizing the identifiers of the ICN nodes that were found, during the comparison performed in step i), to exhibit anomalous interaction.

[0007] In one particular embodiment of the method, the weight of an edge is the reciprocal of the number of messages exchanged between the ICN nodes per unit of time.

[0008] In another particular embodiment of the method, step e) is performed using a machine learning algorithm.

[0009] In another particular embodiment of the method, the vector representation of a node characterizes the accumulated historical information about the interaction of that node.

[0010] The claimed result is also achieved by a system for detecting anomalous interaction of ICN nodes, the system comprising at least one processor and at least one memory that stores machine-readable instructions which, when executed by the processor, carry out the above method.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 illustrates a flowchart of the claimed method.

[0012] FIG. 2 illustrates an example of a graph generated from information about ICN nodes.

[0013] FIG. 3 illustrates a diagram of the computing system.

IMPLEMENTATION OF THE INVENTION

[0014] FIG. 1 shows a flowchart of the claimed method (100) for detecting anomalous interaction of ICN nodes. At the first step (101), data about the ICN nodes is collected; such data may include, in particular, device identifiers within the network, IP addresses, MAC addresses, and the like. Data may be collected by automated solutions, for example data collector modules, or by software components or applications that monitor the ICN.

[0015] Next, at step (102), a graph (200) representing the node interaction model is generated from the collected data about the ICN nodes. FIG. 2 shows an example of the generated graph (200) for nodes (201)-(206). Node identifiers form the vertices of the graph, while edges represent the fact that messages are exchanged between nodes. Each edge has a weight that depends on the intensity of message exchange between nodes (201)-(206) per unit of time. In the example of FIG. 2, the number of messages transmitted between nodes (201)-(206) is as follows: 201-202: 4; 201-203: 5; 201-204: 2; 202-201: 8; 202-206: 14; 203-201: 3; 203-205: 2; 204-201: 5; 204-206: 3; 204-205: 7; 205-204: 12; 205-203: 4; 205-206: 8; 206-202: 6; 206-204: 1; 206-205: 5.

[0016] The intensity characteristic can be expressed as the reciprocal of the number of messages sent from one node to another. If there are no messages between two nodes, an edge between them can still be added to the graph (200). The number of messages between such nodes can be set to 1/(sum(N)), where N is the number of messages transmitted between nodes. The number of messages between such nodes is then 1 / (4+5+2+8+14+...) = 1/89 = 0.01123.
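The weighting described in paragraphs [0015] and [0016] can be sketched as follows. This is a minimal illustration, not the claimed implementation: the names `MESSAGE_COUNTS` and `build_edge_weights` are hypothetical, the edges are treated as directed, and self-loops are assumed not to be added.

```python
from itertools import permutations

# Message counts from the FIG. 2 example: (sender, receiver) -> messages per unit of time.
MESSAGE_COUNTS = {
    (201, 202): 4, (201, 203): 5, (201, 204): 2,
    (202, 201): 8, (202, 206): 14,
    (203, 201): 3, (203, 205): 2,
    (204, 201): 5, (204, 206): 3, (204, 205): 7,
    (205, 204): 12, (205, 203): 4, (205, 206): 8,
    (206, 202): 6, (206, 204): 1, (206, 205): 5,
}


def build_edge_weights(counts):
    """Return {(src, dst): weight} with weight = 1 / message count.

    Ordered pairs with no observed messages get the fallback count
    1 / sum(N), so their weight equals sum(N), i.e. a very "long" edge.
    """
    nodes = sorted({node for pair in counts for node in pair})
    total = sum(counts.values())              # sum(N) = 89 for the example
    fallback_count = 1 / total
    weights = {}
    for src, dst in permutations(nodes, 2):   # assumption: no self-loops
        count = counts.get((src, dst), fallback_count)
        weights[(src, dst)] = 1 / count
    return weights


weights = build_edge_weights(MESSAGE_COUNTS)
print(weights[(201, 202)])   # observed edge with 4 messages
print(weights[(201, 205)])   # no messages observed, fallback weight
```

With this convention, frequently communicating nodes are "close" in the graph, while pairs that never exchange messages remain connected but far apart.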

[0017] From the obtained data, a matrix representation of the connected graph is formed, as presented in Table 1. The graph (200) described above can be represented as a matrix derived from the adjacency matrix. In this representation, each position (i, j) corresponds to the connection between ICN nodes i and j. The values in the matrix are the distances d_ij, which characterize the length of the path in the graph from the i-th node to the j-th node.

Table 1. Matrix representation of the connected graph

[0018] At step (103), the shortest paths are determined from the generated graph and its matrix representation. The shortest paths can be found with any algorithm for finding the shortest distances between the vertices of a graph, for example Bellman-Ford, Floyd-Warshall, Dijkstra, Johnson, and the like. An example of the shortest path values for some nodes is presented in Table 2.

Table 2. Shortest path values between ICN nodes

[0019] Next, at step (104), a vector representation is generated for each node (201)-(206) from the obtained shortest path values; it may have the following form:

- For node (201), vector = (3/8; 1/4; 1/5; 1/2; 9/14; 9/28);

- For node (202), vector = (1/8; 5/11; 13/40; 15/14; 1/35; 1/14).
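Steps (103) and (104) can be sketched as follows. This is a minimal illustration under stated assumptions: the Floyd-Warshall algorithm is chosen from the options named above, edge weights are taken as 1/count for the FIG. 2 message counts, the graph is treated as directed, and the diagonal entry is assumed to be the shortest round trip back to the node. The name `shortest_path_matrix` is hypothetical. Exact fractions are used so that the results can be compared by eye with the vectors above.

```python
from fractions import Fraction
from math import inf

# Message counts from the FIG. 2 example: (sender, receiver) -> count.
MESSAGE_COUNTS = {
    (201, 202): 4, (201, 203): 5, (201, 204): 2,
    (202, 201): 8, (202, 206): 14,
    (203, 201): 3, (203, 205): 2,
    (204, 201): 5, (204, 206): 3, (204, 205): 7,
    (205, 204): 12, (205, 203): 4, (205, 206): 8,
    (206, 202): 6, (206, 204): 1, (206, 205): 5,
}


def shortest_path_matrix(counts):
    """Floyd-Warshall over directed edges weighted by 1 / message count."""
    nodes = sorted({node for pair in counts for node in pair})
    # Every entry starts at infinity, including the diagonal, so that
    # dist[i][i] ends up as the shortest round trip via other nodes.
    dist = {i: {j: inf for j in nodes} for i in nodes}
    for (src, dst), count in counts.items():
        dist[src][dst] = Fraction(1, count)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return nodes, dist


nodes, dist = shortest_path_matrix(MESSAGE_COUNTS)
# The vector representation of a node is its row of shortest path values.
vector_201 = [dist[201][j] for j in nodes]
print([str(value) for value in vector_201])
```

In this sketch the entries of the node (201) vector for nodes (201)-(204) and (206) come out as 3/8, 1/4, 1/5, 1/2 and 9/28. The entry for node (205) comes out as 73/140, because the three-hop route via (202) and (206) is shorter than the two-hop route via (204) of length 9/14.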

[0020] Since the width of the matrix is not fixed, a space of lower dimension is formed at step (105) in order to obtain a vector representation of node connections with a fixed size. Methods that can be used for this include: matrix decomposition, SVD (singular value decomposition), PCA (principal component analysis), IncrementalPCA, KernelPCA, SparsePCA (sparse principal component analysis), MiniBatchSparsePCA, ICA (independent component analysis), NMF or NNMF (non-negative matrix factorization), LDA (latent Dirichlet allocation), FactorAnalysis (factor analysis), K-means quantization for dimensions, SOM for dimensions (Kohonen self-organizing map), LVQ (learning vector quantization), t-SNE (t-distributed stochastic neighbor embedding), UMAP (uniform manifold approximation and projection), and autoencoders. Each component of such a vector characterizes the correspondence to some latent connectivity. Latent connectivity means a hidden, implicit connection of a node with other nodes across several parameters taken together. In this case, each vector component characterizes the connection of a node not with one selected node but with a group of nodes. Such groups are formed on the basis of node similarity (for example, the functional purpose of the node, control servers, and the like).
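As an illustration of step (105), the sketch below reduces six-dimensional node vectors to two dimensions with PCA computed through an SVD, which is only one of the methods listed above. The function name `reduce_dimension` is hypothetical, and the input matrix is filled with placeholder values rather than the values of Table 2, so the output will not match the illustrative numbers given in this description.

```python
import numpy as np


def reduce_dimension(vectors, n_components=2):
    """Project row vectors onto their first principal components (PCA via SVD)."""
    data = np.asarray(vectors, dtype=float)
    centered = data - data.mean(axis=0)        # PCA operates on centered data
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Hypothetical 6 x 6 shortest-path matrix, one row per node (placeholder values).
rng = np.random.default_rng(0)
path_matrix = rng.uniform(0.05, 1.5, size=(6, 6))

embedding = reduce_dimension(path_matrix, n_components=2)
print(embedding.shape)   # (6, 2)
```

Because the output width is fixed by `n_components`, the same downstream clustering step can be used regardless of how many nodes the network contains.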

[0021] The reduced-dimension vector representation for nodes (201)-(206) may have the following form:

201 (15/12 7/8)

202 (7/5 43/17)

203 (13/12 9/8)

204 (31/11 23/12)

205 (7/3 5/2)

206 (3/2 3/2)

[0022] The reduced-dimension vector representations obtained at step (105) are then used to cluster the ICN nodes at step (106). Clustering is performed automatically and takes into account the interaction of the nodes within a given time interval. Self-tuning (unsupervised) algorithms that do not require the number of clusters to be specified in advance can be used for this task, taking into account whether the data follows a normal or some other distribution.

[0023] A Gaussian mixture model may be used. It assumes that the data is divided into clusters such that each data point in a given cluster follows a particular multivariate Gaussian distribution, and that the multivariate Gaussian distributions of the clusters are independent of one another. To cluster data with such a model, the posterior probability that a data point belongs to a given cluster, given the observed data, must be calculated. One method for this purpose is Bayesian inference. Since only the most probable cluster for a given point needs to be found, approximation methods can be used, because they reduce the computational work. One of the best approximate methods is variational Bayesian inference. The variational Bayesian Gaussian mixture is a variant of expectation maximization that maximizes a lower bound on the model evidence (including the priors) instead of the data likelihood.

[0024] The principle behind variational methods is the same as that of expectation maximization: both are iterative algorithms that alternate between finding the probability that each point was generated by each mixture component and fitting the mixture to the points thus assigned. Variational methods, however, add regularization by integrating information from prior distributions. This avoids the singularities often found in expectation maximization solutions, but introduces some subtle biases into the model. Inference is often noticeably slower, but usually not so much as to make its use impractical.

[0025] Because of its Bayesian nature, the variational algorithm requires more hyperparameters than expectation maximization, the most important of which is the concentration parameter. Setting a low concentration value causes the model to place most of the weight on a few ICN nodes, while the weights of the remaining nodes in the cluster will be very close to zero. High concentration values allow more nodes to be in the cluster.

[0026] The Dirichlet distribution is used to model the concentration. The Dirichlet process is a prior probability distribution over clusterings with an infinite, unbounded number of partitions. Variational methods make it possible to incorporate this prior structure into Gaussian mixture models with almost no increase in computation time compared with a finite Gaussian mixture model. A higher concentration value causes denser clusters to form.
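A clustering step of this kind can be sketched with scikit-learn's `BayesianGaussianMixture`, which implements a variational Gaussian mixture with a Dirichlet process prior. The choice of this library is an assumption, not something stated in the source; the input points are the illustrative reduced vectors from paragraph [0021], and the hyperparameter values (`n_components=4`, `weight_concentration_prior=0.1`, diagonal covariances) are arbitrary picks for the sketch. The resulting partition depends on these settings and is not expected to reproduce the example clusters in this description.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Reduced vectors for nodes (201)-(206) as listed in paragraph [0021].
node_ids = [201, 202, 203, 204, 205, 206]
points = np.array([
    [15 / 12, 7 / 8],
    [7 / 5, 43 / 17],
    [13 / 12, 9 / 8],
    [31 / 11, 23 / 12],
    [7 / 3, 5 / 2],
    [3 / 2, 3 / 2],
])

model = BayesianGaussianMixture(
    n_components=4,                      # upper bound, not the final cluster count
    covariance_type="diag",
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.1,      # concentration parameter (assumed value)
    max_iter=500,
    random_state=0,
)
labels = model.fit_predict(points)

# Group node identifiers by the cluster label assigned to them.
clusters = {}
for node, label in zip(node_ids, labels):
    clusters.setdefault(int(label), []).append(node)
print(clusters)
```

Only an upper bound on the number of components is supplied; components that the data does not support end up with weights close to zero, which is how the number of clusters is left for the model to determine.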

[0027] The Dirichlet process can use an infinite, unbounded number of clusters. The algorithm is based on an iterative process of sampling objects and follows the stick-breaking approach. It starts with the full sample, and at each step a part of it is broken off. Each time, the nodes from the sample that fall into the cluster are associated with it. At the end, the nodes that did not fall into any of the other groups are associated with one another. In some problems this is a drawback, because all unrelated points (graph nodes) are combined into one common "garbage" cluster. However, this cluster can be interpreted as a "cluster of dissimilar nodes", in which the nodes are grouped not because their connection patterns are similar to one another, but because they share the property of being dissimilar to the nodes of the other clusters.
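The stick-breaking construction mentioned above can be illustrated with a short sketch. It assumes the standard formulation in which each step breaks off a Beta(1, concentration) fraction of the remaining stick; the function name `stick_breaking_weights` and the truncation to a fixed number of components are choices made for the illustration.

```python
import random


def stick_breaking_weights(concentration, n_components, seed=0):
    """Draw truncated stick-breaking weights for a Dirichlet process.

    Each step breaks off a Beta(1, concentration) fraction of the stick
    that is still left; the broken-off piece is that component's weight.
    """
    rng = random.Random(seed)
    remaining = 1.0
    weights = []
    for _ in range(n_components):
        fraction = rng.betavariate(1.0, concentration)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    # `remaining` is the mass left over for all further components.
    return weights, remaining


for alpha in (0.5, 5.0):
    w, rest = stick_breaking_weights(alpha, n_components=6)
    print(alpha, [round(x, 3) for x in w], round(rest, 3))
```

The leftover mass returned as `remaining` corresponds to the part of the stick that has not yet been assigned to any component.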

[0028] An example partition of nodes (201)-(206) into clusters may look as follows:

Cluster I: (201, 203)

Cluster II: (202)

Cluster III: (205, 206)

Cluster IV: (204).

[0029] At step (107), the completed clustering is used to create the first model of interaction of the ICN nodes, which reflects how the nodes are assigned to groups according to their interaction with one another.

[0030] Next, at step (108), the algorithm iteratively repeats steps (101)-(107) after a given time interval (for example, a day, a week, and the like). As a result, a second model of interaction of the ICN nodes is generated, which may have the following distribution of clusters:

Cluster I: (206, 205)

Cluster II: (201, 203, 202)

Cluster III: (204).

[0031] At step (109), the second and first models of interaction of the ICN nodes are compared in order to identify nodes that have changed their placement in the second model, which may indicate anomalous interaction with other ICN nodes. In the example above, node (202) has moved into the cluster containing nodes (201) and (203).

[0032] Such behavior can be illustrated by the following case. Let nodes (201) and (203) be servers running the Windows operating system, nodes (205) and (206) be servers running the Ubuntu operating system, and nodes (202) and (204) be client computers. In the first model, the Windows nodes fall into one cluster and the Ubuntu nodes into another. The client computers do not belong to any cluster (in another example, client computers may be combined into one or more common clusters). When the next model is built, the client computer (202) falls into a cluster with server nodes. This may indicate that the client computer is infected with a virus and has begun actively sending broadcast or other messages, which is typical of servers. The claimed method makes it possible to respond promptly to an anomaly of this kind and to generate a signal about the appearance of an anomaly in the ICN for node (202).
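The comparison at step (109) can be sketched as follows, using the two example cluster distributions given above. Cluster numbering is not stable between runs (the Ubuntu servers are Cluster III in the first model and Cluster I in the second), so the sketch first matches each cluster of the second model to the first-model cluster it overlaps most. This matching rule and the name `find_relocated_nodes` are assumptions made for the illustration; the source does not specify how the models are aligned.

```python
def find_relocated_nodes(first_model, second_model):
    """Return nodes whose cluster in the second model does not correspond
    to their cluster in the first model.

    Each second-model cluster is matched to the first-model cluster with
    the largest overlap (hypothetical rule); members outside that matched
    cluster are reported as relocated.
    """
    old_clusters = [set(cluster) for cluster in first_model]
    relocated = []
    for cluster in second_model:
        members = set(cluster)
        best_match = max(old_clusters, key=lambda old: len(old & members))
        relocated.extend(members - best_match)
    return sorted(relocated)


first_model = [(201, 203), (202,), (205, 206), (204,)]
second_model = [(206, 205), (201, 203, 202), (204,)]
print(find_relocated_nodes(first_model, second_model))   # [202]
```

Matching by overlap reports only the node that joined an existing group, rather than every member of the group it joined.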

[0033] Once step (109) is complete, at step (110) the monitoring system generates a signal reporting the anomalous interaction for a specific ICN node. The signal is typically transmitted to the device of the responsible cybersecurity officer. In addition, the identified node may be isolated by disconnecting it from the ICN.

[0034] The proposed approach makes it possible to continuously build a retrospective "portrait" of the normal operation of IVS nodes. This portrait is an accumulation (integration) of many previous states of a host and reflects how the host has changed over time. The characteristic is configurable: it can give more weight to the most recent states or, conversely, be more conservative. The integrated historical portrait makes it possible to compare the current, newly obtained state of a host with its previous states and to identify abrupt (anomalous) changes in its operation. Detecting anomalous operation in this way makes it possible to quickly identify uncharacteristic changes in a host and to conduct the appropriate investigations.
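One way to picture such a configurable portrait is an exponentially weighted average of a host's state vectors. The text does not specify the accumulation formula, so the sketch below is an assumption: the names `update_portrait`, `deviation` and `alpha`, the use of Euclidean distance, and the sample values are all hypothetical.

```python
import math


def update_portrait(portrait, state, alpha):
    """Blend a new state vector into the accumulated portrait.

    alpha close to 1 emphasizes recent states; alpha close to 0 is
    more conservative. The weighting scheme is an assumption.
    """
    return [(1 - alpha) * p + alpha * s for p, s in zip(portrait, state)]


def deviation(portrait, state):
    """Euclidean distance between the portrait and a new state."""
    return math.dist(portrait, state)


# Hypothetical history of a host that behaves identically each period.
history = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
portrait = history[0]
for state in history[1:]:
    portrait = update_portrait(portrait, state, alpha=0.25)

new_state = [4.0, 6.0]  # an abrupt change in behavior
print(deviation(portrait, new_state))
```

A large deviation relative to earlier periods would mark the new state as a candidate anomaly; how large is "large" is left open here, as it is in the text.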

[0035] FIG. 3 shows a general view of a computing system implemented on a computing device (300) that carries out the claimed method (100). In general, the computing device (300) contains one or more processors (301), memory such as RAM (302) and ROM (303), input/output interfaces (304), input/output devices (305), and a networking device (306), all connected by a common data exchange bus.

[0036] The processor (301) (or multiple processors, or a multi-core processor) may be selected from the range of devices in wide use today, such as those from Intel™, AMD™, Apple™, Samsung Exynos™, MediaTek™, Qualcomm Snapdragon™, and the like. The term processor should also be understood to include a graphics processor, for example an NVIDIA or ATI GPU, which is likewise suitable for carrying out the method (100) in whole or in part. In that case, the available memory of the graphics card or graphics processor may serve as the memory.

[0037] The RAM (302) is random access memory and is intended to store machine-readable instructions executed by the processor (301) to perform the necessary logical data processing operations. The RAM (302) typically contains executable instructions of the operating system and of the associated software components (applications, program modules, etc.).

[0038] The ROM (303) is one or more persistent data storage devices, for example a hard disk drive (HDD), a solid-state drive (SSD), flash memory (EEPROM, NAND, etc.), optical storage media (CD-R/RW, DVD-R/RW, Blu-ray Disc, MD), and the like.

[0039] Various types of I/O interfaces (304) are used to organize the operation of the components of the device (300) and of externally connected devices. The choice of interfaces depends on the specific design of the computing device, and they may include, without limitation: PCI, AGP, PS/2, IrDA, FireWire, LPT, COM, SATA, IDE, Lightning, USB (2.0, 3.0, 3.1, micro, mini, type C), TRS/Audio jack (2.5, 3.5, 6.35), HDMI, DVI, VGA, DisplayPort, RJ45, RS232, etc.

[0040] Various I/O devices (305) are used to enable user interaction with the computing device (300), for example a keyboard, a display (monitor), a touch display, a touchpad, a joystick, a mouse, a light pen, a stylus, a touch panel, a trackball, speakers, a microphone, augmented reality devices, optical sensors, a tablet, indicator lights, a projector, a camera, biometric identification devices (retina scanner, fingerprint scanner, voice recognition module), etc.

[0041] The networking device (306) enables the device (300) to transmit data over an internal or external computer network, such as an intranet, the Internet, a LAN, or the like. The one or more devices (306) may include, without limitation: an Ethernet card, a GSM modem, a GPRS modem, an LTE modem, a 5G modem, a satellite communication module, an NFC module, a Bluetooth and/or BLE module, a Wi-Fi module, etc.

[0042] In addition, satellite navigation devices, for example GPS, GLONASS, BeiDou, or Galileo, may also be included in the device (300).

[0043] The application materials presented here disclose preferred examples of implementing the technical solution and should not be construed as limiting other, particular embodiments that do not go beyond the scope of the legal protection sought and that are obvious to those skilled in the relevant art.

Claims

1. A method for detecting anomalous interaction between nodes of an information and computing network (IVS), comprising the steps of: a) receiving data on the exchange of messages between the IVS nodes, wherein the data contains at least information about the number of messages transmitted between said nodes; b) forming a graph based on the received data, in which the vertices are the identifiers of the IVS nodes and the edges represent the fact of message exchange between nodes, each edge having a weight characterizing the intensity of message exchange between the corresponding nodes; c) determining, using the resulting graph, the values of the shortest paths between all IVS nodes; d) forming a vector representation for each IVS node based on the obtained shortest path values; e) reducing the dimension of the resulting vector representations; f) clustering the IVS nodes based on the vector representations obtained in step e); g) forming a first model of interaction between the IVS nodes, in which the nodes are assigned to groups in accordance with the clustering performed; h) forming a second model of interaction between the IVS nodes by iteratively repeating steps a) - g); i) comparing the first and second models of interaction between the IVS nodes, during which IVS nodes located in different groups are identified; and j) generating a signal characterizing the identifiers of the IVS nodes found to exhibit anomalous interaction during the comparison performed in step i).

2. The method according to claim 1, in which the weight of the edge is the reciprocal of the number of messages between the IVS nodes per unit of time.

3. The method according to claim 1, wherein step e) is performed using a machine learning algorithm.

4. The method according to claim 1, in which the vector representation of the nodes characterizes the accumulated retrospective information about the interaction of the node.

5. A system for detecting anomalous interaction of IVS nodes, comprising at least one processor and at least one memory that stores machine-readable instructions which, when executed by the at least one processor, implement the method according to any one of claims 1-4.
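For illustration only, steps b) to d) of claim 1 together with the edge weight of claim 2 can be sketched as follows. This is a hedged example rather than the claimed implementation: the function names, the undirected treatment of edges, the use of the Floyd-Warshall algorithm and the sample message counts are assumptions. Dimension reduction and clustering (steps e) and f)) are omitted.

```python
import math


def build_weights(message_counts):
    """Map each node pair to a weight of 1 / (messages per unit time).

    Edges are treated as undirected, which is an assumption.
    """
    nodes = sorted({n for pair in message_counts for n in pair})
    weights = {}
    for (a, b), count in message_counts.items():
        if count > 0:
            weights[(a, b)] = weights[(b, a)] = 1.0 / count
    return nodes, weights


def shortest_paths(nodes, weights):
    """All-pairs shortest path lengths (Floyd-Warshall)."""
    dist = {(a, b): 0.0 if a == b else weights.get((a, b), math.inf)
            for a in nodes for b in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via = dist[(i, k)] + dist[(k, j)]
                if via < dist[(i, j)]:
                    dist[(i, j)] = via
    return dist


def node_vectors(nodes, dist):
    """Vector for a node = its shortest path lengths to every node."""
    return {a: [dist[(a, b)] for b in nodes] for a in nodes}


# Hypothetical message counts per unit time between three nodes.
counts = {(201, 202): 4, (202, 203): 2, (201, 203): 1}
nodes, weights = build_weights(counts)
vectors = node_vectors(nodes, shortest_paths(nodes, weights))
print(vectors[201])  # [0.0, 0.25, 0.75]
```

With reciprocal weights, nodes that exchange many messages are close together: the path from (201) to (203) through (202) has length 0.75, shorter than the direct edge of weight 1.0.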