CN118132303A - Cloud service equipment detection method, device, equipment and readable storage medium - Google Patents
- Publication number
- CN118132303A (application number CN202410207997.1A)
- Authority
- CN
- China
- Prior art keywords
- detection
- cloud service
- item
- resource
- service device
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/079—Root cause analysis, i.e. error or fault diagnosis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0793—Remedial or corrective actions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Abstract
Embodiments of the present application provide a detection method, apparatus, and device for a cloud service device, and a readable storage medium. A checklist corresponding to the cloud service device is obtained, and a device detection instance corresponding to the cloud service device is configured according to the multiple resource items to be detected contained in the checklist. The device detection instance is executed to collect, from the cloud service device, the resource parameters corresponding to each resource item, yielding device detection data. The device detection data is structured according to multiple detection item categories to obtain a structured data set that includes the detection item categories and their feature values. Abnormality determination is performed on the resource items in the cloud service device according to the feature values of the detection item categories, yielding a detection result for each resource item. A detection report for the cloud service device is generated based on the feature values of the detection item categories and the detection results of the resource items. The present application can improve the efficiency and stability of cloud service device detection and improve detection accuracy.
Description
Technical Field
The present application relates to the field of computer technology, and in particular to a detection method, apparatus, and device for a cloud service device, and a readable storage medium.
Background
Device detection refers to the periodic inspection and evaluation of cloud service devices, including cloud infrastructure, servers, storage devices, network devices, and other related technical components. It aims to ensure the normal operation, security, and stability of cloud service devices, to monitor their operating status in real time, and to discover and resolve problems promptly.
In the related art, cloud service devices are generally inspected manually, which is inefficient. Moreover, abnormality determination requires technical personnel to analyze and judge the detection data, so the detection results have poor stability and low accuracy.
Summary of the Invention
The main purpose of the embodiments of the present application is to provide a detection method, apparatus, and device for a cloud service device, and a readable storage medium, which can improve the efficiency and stability of cloud service device detection and improve detection accuracy.
To achieve the above purpose, a first aspect of the embodiments of the present application provides a detection method for a cloud service device, the method comprising:
obtaining a checklist corresponding to a cloud service device, and configuring a device detection instance corresponding to the cloud service device according to multiple resource items to be detected contained in the checklist;
executing the device detection instance to collect, from the cloud service device, the resource parameters corresponding to each resource item, thereby obtaining device detection data;
structuring the device detection data according to multiple detection item categories to obtain a structured data set, the structured data set including the multiple detection item categories and the feature value of each detection item category;
performing abnormality determination on the resource items in the cloud service device according to the feature value of each detection item category, to obtain a detection result for each resource item;
generating a detection report for the cloud service device based on the feature value of each detection item category and the detection result of each resource item.
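The five steps of the method can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the implementation of the application: the collector table `COLLECTORS`, the limits in `LIMITS`, the resource item names, and every function name are hypothetical and introduced only for illustration, and a fixed upper limit stands in for the abnormality determination.

```python
from typing import Callable, Dict, List

# Hypothetical collectors: one callable per resource item, standing in for
# whatever commands or API calls a real device detection instance would run.
COLLECTORS: Dict[str, Callable[[], float]] = {
    "cpu_usage_percent": lambda: 37.5,
    "disk_usage_percent": lambda: 91.0,
}

# Hypothetical upper limits per detection item category (assumed values).
LIMITS: Dict[str, float] = {"cpu_usage_percent": 85.0, "disk_usage_percent": 90.0}


def configure_instance(checklist: List[str]) -> Dict[str, Callable[[], float]]:
    """Step 1: keep only the collectors named in the checklist."""
    return {item: COLLECTORS[item] for item in checklist}


def execute_instance(instance: Dict[str, Callable[[], float]]) -> Dict[str, float]:
    """Step 2: run every collector and gather the raw resource parameters."""
    return {item: collect() for item, collect in instance.items()}


def structure(raw: Dict[str, float]) -> List[dict]:
    """Step 3: one record per detection item category with its feature value."""
    return [{"category": name, "value": value} for name, value in raw.items()]


def judge(records: List[dict]) -> List[dict]:
    """Step 4: mark a record abnormal when its value exceeds the assumed limit."""
    judged = []
    for record in records:
        abnormal = record["value"] > LIMITS[record["category"]]
        judged.append(dict(record, result="abnormal" if abnormal else "normal"))
    return judged


def build_report(records: List[dict]) -> str:
    """Step 5: render a plain-text detection report."""
    return "\n".join(
        f"{r['category']}: {r['value']} ({r['result']})" for r in records
    )


def run_detection(checklist: List[str]) -> str:
    """Chain the five steps for one cloud service device."""
    instance = configure_instance(checklist)
    return build_report(judge(structure(execute_instance(instance))))
```

Keeping each step as its own function mirrors the module split of the apparatus described in the second aspect, so any single step can be replaced without touching the others.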
Correspondingly, a second aspect of the embodiments of the present application provides a detection apparatus for a cloud service device, the apparatus comprising:
a configuration module, configured to obtain a checklist corresponding to a cloud service device, and to configure a device detection instance corresponding to the cloud service device according to multiple resource items to be detected contained in the checklist;
an execution module, configured to execute the device detection instance to collect, from the cloud service device, the resource parameters corresponding to each resource item, thereby obtaining device detection data;
a processing module, configured to structure the device detection data according to multiple detection item categories to obtain a structured data set, the structured data set including the multiple detection item categories and the feature value of each detection item category;
a determination module, configured to perform abnormal operation determination on the resource items in the cloud service device according to the feature value of each detection item category, to obtain a detection result for each resource item;
a generation module, configured to generate a detection report for the cloud service device based on the feature value of each detection item category and the detection result of each resource item.
In some embodiments, the detection apparatus for the cloud service device further includes a report generation module configured to:
when a detection result is abnormal, take the feature value whose detection result is abnormal as a feature to be analyzed;
input the feature to be analyzed into a target model, which classifies it based on each sub-feature of the feature to be analyzed and the feature relationships between the sub-features, to obtain an abnormality category label;
determine the feature values whose detection result is normal as target feature values;
generate the detection report for the cloud service device according to the target feature values, the detection results corresponding to the target feature values, the features to be analyzed, and the abnormality category labels corresponding to the features to be analyzed.
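The split between normal feature values and features to be analyzed can be sketched as follows. This is an illustration only: the record layout, `assemble_report`, and `placeholder_model` are hypothetical, and `placeholder_model` is a hand-written rule standing in for the trained target model, which the application does not specify in detail.

```python
from typing import Callable, Dict, List


def placeholder_model(record: dict) -> str:
    """Stand-in for the target model (assumption): a hand-written rule that
    returns an abnormality category label from the detection item category."""
    if record["category"].startswith("disk"):
        return "storage_pressure"
    return "other"


def assemble_report(records: List[dict],
                    classify: Callable[[dict], str]) -> Dict[str, List[dict]]:
    """Separate normal and abnormal records, and label the abnormal ones."""
    # Target feature values: records whose detection result is normal.
    normal = [r for r in records if r["result"] == "normal"]
    # Features to be analyzed: records whose detection result is abnormal.
    to_analyze = [r for r in records if r["result"] == "abnormal"]
    labelled = [dict(r, anomaly_label=classify(r)) for r in to_analyze]
    return {"normal": normal, "abnormal": labelled}
```

Passing the classifier in as a callable keeps the report assembly independent of how the target model is trained or served.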
In some embodiments, the detection apparatus for the cloud service device further includes a training module configured to:
obtain sample features to be analyzed and the sample abnormality category labels corresponding to the sample features to be analyzed;
input the sample features to be analyzed into a preset model to obtain predicted category labels;
construct a target loss function based on the difference between the sample abnormality category labels and the predicted category labels;
iteratively train the preset model according to the target loss function until a preset condition is met, to obtain the trained target model.
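The training procedure above follows the usual supervised learning loop. The sketch below assumes a deliberately simple preset model, namely binary logistic regression trained by gradient descent with a cross-entropy loss; the application names neither the model type nor the loss, so these choices, the hyperparameters, and the function names are all assumptions. The preset condition is taken to be either a loss tolerance or a maximum number of epochs.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[List[float]], labels: List[int], lr: float = 0.5,
          max_epochs: int = 2000, tol: float = 0.1) -> Tuple[List[float], float, float]:
    """Fit weights and bias; stop when the mean loss drops below `tol`
    or after `max_epochs` iterations (the assumed preset condition)."""
    n, n_features = len(samples), len(samples[0])
    w, b, loss = [0.0] * n_features, 0.0, float("inf")
    for _ in range(max_epochs):
        grad_w, grad_b, loss = [0.0] * n_features, 0.0, 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            # Cross-entropy between the sample label and the prediction.
            loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            for j in range(n_features):
                grad_w[j] += (p - y) * x[j]
            grad_b += p - y
        loss /= n
        if loss < tol:
            break
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b, loss


def predict(w: List[float], b: float, x: List[float]) -> int:
    """Return the predicted category label (0 or 1)."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0
```

A real abnormality classifier would typically distinguish more than two categories; the binary case is used here only to keep the loop short.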
In some embodiments, the configuration module is further configured to:
group the multiple resource items to be detected contained in the checklist to obtain multiple data collection groups;
determine, from a preset instruction list, the device detection instance corresponding to each data collection group.
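Grouping and instance selection can be sketched with two lookup tables. The grouping rule, the group names, and the contents of `ITEM_GROUP` and `PRESET_INSTRUCTIONS` below are hypothetical; the application does not state how resource items are assigned to groups.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from resource item to its data collection group.
ITEM_GROUP: Dict[str, str] = {
    "cpu_usage": "host",
    "memory_usage": "host",
    "port_status": "network",
    "packet_loss": "network",
}

# Hypothetical preset instruction list, keyed by data collection group.
PRESET_INSTRUCTIONS: Dict[str, str] = {
    "host": "host_detection_instance",
    "network": "network_detection_instance",
}


def group_items(checklist: List[str]) -> Dict[str, List[str]]:
    """Group the checklist's resource items into data collection groups."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for item in checklist:
        groups[ITEM_GROUP[item]].append(item)
    return dict(groups)


def select_instances(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick the device detection instance for every data collection group."""
    return {group: PRESET_INSTRUCTIONS[group] for group in groups}
```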
In some embodiments, the execution module is further configured to:
for each resource item to be detected in a data collection group, execute the corresponding detection instruction in the device detection instance to obtain the resource parameter corresponding to that resource item;
fill the resource parameter corresponding to each resource item into the device detection instance, and generate the device detection data of the cloud service device from the filled device detection instances.
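One way to picture the execute-then-fill step is an instance that lists its resource items with an empty parameter slot each. In this sketch the instance layout, the instruction names, and the `EXECUTORS` table are hypothetical; in practice each executor would wrap a command or an API call on the cloud service device.

```python
import copy
from typing import Callable, Dict, List

# Hypothetical executors standing in for real detection instructions.
EXECUTORS: Dict[str, Callable[[], str]] = {
    "read_cpu": lambda: "37.5",
    "read_memory": lambda: "62.0",
}

# Hypothetical device detection instance for one data collection group.
HOST_INSTANCE = {
    "group": "host",
    "items": [
        {"resource_item": "cpu_usage", "instruction": "read_cpu", "parameter": None},
        {"resource_item": "memory_usage", "instruction": "read_memory", "parameter": None},
    ],
}


def execute_instance(instance: dict, executors: Dict[str, Callable[[], str]]) -> dict:
    """Run the instruction of every resource item and fill in its parameter."""
    filled = copy.deepcopy(instance)  # leave the configured instance untouched
    for item in filled["items"]:
        item["parameter"] = executors[item["instruction"]]()
    return filled


def build_detection_data(instances: List[dict],
                         executors: Dict[str, Callable[[], str]]) -> List[dict]:
    """Device detection data: the list of all filled instances."""
    return [execute_instance(instance, executors) for instance in instances]
```

Copying the instance before filling it lets the same configured instance be reused for the next detection run.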
In some embodiments, the processing module is further configured to:
obtain a data parsing template, the data parsing template containing the correspondence between preset keywords and preset detection item categories;
match target words in the device detection data against the preset keywords in the data parsing template, to determine the detection item category corresponding to each target word;
obtain, from the device detection data, the feature value corresponding to each detection item category;
obtain the structured data set based on the multiple detection item categories and the feature value of each detection item category.
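Keyword-based structuring can be sketched as below. The sketch assumes that the device detection data is line-oriented text of the form `name: number`, which the application does not state, and the template contents and the function name `structure` are hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical data parsing template: preset keyword -> detection item category.
TEMPLATE: Dict[str, str] = {"cpu": "compute", "mem": "memory", "disk": "storage"}

# Assumed raw format: the feature value is the number following a colon.
VALUE_PATTERN = re.compile(r":\s*(-?\d+(?:\.\d+)?)")


def structure(raw_text: str, template: Dict[str, str]) -> Dict[str, List[float]]:
    """Map each matching line to its category and extract its feature value."""
    dataset: Dict[str, List[float]] = {}
    for line in raw_text.splitlines():
        lowered = line.lower()
        for keyword, category in template.items():
            if keyword in lowered:
                match = VALUE_PATTERN.search(line)
                if match:
                    dataset.setdefault(category, []).append(float(match.group(1)))
                break  # a line belongs to the first keyword that matches
    return dataset
```

Lines that match no keyword are dropped, so adding a new detection item category only requires a new template entry.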
In some embodiments, the determination module is further configured to:
for each detection item category in the structured data set, obtain at least one historical feature value corresponding to the detection item category;
determine the feature mean and standard deviation of each detection item category according to the feature value of the detection item category and the corresponding at least one historical feature value;
obtain the threshold range of the detection item category, and perform abnormal operation determination on the resource items in the cloud service device based on the feature mean, the standard deviation, and the threshold range, to obtain a detection result for each resource item.
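The application does not spell out how the mean, the standard deviation, and the threshold range are combined. The sketch below assumes one plausible rule: a value is abnormal if it falls outside the threshold range, or if it deviates from the mean by more than `k` standard deviations. The rule, the default `k = 3.0`, and the function name are assumptions.

```python
import statistics
from typing import Dict, List, Tuple


def judge(current: float, history: List[float],
          threshold_range: Tuple[float, float], k: float = 3.0) -> Dict[str, object]:
    """Judge one detection item category from its current and historical values."""
    values = history + [current]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    low, high = threshold_range
    out_of_range = not (low <= current <= high)
    # Assumed rule: flag values further than k standard deviations from the mean.
    deviates = std > 0 and abs(current - mean) > k * std
    result = "abnormal" if (out_of_range or deviates) else "normal"
    return {"mean": mean, "std": std, "result": result}
```

Because the current value is included in the statistics, the deviation test can only fire when the history is long enough; with very few historical values, the threshold range does most of the work.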
In some embodiments, the determination module is further configured to:
obtain, based on the current feature value corresponding to each detection item category, multiple historical feature values corresponding to the detection item category;
generate a detection box plot corresponding to each detection item category according to the current feature value and the historical feature values;
select a preset range in the detection box plot, and determine the feature values that fall outside the preset range as abnormal data;
generate the detection result of the corresponding resource item based on the abnormal data.
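The box-plot variant can be sketched with quartiles. The application leaves the preset range unspecified; this sketch assumes the conventional whiskers, that is, 1.5 times the interquartile range beyond the first and third quartiles. The function name and the `whisker` parameter are hypothetical.

```python
import statistics
from typing import Dict, List


def box_plot_outliers(current: float, history: List[float],
                      whisker: float = 1.5) -> Dict[str, object]:
    """Return the assumed preset range and the values that fall outside it."""
    values = history + [current]
    # Quartiles of the combined current and historical feature values.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    abnormal = [v for v in values if v < low or v > high]
    return {"range": (low, high), "abnormal": abnormal}
```

Unlike a mean-based rule, the quartiles are barely affected by a single extreme value, which is why box plots are a common choice for this kind of check.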
In some embodiments, the generation module is further configured to:
convert the feature value of each detection item category and the detection result of each resource item into target format data;
obtain a preset visualization template, and fill the target format data corresponding to all the resource items into the visualization template to obtain a filled target visualization template;
display the detection report of the cloud service device based on the content of the filled target visualization template.
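Conversion and template filling can be sketched as follows. JSON is assumed as the target format and a plain-text `string.Template` as the visualization template; the application names neither, and the template text, field names, and function names are hypothetical.

```python
import json
from string import Template
from typing import List

# Hypothetical visualization template with three placeholders.
REPORT_TEMPLATE = Template(
    "Detection report for $device\n$rows\nAbnormal items: $abnormal_count"
)


def to_target_format(records: List[dict]) -> str:
    """Convert feature values and detection results to the assumed target format."""
    return json.dumps(records, sort_keys=True)


def render_report(device: str, records_json: str) -> str:
    """Fill the target format data into the visualization template."""
    records = json.loads(records_json)
    rows = "\n".join(
        f"- {r['category']}: {r['value']} [{r['result']}]" for r in records
    )
    abnormal_count = sum(r["result"] == "abnormal" for r in records)
    return REPORT_TEMPLATE.substitute(
        device=device, rows=rows, abnormal_count=abnormal_count
    )
```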
In some embodiments, the detection apparatus for the cloud service device further includes a saving module configured to:
obtain the detection log corresponding to each resource item generated when the device detection instance is executed;
classify and save the detection log of each resource item according to the detection result.
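Classified saving of detection logs might look like the sketch below, which writes each log under a directory named after its detection result. The directory layout, the `.log` suffix, and the function name are assumptions; the application does not say where or how the logs are stored.

```python
import tempfile
from pathlib import Path
from typing import Dict


def save_logs(logs: Dict[str, str], results: Dict[str, str],
              base_dir: str) -> Dict[str, Path]:
    """Write each resource item's log under <base_dir>/<detection result>/."""
    saved: Dict[str, Path] = {}
    for item, text in logs.items():
        target_dir = Path(base_dir) / results[item]  # e.g. normal/ or abnormal/
        target_dir.mkdir(parents=True, exist_ok=True)
        path = target_dir / f"{item}.log"
        path.write_text(text, encoding="utf-8")
        saved[item] = path
    return saved


# Usage example with a temporary directory that is removed afterwards.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_paths = save_logs(
        {"cpu_usage": "read ok", "disk_usage": "usage above limit"},
        {"cpu_usage": "normal", "disk_usage": "abnormal"},
        demo_dir,
    )
    demo_layout = sorted(
        p.relative_to(demo_dir).as_posix() for p in demo_paths.values()
    )
```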
To achieve the above purpose, a third aspect of the embodiments of the present application provides a computer device. The computer device includes a memory and a processor, the memory stores a computer program, and the processor, when executing the computer program, implements the detection method for a cloud service device according to any embodiment of the first aspect of the present application.
To achieve the above purpose, a fourth aspect of the embodiments of the present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the detection method for a cloud service device according to any embodiment of the first aspect of the present application.
In the embodiments of the present application, a checklist corresponding to the cloud service device is obtained, and a device detection instance corresponding to the cloud service device is configured according to the multiple resource items to be detected contained in the checklist. The device detection instance is executed to collect, from the cloud service device, the resource parameters corresponding to each resource item, yielding device detection data. The device detection data is structured according to multiple detection item categories to obtain a structured data set that includes the detection item categories and the feature value of each category. Abnormality determination is performed on the resource items in the cloud service device according to the feature value of each detection item category, yielding a detection result for each resource item. A detection report for the cloud service device is generated based on the feature value of each detection item category and the detection result of each resource item.
Therefore, the device detection instance can be configured accurately for the resource items that the cloud service device contains, and executing the instance collects the resource parameters of each resource item from the cloud service device automatically, which effectively improves detection efficiency. Furthermore, because the device detection data is structured and abnormality determination is based on the feature value of each detection item category in the structured data, the detection result of each resource item is obtained quickly and without manual intervention. Generating the detection report from these results improves detection efficiency while also improving the stability and accuracy of cloud service device detection.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a detection system for a cloud service device provided in an embodiment of the present application;
FIG. 2 is a flowchart of a detection method for a cloud service device provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a detection apparatus for a cloud service device provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of the hardware structure of a computer device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
It should be noted that although functional modules are divided in the apparatus diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed with a module division different from that of the apparatus, or in an order different from that of the flowchart. The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the present application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
With the continuous development and spread of cloud computing technology, cloud service devices keep growing in scale and complexity, and they need to be inspected regularly to ensure the security and stability of the cloud platform.
In the related art, cloud service devices are generally inspected manually. Specifically, human operators use monitoring platforms and log analysis tools to check the operating status and log information of a cloud service device in order to understand its performance, response time, abnormal conditions, and so on. When a problem or failure occurs, the operators troubleshoot it and resolve the abnormal condition through manual operations, command-line tools, and similar means.
However, manual inspection has limitations. It cannot handle large-scale cloud service deployments, and its labor cost is high. In addition, because the detection data must be obtained and analyzed by hand, detection efficiency is low, the results are unstable, and their accuracy is difficult to guarantee.
Based on this, the embodiments of the present application provide a detection method, system, and device for a cloud service device, and a readable storage medium, which can improve the efficiency and stability of cloud service device detection and improve detection accuracy.
The detection method, system, device, and readable storage medium provided in the embodiments of the present application are described through the following embodiments. The detection system for a cloud service device is described first.
Referring to FIG. 1, in some embodiments, the detection system for a cloud service device includes a cloud service device 11, a client 12, and a server 13. In some embodiments, the cloud service device 11 is the object being detected, and it needs to provide an interface through which the client 12 or the server 13 can obtain the device's detection data.
It can be understood that the cloud service device 11 covers the various infrastructures, services, and applications in a cloud computing environment and may refer to any device or service based on a cloud computing architecture. For example, the cloud service device 11 may be a cloud server, a cloud storage service, a cloud database service, a cloud application service, a cloud network device, and so on. The cloud service device 11 may execute the corresponding device detection instance to collect the resource parameters corresponding to each resource item, and transmit the device detection data to the server 13 for further processing and analysis.
In some embodiments, the client 12 may be a device or application that initiates a cloud service device detection request and receives the detection result. The client 12 may be a web browser, a mobile application, a desktop application, an embedded system, a sensor device, or the like. For example, when the client 12 is a mobile application, a user can initiate a detection request for the cloud service device through the application installed on a mobile device and view the detection report on the same device.
For example, during the detection of a cloud service device, the server 13 may be a cloud service device that receives and processes the detection requests initiated by the client and returns the detection results. The server 13 may be any of the cloud services, platforms, or subsystems that undertake detection tasks and return detection results, including monitoring services, security services, management platforms, storage services, computing services, and so on.
In some embodiments, the client 12 interacts with the user through a user interface, and the user sets configuration parameters and triggers the execution of the device detection instance. The cloud service device 11 executes the detection instance, collects data, and transmits the data to the server 13. The server 13 receives and processes the data from the cloud service device 11, performs structuring, abnormality determination, and other steps according to a preset flow, and runs the target model to analyze the data and generate the final detection report. Finally, the server 13 transmits the generated detection report to the client 12, where it can be viewed.
The detection method for a cloud service device in the embodiments of the present application is described through the following embodiments.
It should be noted that, in each specific implementation of the present application, whenever processing involves data related to user identity or characteristics, such as user information, user behavior data, user historical data, and user location information, the user's permission or consent is obtained first; for example, the user's permission or consent is obtained before logging in to the cloud service device. The collection, use, and processing of such data comply with the relevant laws, regulations, and standards. In addition, when an embodiment of the present application needs to obtain sensitive personal information of a user, the user's separate permission or separate consent is obtained through a pop-up window, a redirect to a confirmation page, or a similar mechanism. Only after that separate permission or consent has been explicitly obtained is the user-related data necessary for the normal operation of the embodiment acquired.
In the embodiments of the present application, the description is given from the perspective of the detection apparatus for the cloud service device, which may specifically be integrated in a computer device. Referring to FIG. 2, FIG. 2 is an optional flowchart of the detection method for a cloud service device provided in an embodiment of the present application. This embodiment takes as an example the case where the detection apparatus is integrated in a terminal or a server. When the processor of the terminal or server executes the program instructions corresponding to the detection method, the specific flow is as follows:
步骤101,获取云服务设备对应的检查清单,并根据检查清单中包含的多项待检测的资源项配置云服务设备对应的设备检测实例。Step 101 : obtaining a checklist corresponding to a cloud service device, and configuring a device detection instance corresponding to the cloud service device according to multiple resource items to be detected included in the checklist.
In the embodiments of the present application, in order to detect the cloud service device automatically, an instance that drives the automation is generated according to the resource items actually present in the cloud service device, so that automated detection can subsequently be performed on the cloud service device based on that instance. Specifically, to obtain the device detection instance, the checklist of the device to be detected is consulted first, and the corresponding device detection instance is configured according to the resource items included in the checklist. This improves the fit between the device detection instance and the cloud service device, so that automated detection can be targeted at the intended resource items and the cloud service device can be detected accurately and efficiently.
It should be noted that the embodiments of the present application mainly detect resource items in a cloud service device. The cloud service device may be a server that provides services such as cloud computing and cloud storage, and the server may include host devices (i.e., physical machines) and network resource devices that provide network connections between the host devices. If the server is a distributed service system composed of multiple physical machines, the host device of the cloud service device may be any one or more of the physical machines in the distributed service system.
The checklist may be an information list containing one or more resource items to be detected in the cloud service device. Specifically, different cloud service devices may correspond to different checklists. A checklist may be obtained from the official documentation provided by the cloud service provider, in which case the user can look up or download it from the official website or documentation center. Alternatively, the user may customize a checklist, using the resource configuration and performance indicators shown in the cloud service console as a reference. Alternatively, the user may save the checklist to a database inside the cloud service device, so that it can be read directly from that database whenever the cloud service device needs to be detected later. Alternatively, the server side may store a database in which a corresponding checklist is configured for each cloud service device, and so on.
A resource item may be at least one detection entry for the cloud service device recorded in the checklist. A resource item may be a virtual machine resource, a physical computing resource, a physical storage resource, a server out-of-band resource, an underlying network resource (underlay network resource), and so on.
The device detection instance may be a detection template configured according to the multiple resource items to be detected that are included in the checklist of the cloud service device. At least one resource item corresponds to one detection instance, and a device detection instance includes at least the detection instructions for detecting the resource item. By executing the detection instructions one by one or in parallel, the resource parameters to be detected for the corresponding resource item can be collected. Optionally, the device detection instance may be changed according to the detection requirements or the detection environment. Exemplarily, a device detection instance may include the detection items of a resource item and the detection instruction corresponding to each detection item. For example, when the resource item is a virtual machine resource, the virtual machine resource may correspond to one device detection instance, which may include detection items such as the configuration and status data of the virtual machine, CPU usage, memory utilization, disk space, and operating system version. Each detection item corresponds to one detection instruction, and executing that instruction obtains the corresponding resource parameter from the cloud service device. For example, executing the detection instruction for the CPU usage detection item may return a resource parameter of 66%, meaning that the CPU usage is 66%. Furthermore, a device detection instance can be reused. For example, if device detection instance A corresponds to resource item A, then device detection instance A can be configured each time resource item A is detected, without rewriting the instructions.
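The relationship between a resource item, its detection items, and their detection instructions can be sketched as a small data structure. This is a minimal illustration only: the class names, field names, and the instruction strings such as collect-cpu-usage are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DetectionItem:
    """One detection item and the instruction that collects its value."""
    name: str                    # e.g. "cpu_usage"
    instruction: str             # hypothetical command, script, or API call
    value: Optional[str] = None  # resource parameter, filled in after execution


@dataclass
class DeviceDetectionInstance:
    """Reusable detection template for one resource item."""
    resource_item: str
    items: List[DetectionItem] = field(default_factory=list)

    def fill(self, item_name: str, value: str) -> None:
        # Store the collected resource parameter under its detection item.
        for item in self.items:
            if item.name == item_name:
                item.value = value
                return
        raise KeyError(item_name)


# Hypothetical instance for the virtual machine resource item.
vm_instance = DeviceDetectionInstance(
    resource_item="virtual_machine",
    items=[
        DetectionItem("cpu_usage", "collect-cpu-usage"),
        DetectionItem("memory_utilization", "collect-memory-utilization"),
    ],
)
vm_instance.fill("cpu_usage", "66%")
```

Because the template (items and instructions) is kept apart from the collected values, the same instance definition can be configured again for each later detection of the same resource item.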
It should be noted that, before the checklist corresponding to the cloud service device is obtained, the user may log in to the cloud service device remotely so that the subsequent detection can be carried out. Specifically, the user may select a detection task executor, that is, a remote connection plug-in, according to the login scenario, so as to log in to the cloud service device and perform the related operations. When the login scenario is secure remote management of Linux hosts and other network devices, an executor based on the Secure Shell (SSH) protocol may be selected; when the login scenario is logging in to a Windows host, an executor based on Windows Remote Management (WinRM) may be selected. Selecting the executor that matches the login scenario ensures that remote login and the corresponding operations succeed under different operating system environments, so that the detection task for the cloud service device can be completed.
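The choice of executor by login scenario amounts to a lookup. The sketch below assumes hypothetical scenario labels ("linux", "network_device", "windows") and returns only the executor name; it does not open any connection.

```python
# Hypothetical mapping from login scenario to detection task executor.
EXECUTORS = {
    "linux": "ssh",
    "network_device": "ssh",
    "windows": "winrm",
}


def select_executor(login_scenario: str) -> str:
    """Return the executor name configured for the given login scenario."""
    try:
        return EXECUTORS[login_scenario.lower()]
    except KeyError:
        raise ValueError(f"no executor configured for {login_scenario!r}") from None
```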
Furthermore, an encryption algorithm suited to the selected detection task executor may be used to encrypt data and protect it during communication. For example, with an SSH-based executor, a symmetric encryption algorithm such as AES or 3DES may be used to encrypt the communication data, with the same key used for decryption; alternatively, the client and the server may negotiate asymmetric encryption, in which a public key and a private key are used for encryption and decryption. With an executor based on Windows Remote Management, communication may take place over HTTP or HTTPS, with HTTPS providing transport-level encryption, so as to keep the communication secure.
In some implementations, after logging in to the cloud service device, each user may detect the cloud service device only within the permissions granted for detection. Specifically, each cloud service device maintains a detection permission table, which records each user's access permissions and operation permissions for each resource item. For example, resource item A may be detected or edited only by user A and not by user B, so as to protect the data in the cloud service device.
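A detection permission table of this kind can be sketched as a mapping from (user, resource item) to the set of permitted operations. The user names, resource names, and operation labels below are hypothetical.

```python
# Hypothetical permission table: (user, resource item) -> allowed operations.
DETECTION_PERMISSIONS = {
    ("user_a", "resource_a"): {"detect", "edit"},
}


def is_allowed(user: str, resource_item: str, operation: str) -> bool:
    """Check whether the user may perform the operation on the resource item."""
    allowed = DETECTION_PERMISSIONS.get((user, resource_item), set())
    return operation in allowed
```

An entry that is absent from the table is treated as "no permission", which matches the example in which user B can neither detect nor edit resource item A.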
Exemplarily, after logging in to the cloud service device, the user may obtain the stored checklist through the management console, or use a custom checklist directly. The checklist is then parsed to determine the resource items to be detected, and the corresponding device detection instance is retrieved, based on those resource items, from the storage area for device detection instances. Furthermore, device detection instances may be stored in a preset database or a preset instruction list, which the user can maintain or update regularly.
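Parsing a checklist to determine the resource items can be sketched as follows. The application does not specify a checklist format, so the JSON layout and the key name resource_items are assumptions made purely for illustration.

```python
import json

# Hypothetical checklist content; the real format is not specified.
CHECKLIST_TEXT = """
{
  "device": "cloud-host-01",
  "resource_items": ["virtual_machine", "underlay_network"]
}
"""


def resource_items_from(checklist_text: str) -> list:
    """Parse the checklist and return the resource items to be detected."""
    checklist = json.loads(checklist_text)
    items = checklist.get("resource_items", [])
    if not items:
        raise ValueError("checklist contains no resource items")
    return items


resource_items = resource_items_from(CHECKLIST_TEXT)
```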
In some implementations, before or after the device detection instance of the cloud service device is configured for the resource items to be detected, an operation and maintenance tool may be selected based on the grouping of the resource items. Specifically, the resource items to be detected are grouped to obtain a grouping result, and the operation and maintenance tool used to detect the cloud service device is then determined from the grouping result. Exemplarily, the resource items may be divided into host resources and network device resources; Ansible may be selected for host resources, and Nornir may be selected for network device resources, and so on. Selecting the tool per group simplifies the operation flow, reduces operational complexity and the error rate, and improves overall detection and management efficiency.
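Grouping resource items and choosing a tool per group can be sketched as two lookups. The resource-item identifiers are hypothetical labels; the assignment of items to the host and network device groups follows the example given for the grouping step of this method, and the sketch only returns tool names without invoking Ansible or Nornir.

```python
from collections import defaultdict

# Hypothetical identifiers for the resource items named in the description.
RESOURCE_GROUP = {
    "virtual_machine": "host",
    "physical_compute": "host",
    "physical_storage": "host",
    "server_out_of_band": "host",
    "underlay_network": "network_device",
}
GROUP_TOOL = {"host": "Ansible", "network_device": "Nornir"}


def group_resource_items(items):
    """Group resource items by category, preserving their input order."""
    groups = defaultdict(list)
    for item in items:
        groups[RESOURCE_GROUP[item]].append(item)
    return dict(groups)


def tools_for(groups):
    """Pick the operation and maintenance tool for each group."""
    return {group: GROUP_TOOL[group] for group in groups}


groups = group_resource_items(
    ["virtual_machine", "underlay_network", "physical_storage"]
)
tools = tools_for(groups)
```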
In this way, the checklist corresponding to the cloud service device is obtained and the cloud service device is detected comprehensively according to that checklist. The corresponding device detection instance is configured quickly from the resource items in the checklist, so that the resource parameters can be obtained when the instance is executed later, which improves the efficiency of detecting the cloud service device.
In some implementations, in order to make the subsequent automated detection of the cloud service device more accurate and efficient, a corresponding automated detection instance may be generated for each physical machine that can run independently. Specifically, the resource items in the checklist may be managed in groups by physical machine, so that one device detection instance is generated for the resource items in each group. For example, "configuring the device detection instance corresponding to the cloud service device according to the multiple resource items to be detected included in the checklist" in step 101 may include:
(101.1) Group the multiple resource items to be detected that are included in the checklist to obtain multiple data collection groups;
(101.2) Determine, from a preset instruction list, the device detection instance corresponding to each of the data collection groups.
The grouping is performed according to the category of the resource items. For example, resource items belonging to host resources are placed in one group, resource items belonging to network device resources are placed in one group, resource items belonging to server out-of-band resources are placed in one group, and so on. Exemplarily, the resource items of virtual machine resources, physical computing resources, physical storage resources, and server out-of-band resources may be classified as host resources, and the resource items of underlay network resources may be classified as network device resources. Resource items in the same group usually share the same detection items, and therefore the same device detection instance. By grouping the resource items, one device detection instance can be configured for all the resource items in a data collection group instead of being configured repeatedly for each resource item. Whenever any resource item in a data collection group needs to be detected, the device detection instance associated with that group is called directly, which saves the space otherwise needed to store many device detection instances.
A data collection group may be used to organize and manage the resource items to be detected, ensuring that each resource item corresponds to a device detection instance. Each data collection group may include at least one resource item.
Optionally, the resource items may be left ungrouped, with each resource item associated with its own device detection instance, so that each resource item is detected at a finer granularity. No specific limitation is imposed here.
The preset instruction list may be a set of templates or instructions covering various device detection instances. It may reside in a database and be maintained regularly by operators. The preset instruction list may also store the individual detection items and their corresponding detection instructions. After the resource items have been divided into data collection groups, the detection items are determined from the resource items in a group, the corresponding detection instructions are retrieved from the preset instruction list according to those detection items, and a device detection instance is generated from the detection instructions. The device detection instance is then applied to every resource item in that data collection group, so that an instance need not be retrieved for each resource item one by one, which shortens the detection cycle and improves detection efficiency.
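Steps (101.1) and (101.2) can be sketched as building one instance per data collection group from a preset instruction list. All names and instruction strings here are hypothetical, and the instance is reduced to a plain mapping from detection item to detection instruction.

```python
# Hypothetical preset instruction list: detection item -> detection instruction.
PRESET_INSTRUCTIONS = {
    "cpu_usage": "collect-cpu-usage",
    "memory_utilization": "collect-memory-utilization",
    "interface_status": "collect-interface-status",
}

# Hypothetical detection items shared by all resource items of a group.
GROUP_DETECTION_ITEMS = {
    "host": ["cpu_usage", "memory_utilization"],
    "network_device": ["interface_status"],
}


def build_instance(group: str) -> dict:
    """Generate a device detection instance for one data collection group."""
    return {
        item: PRESET_INSTRUCTIONS[item] for item in GROUP_DETECTION_ITEMS[group]
    }


def instances_for(groups: dict) -> dict:
    """One instance per group, shared by every resource item in that group."""
    return {group: build_instance(group) for group in groups}


data_collection_groups = {
    "host": ["virtual_machine", "physical_storage"],
    "network_device": ["underlay_network"],
}
instances = instances_for(data_collection_groups)
```

Two resource items in the host group share a single instance here, which is the saving the grouping is meant to provide.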
Determining a device detection instance for each data collection group effectively improves the efficiency of obtaining resource parameters. Moreover, a device detection instance can be retrieved repeatedly, any number of times. When operation and maintenance personnel need to inspect the cloud service device, they retrieve the instance directly according to the group to which the resource item to be detected belongs, without drawing up a new device detection instance each time, which improves the efficiency of detecting the cloud service device.
Step 102: execute the device detection instance to collect the resource parameters corresponding to each resource item from the cloud service device, obtaining device detection data.
It can be understood that, in order to obtain the resource parameters corresponding to the resource items from the cloud service device for subsequent processing and analysis, the device detection instance corresponding to the resource items to be detected may be executed to collect the resource parameters of each resource item from the cloud service device, which improves the efficiency of detecting the cloud service device.
A resource parameter may be the detected value corresponding to a detection item of a resource item. Resource parameters may include the status, performance indicators, configuration information, workload, and so on of the device. For example, if the resource item to be detected is a virtual machine resource, the detection items may include CPU utilization, memory utilization, and disk space, and the resource parameters are the specific values detected for them, such as a CPU utilization of 65%.
The device detection data may be the data obtained by aggregating the detected resource parameters after all the resource items have been detected. By executing the device detection instances, the device detection data can be obtained from the cloud service device one item at a time or simultaneously.
In some implementations, the device detection instance may be executed through an operation and maintenance tool. Because the device detection instance records the detection items to be detected for each resource item together with the corresponding detection instructions, the operation and maintenance tool can execute the detection instructions one by one, collect the resource parameter corresponding to each detection item from the cloud service device, and fill the resource parameters into the device detection instance. Once the resource parameters of all the resource items have been collected, the device detection data is generated.
Exemplarily, when the resource item to be detected is a virtual machine resource, the process of detecting the virtual machine resource by executing the device detection instance may be as follows:
First, the {% for virtual_machine in virtual_machines %} instruction iterates over each virtual machine object in the list named "virtual_machines". Each iteration generates one row of an HTML table (i.e., a <tr> tag), and the row displays the identifier (id), name (name), and status data (state) of the virtual machine. The expressions {{virtual_machine.id}}, {{virtual_machine.name}}, and {{virtual_machine.state}} serve as placeholders for the resource parameters of a specific virtual machine object; on each iteration they are replaced with the resource parameters of the current virtual machine object. Finally, {% endfor %} in the device detection instance marks the end of the loop, meaning that the whole virtual machine list of the cloud service device has been traversed. The generated HTML table shows the resource parameters of all the virtual machines collected for the resource item.
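The effect of the loop and its placeholders can be illustrated with the same iteration written in plain Python. This sketch deliberately uses only the standard library rather than a template engine, so it shows what the template produces, not how the template engine is invoked; the virtual machine records are invented sample data.

```python
from html import escape

# Invented sample data standing in for collected virtual machine parameters.
virtual_machines = [
    {"id": "vm-001", "name": "web-1", "state": "running"},
    {"id": "vm-002", "name": "db-1", "state": "stopped"},
]


def render_rows(vms):
    """Emit one HTML table row per virtual machine."""
    rows = []
    # Plays the role of {% for virtual_machine in virtual_machines %}.
    for vm in vms:
        rows.append(
            "<tr><td>{id}</td><td>{name}</td><td>{state}</td></tr>".format(
                id=escape(vm["id"]),
                name=escape(vm["name"]),
                state=escape(vm["state"]),
            )
        )
    # Reaching this point corresponds to {% endfor %}.
    return "\n".join(rows)


html_rows = render_rows(virtual_machines)
```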
In some implementations, in order to detect each detection item of a resource item at a fine granularity and to store the detected resource parameters, the detection instruction corresponding to each detection item in the device detection instance may be executed to obtain the resource parameter of that detection item, and the resource parameters of each resource item are filled into the device detection instance, so that the resource parameters are stored by category. For example, step 102 may include:
(102.1) For each resource item to be detected in the data collection group, execute the corresponding detection instruction in the device detection instance to obtain the resource parameters corresponding to each resource item;
(102.2) Fill the resource parameters corresponding to each resource item into the device detection instance, and generate the device detection data of the cloud service device from each filled device detection instance.
A detection instruction may be a specific command or sequence of operations executed for a resource item to be detected, used to obtain the resource parameters of that resource item from the cloud service device. Specifically, each resource item may correspond to at least one detection item, and the resource parameter of each detection item is obtained by executing the detection instruction that the device detection instance configures for it. Exemplarily, a detection instruction may be a device-specific command, a script, or an API call that causes the cloud service device to perform detection according to the detection items of the resource item and return the relevant resource parameters.
Exemplarily, for each resource item to be detected in the data collection group, all the detection items of the resource item and their detection instructions may be treated as a static template, and the resource parameters obtained by executing the detection instructions as dynamic data. The resource parameters are filled into the device detection instance, and the device detection data is generated by combining the static template and the dynamic data in the filled instance. In some implementations, the device detection instance may be customized through a Jinja2 template.
Exemplarily, the filled device detection instance may be as follows:
The filled device detection instance above is merely an example; the specific detection items, detection instructions, and resource parameters may differ and may be adjusted according to the actual situation.
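Steps (102.1) and (102.2) can be sketched as executing each instruction of an instance and pairing the result with its detection item. This is a hypothetical illustration and not the filled instance referred to above: the instruction strings are invented, and fake_runner stands in for a real remote call and simply returns canned values.

```python
def run_instance(instance: dict, runner) -> dict:
    """Execute every detection instruction and fill in the resource parameter.

    `instance` maps detection item -> detection instruction (static template);
    the returned dict maps detection item -> collected value (dynamic data).
    """
    return {item: runner(instruction) for item, instruction in instance.items()}


def fake_runner(instruction: str) -> str:
    # Stand-in for a remote call over SSH or WinRM; returns canned values.
    canned = {
        "collect-cpu-usage": "65%",
        "collect-memory-utilization": "48%",
    }
    return canned[instruction]


vm_instance = {
    "cpu_usage": "collect-cpu-usage",
    "memory_utilization": "collect-memory-utilization",
}

# Device detection data: the filled instances, keyed by resource item.
device_detection_data = {"virtual_machine": run_instance(vm_instance, fake_runner)}
```

Passing the runner in as a parameter keeps the template independent of how instructions are actually executed, so the same instance can be run by different executors.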
By executing the corresponding detection instructions, the resource parameters of the resource items are collected automatically, without detecting the resource items manually one by one, which saves considerable detection time and labor cost. In addition, because unified detection instructions are used, data is collected to a consistent standard, which avoids the discrepancies that arise when different operation and maintenance personnel perform the detection, ensures the quality and accuracy of the detected resource parameters, and allows standardized resource parameters to be filled into the device detection instance. The device detection data obtained in this way gives operation and maintenance personnel a clear view of the cloud service device, including its status and parameters in every respect, and facilitates analysis and comparison of the device detection data.
Step 103: structure the device detection data according to multiple detection item categories to obtain a structured data set.
In the embodiments of the present application, in order to make the collected device detection data easier to analyze, the device detection data may be structured so that it conforms to a specific format and standard.
A detection item category may be one of the categories or types into which the detection items and their corresponding resource parameters in the device detection data fall, such as a network card category, a network traffic category, a storage category, and so on.
Structuring the device detection data according to multiple detection item categories may be done with a template-based text parsing tool (such as TextFSM). Different structuring templates are created to identify and extract from the device detection data the keyword data that matches each template, for example the resource parameters corresponding to the network card name, network interface, CPU utilization, memory usage, and network traffic. These are taken as feature values and then classified into the corresponding detection item categories.
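Template-based extraction of this kind can be sketched with a regular expression from the Python standard library. This is not TextFSM itself, only an illustration of the same idea; the raw output format and the field names are invented for the example.

```python
import re

# Invented raw output of a detection instruction for network interfaces.
RAW_OUTPUT = """\
eth0 state UP rx_bytes 1024 tx_bytes 2048
eth1 state DOWN rx_bytes 0 tx_bytes 0
"""

# The "template": named groups define which values are extracted as features.
LINE_PATTERN = re.compile(
    r"^(?P<interface>\S+) state (?P<state>UP|DOWN) "
    r"rx_bytes (?P<rx_bytes>\d+) tx_bytes (?P<tx_bytes>\d+)$"
)


def parse(raw: str) -> list:
    """Return one record per line that matches the template."""
    records = []
    for line in raw.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


records = parse(RAW_OUTPUT)
```

Lines that do not match the template are skipped, so unrelated output does not end up in the structured records.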
It should be noted that the device detection data may also be structured by means of natural language processing, data modeling and feature extraction, database technology, data labeling and classification, and the like, to obtain the feature values of each detection item category. Exemplarily, a bag-of-words model, word embeddings, or a recursive neural network may be used to convert natural language data (i.e., the device detection data) into structured feature values, which are then classified into the different detection item categories. The structuring method may be chosen as required, and one or more of the above approaches may be combined; no further limitation is imposed here.
The structured data set may be the data set obtained by structuring the device detection data. It may include multiple detection item categories and the feature values of each detection item category.
In this way, the device detection data is converted into structured data with a unified format, yielding a structured data set that allows abnormal data in the device detection data to be analyzed and identified quickly later on.
In some implementations, the structured data set may be obtained by acquiring a data parsing template, matching the preset keywords in the template against the target words in the device detection data to determine the detection item category of each target word, and obtaining the corresponding feature values by detection item category. This structures the device detection data so that the operating status of the cloud service device can subsequently be analyzed from the structured data. For example, "structuring the device detection data according to multiple detection item categories to obtain a structured data set" in step 103 may include:
(103.1) Obtain a data parsing template, where the data parsing template contains the correspondence between preset keywords and preset detection item categories;
(103.2) Match the target words in the device detection data against the preset keywords in the data parsing template to determine the detection item category corresponding to each target word;
(103.3) Obtain, from the device detection data, the feature values corresponding to each detection item category;
(103.4) Obtain the structured data set based on the multiple detection item categories and the feature values of each detection item category.
The data parsing template may be a rule file or configuration file that contains the correspondence between preset keywords and preset detection item categories. It guides the system in extracting and labeling specific types of data from the device detection data and in structuring that data. In some implementations, the data parsing template may also be a configuration file based on regular expressions, or a lookup rule table in a database. Exemplarily, the data parsing template may take the following form:
"Detection item category 1": ["Preset keyword 1", "Preset keyword 2", "Preset keyword 3"],
"Detection item category 2": ["Preset keyword 4", "Preset keyword 5"],
"Detection item category 3": ["Preset keyword 6", "Preset keyword 7", "Preset keyword 8"];
The preset keywords may be keywords explicitly defined in the data parsing template. They are used to identify and label specific information in the device detection data, such as performance indicators, error logs, and configuration information. Each detection item category may include at least one preset keyword, so that the detection item category of a piece of device detection data can be determined by direct keyword matching.
A target word may be a specific word or phrase in the device detection data that needs to be matched. Matching the target word against the preset keywords determines the detection item category to which the target word corresponds.
A feature value may be a specific numerical value or feature, obtained from the device detection data, that corresponds to a detection item category, such as the resource parameters for CPU utilization, memory usage, and network traffic. For example, if the CPU utilization is 50%, the feature value may be 50%.
In some implementations, how target words are extracted from the device detection data may be determined by preset rules. For example, the rules may specify that numbers, symbols, or predicates are not extracted, and so on.
Further, after the target words have been extracted from the device detection data, they may be matched against the preset keywords. Once a target word matches a preset keyword, the corresponding device detection data may be labeled with the detection item category of that preset keyword.
In some implementations, the matching may also run in the other direction, searching the device detection data for the preset keywords. If a target word corresponding to a preset keyword is found in the device detection data, the device detection data containing that target word can be quickly assigned to the detection item category of the preset keyword, which improves the efficiency of structuring the device detection data.
Exemplarily, if the extracted target words are CPU, memory, and bandwidth, they are matched against the preset keywords. If the matched preset keywords are also CPU, memory, and bandwidth, and the detection item category of those preset keywords is performance indicators, the device detection data corresponding to the target words may be classified as performance indicators. Other detection item categories are assigned in the same way, which is not repeated here.
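The matching in steps 103.1 and 103.2 can be sketched as a dictionary lookup. This is a minimal illustration, not the implementation described in the application: the template contents, the category names, and the function names `build_keyword_index` and `classify_target_words` are all assumptions introduced here, and matching is assumed to be exact and case-insensitive.

```python
# Hypothetical parsing template: detection item category -> preset keywords.
# Category and keyword names are illustrative, not taken from the source.
PARSING_TEMPLATE = {
    "performance_metric": ["CPU", "memory", "bandwidth"],
    "error_log": ["error", "exception"],
    "configuration": ["version", "config"],
}


def build_keyword_index(template):
    """Invert the template into a lowercase keyword -> category lookup."""
    index = {}
    for category, keywords in template.items():
        for keyword in keywords:
            index[keyword.lower()] = category
    return index


def classify_target_words(target_words, template):
    """Return {target word: category} for each word that matches a preset keyword."""
    index = build_keyword_index(template)
    matches = {}
    for word in target_words:
        category = index.get(word.lower())
        if category is not None:  # unmatched words are simply left unclassified
            matches[word] = category
    return matches


categories = classify_target_words(
    ["CPU", "memory", "bandwidth", "uptime"], PARSING_TEMPLATE
)
print(categories)
```

Inverting the template once keeps each lookup constant-time; a regular-expression template, which the text also allows, would need a different matching loop.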
It can be understood that, in order to analyze each resource item, the feature value corresponding to each detection item category, that is, the specific resource parameter, may be obtained from the device detection data. The device detection data is thereby structured into detection item categories and feature values, so that later analysis of the feature values can determine whether the resource parameter of the resource item corresponding to each detection item category is abnormal.
Exemplarily, in the structured data set, each detection item category is stored together with its feature value. Specifically, each detection item category may be a column (field) of a table, with the feature values of that category stored in the rows. Alternatively, the structured data set may be stored in a database, where each detection item category is the name of a database table and the feature values of that category are stored as records in the table, so that the query and indexing functions of the database make the data easy to retrieve and analyze. The storage format of the structured data set may be chosen according to the actual situation and is not limited by the embodiments of this application.
The correspondence between preset keywords and detection item categories allows the device detection data to be divided quickly into detection item categories, avoids the large amount of repetitive work involved in manual or semi-automatic approaches, and significantly improves data processing efficiency. In addition, converting scattered device detection data into a structured data set that computer programs can easily read and process facilitates data storage, management, and querying, and simplifies subsequent data analysis and application.
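Steps 103.3 and 103.4 can be sketched as follows. The raw text format ("name: value unit"), the mapping `KEYWORD_TO_CATEGORY`, and the row layout are assumptions made only for this illustration; the source does not specify how the device detection data is formatted.

```python
import re

# Hypothetical raw detection output; the "name: value unit" format is an assumption.
RAW_DETECTION_DATA = """\
CPU utilization: 50%
memory usage: 62%
bandwidth: 120 Mbps
"""

# Hypothetical mapping from preset keyword to detection item category.
KEYWORD_TO_CATEGORY = {
    "cpu utilization": "cpu_utilization",
    "memory usage": "memory_usage",
    "bandwidth": "network_bandwidth",
}

LINE_PATTERN = re.compile(
    r"^(?P<key>[^:]+):\s*(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>\S*)$"
)


def structure(raw_text):
    """Turn raw lines into rows of (category, feature value, unit)."""
    rows = []
    for line in raw_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # line does not carry a numeric feature value
        category = KEYWORD_TO_CATEGORY.get(match["key"].strip().lower())
        if category is None:
            continue  # no preset keyword matches this line
        rows.append(
            {"category": category, "value": float(match["value"]), "unit": match["unit"]}
        )
    return rows


structured = structure(RAW_DETECTION_DATA)
for row in structured:
    print(row)
```

Each row pairs a detection item category with its feature value, which corresponds to the table layout described above; writing the rows to a database table would be a straightforward extension.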
Step 104: based on the feature value of each detection item category, perform abnormality determination on the resource items in the cloud service device to obtain a detection result for each resource item.
It can be understood that, in order to discover security risks and potential problems while the cloud service device is running, the feature values may be analyzed to determine whether the device detection data contains abnormal data, so that abnormal data is found and handled in time and the cloud service device keeps operating normally.
In some implementations, the feature value of each detection item category is a specific numerical value. The feature mean and standard deviation of the feature values may therefore be calculated, and feature values that fall outside a certain range are treated as abnormal values, which identifies the abnormal feature values in each detection item category and yields the detection result of each resource item. Alternatively, a box plot may be drawn from the feature values, and feature values outside a preset range are determined to be abnormal. In some implementations, anomaly detection may also be performed by machine learning: the feature values are input into a pre-trained target model, which determines and classifies the abnormal data. Alternatively, an unsupervised learning algorithm may be used: a density-based method measures the density of the feature values, and the resource items whose feature values have a density below a preset density are determined to be abnormal, yielding the detection result of each resource item.
When abnormality determination is performed for each resource item, the current feature value may be compared with at least one historical feature value (for example, the current temperature against historical temperatures); current feature values of the same dimension may be compared with each other (for example, the operating temperatures of multiple devices); or historical feature values may be combined with feature values of the same dimension. The abnormal data identified in this way may then be classified manually or quickly classified by the target model, and the choice among these options depends on the situation.
In some implementations, for a more comprehensive analysis of the cloud service device, each detection item category may be analyzed using both the current feature value and historical feature values, which helps ensure the accuracy of the analysis result. For example, step 104 may include:
(104.1) for each detection item category in the structured data set, obtaining at least one historical feature value corresponding to the detection item category;
(104.2) determining the feature mean and standard deviation of each detection item category from the feature value of the category and the corresponding at least one historical feature value;
(104.3) obtaining the threshold range of the detection item category, and performing abnormal operation determination on the resource items in the cloud service device based on the feature mean, the standard deviation, and the threshold range, to obtain the detection result of each resource item.
A historical feature value is a feature value of the same detection item category at an earlier time. Analyzing the detection item category with the current feature value together with historical feature values gives a more comprehensive and accurate result. Exemplarily, the historical feature values may be obtained by querying a database. Specifically, when the detection item category is CPU utilization, the collected CPU utilization data may be stored in a database; when the CPU utilization feature value needs to be analyzed, the historical feature values, such as the CPU utilization data of the past hour, day, or week, are obtained by querying the database. Alternatively, a fixed number of historical values, for example 10 CPU utilization samples, may be retrieved, selected either in order or at random; the specific method is not limited.
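Retrieving a fixed number of recent historical feature values (step 104.1) might look like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database; the table name `feature_history`, its columns, and the function `recent_history` are hypothetical, since the source does not describe a schema.

```python
import sqlite3

# In-memory database standing in for the store of collected feature values.
# Table and column names are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feature_history (category TEXT, collected_at INTEGER, value REAL)"
)
cpu_samples = [60, 65, 70, 63, 75, 68, 62, 70, 73, 65, 64, 66]
conn.executemany(
    "INSERT INTO feature_history VALUES (?, ?, ?)",
    [("cpu_utilization", t, v) for t, v in enumerate(cpu_samples)],
)


def recent_history(connection, category, limit=10):
    """Return the most recent `limit` feature values for one category, newest first."""
    cursor = connection.execute(
        "SELECT value FROM feature_history "
        "WHERE category = ? ORDER BY collected_at DESC LIMIT ?",
        (category, limit),
    )
    return [row[0] for row in cursor.fetchall()]


history = recent_history(conn, "cpu_utilization", limit=10)
print(history)
```

A time-window query (past hour, day, or week) would replace the `LIMIT` clause with a condition on `collected_at`.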
In some embodiments, a feature value that falls outside the normal range may mean that the corresponding detection item category is abnormal, so it is necessary to determine whether the feature value is outside the normal range.
The feature mean is the sum of all feature values in the set of feature values corresponding to a detection item category, divided by the number of feature values; it is also called the feature average. The feature mean represents the average level of the current and historical feature values.
The standard deviation of the feature values reflects how dispersed the current and historical feature values are, that is, how much they fluctuate. A small standard deviation indicates that the data fluctuates little and that the feature values of the detection item category are relatively stable; a large standard deviation indicates large fluctuation, which may point to abnormal data among the feature values.
Exemplarily, if the detection item category is CPU usage, the feature mean and standard deviation of the CPU usage are calculated as follows:
Suppose the selected current and historical feature values are 60%, 65%, 70%, 63%, 75%, 68%, 62%, 70%, 73%, 65%, 64%, and 66%. The feature mean is then:
(60%+65%+70%+63%+75%+68%+62%+70%+73%+65%+64%+66%)/12 = 801%/12 = 66.75%;
In some implementations, the squared difference between each current or historical feature value and the feature mean is calculated, and the sum of these squared differences is divided by the number of feature values and the square root is taken, which gives the standard deviation.
The standard deviation is calculated as follows:
Calculate the squared difference between each CPU usage value (current or historical) and the feature mean, in squared percentage points:
(60%-66.75%)^2 ≈ 45.56;
(65%-66.75%)^2 ≈ 3.06;
(70%-66.75%)^2 ≈ 10.56;
......
The remaining squared differences are calculated in the same way and are not listed here.
Divide the sum of these squared differences by the number of feature values and take the square root:
Standard deviation = sqrt((45.56+3.06+10.56+...)/12) = sqrt(226.25/12) ≈ 4.34%;
The intermediate arithmetic is not expanded further here.
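The arithmetic of this worked example can be checked mechanically from the twelve sample values listed above. The sketch follows the procedure as described (sum of squared differences divided by the number of values, then the square root), which is the population standard deviation; the variable names are introduced here for illustration.

```python
import math
import statistics

# The twelve CPU usage samples from the worked example, in percent.
cpu_usage = [60, 65, 70, 63, 75, 68, 62, 70, 73, 65, 64, 66]

# Feature mean: sum of all values divided by their count.
feature_mean = sum(cpu_usage) / len(cpu_usage)

# Squared difference between each value and the feature mean.
squared_deviations = [(value - feature_mean) ** 2 for value in cpu_usage]

# Population standard deviation: divide by n (not n - 1), then take the root.
std_dev = math.sqrt(sum(squared_deviations) / len(cpu_usage))

# Cross-check against the standard library's population standard deviation.
assert math.isclose(std_dev, statistics.pstdev(cpu_usage))

print(f"mean = {feature_mean:.2f}%, standard deviation = {std_dev:.2f}%")
```

If the samples were treated as drawn from a larger population, `statistics.stdev` (dividing by n - 1) would be the usual choice instead; the text divides by the number of feature values, so the population form is used here.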
By obtaining the historical feature values of a detection item category in this way, abnormal changes in the category can be discovered in time and abnormal operation can be determined.
The threshold range may be adjusted in real time according to changes in the feature values of each detection item category, or may be set by operation and maintenance personnel based on experience. The threshold range is used to judge whether a feature value is abnormal; for example, a feature value outside the threshold range may be identified as abnormal.
Exemplarily, the threshold range may be the interval within a preset multiple of the standard deviation around the feature mean, and a feature value outside that interval is judged to be abnormal. For instance, if the detection item category is memory utilization, feature values that deviate from the feature mean by more than 2 standard deviations (the multiple is adjustable) may be determined to be abnormal. If the calculated feature mean is 70% and the standard deviation is 10%, the feature mean plus or minus 2 standard deviations gives a threshold range of 50% to 90%. Real-time memory utilization feature values can then be judged against this range: a feature value of 95% lies outside the range of 50% to 90%, so the resource item corresponding to that feature value is determined to be in an abnormal operating state.
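The threshold check in step 104.3 reduces to a few lines. This sketch assumes the range is the feature mean plus or minus `k` standard deviations, as in the memory utilization example above; the function names are hypothetical.

```python
def threshold_range(mean, std_dev, k=2.0):
    """Return (low, high) as the mean minus and plus k standard deviations."""
    return mean - k * std_dev, mean + k * std_dev


def is_abnormal(value, mean, std_dev, k=2.0):
    """A value is abnormal when it lies outside the threshold range."""
    low, high = threshold_range(mean, std_dev, k)
    return value < low or value > high


# Memory utilization example: mean 70%, standard deviation 10%, k = 2.
low, high = threshold_range(70.0, 10.0)
print(low, high)
print(is_abnormal(95.0, 70.0, 10.0))
```

Keeping `k` as a parameter mirrors the remark that the multiple can be adjusted per detection item category.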
It can be understood that setting a threshold range establishes a baseline against which each resource item can be monitored and judged, so that abnormally operating resource items are detected and corresponding measures can be taken to resolve the problem and keep the cloud service device running normally and stably.
In some implementations, the analysis based on the current feature value and historical feature values may instead use a box plot. For example, step 104 may also include:
(104.4) based on the current feature value corresponding to each detection item category, obtaining multiple historical feature values corresponding to the detection item category;
(104.5) generating a detection box plot for each detection item category from the current feature value and the historical feature values;
(104.6) selecting a preset range in the detection box plot, and determining feature values outside the preset range to be abnormal data;
(104.7) generating the detection result of the corresponding resource item based on the abnormal data.
It can be understood that, for a more comprehensive analysis of the device status, the historical feature values of a detection item category serve as a reference for the current feature value, in order to determine whether the current feature value is abnormal. How the historical feature values of a detection item category are obtained has been described above and is not repeated here.
The detection box plot characterizes the dispersion of the set of feature values corresponding to each detection item category and shows how the current and historical feature values are distributed. For example, the detection box plot may include the minimum, the maximum, the upper quartile (Q3), the lower quartile (Q1), and the median (Q2) of the current and historical feature values. These statistics show the range and central tendency of the feature values and whether outliers are present.
Exemplarily, suppose the detection item category is the CPU utilization of the cloud service device, and the current and historical feature values are [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]. The key values of the detection box plot are then calculated: the minimum is 10, the lower quartile (Q1) is 25, the median (Q2) is 55, the upper quartile (Q3) is 75, and the maximum is 100 (the exact quartile values depend on the quartile convention used). The detection box plot for the detection item category can then be generated from these values.
The preset range is the range used to judge whether a feature value is abnormal. Feature values outside the preset range may be determined to be abnormal data, which allows the feature values to be checked quickly.
In some implementations, an outlier-based criterion may be used: the interquartile range (IQR) is calculated as the difference between the upper quartile (Q3) and the lower quartile (Q1) of the box plot, and the upper and lower bounds of the detection box plot are then calculated from the upper quartile, the lower quartile, and the interquartile range. Specifically, the lower bound is Q1-1.5*IQR and the upper bound is Q3+1.5*IQR, and the interval between the lower and upper bounds is used as the preset range.
In some implementations, the feature value is compared with the preset range calculated in this way. If the feature value is outside the preset range, it is determined to be abnormal data, and the detection result of the corresponding resource item is generated from the abnormal data. For example, the detection result may be "CPU utilization: 95%, abnormal"; the specific form can be set as needed.
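The outlier criterion of steps 104.4 to 104.6 can be sketched with the standard library. Quartile conventions differ between tools, so `statistics.quantiles` with `method="inclusive"` is only one possible choice and may give quartiles that differ from those quoted in the text; the sample values and function names below are hypothetical.

```python
import statistics


def iqr_fences(q1, q3, k=1.5):
    """Return (lower, upper) bounds: Q1 - k*IQR and Q3 + k*IQR, with IQR = Q3 - Q1."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def box_plot_outliers(values, k=1.5):
    """Return the values that fall outside the box-plot bounds."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lower, upper = iqr_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical CPU utilization history (percent) plus one current reading.
history = [40, 42, 45, 47, 50, 52, 55, 58, 60]
current = 95
outliers = box_plot_outliers(history + [current])
print(outliers)

# Bounds for the quartiles quoted in the text (Q1 = 25, Q3 = 75).
print(iqr_fences(25, 75))
```

With Q1 = 25 and Q3 = 75 the bounds are wide enough to contain every value from 10 to 100, so that particular data set has no outliers under the 1.5*IQR rule.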
It should be noted that, so that abnormal resource items can later be classified, the detection result of the corresponding resource item is generated from the abnormal data. This helps predict failures that the cloud service device may experience, so that measures can be taken in advance to prevent, plan for, or adjust to them. Specifically, when the feature value of a resource item is normal, the corresponding detection result is normal; when the feature value is abnormal, the corresponding detection result is abnormal. Based on the detection results, the abnormal data can be analyzed in depth and used for prediction, improving the stability and reliability of the cloud service device.
Step 105: generate a detection report for the cloud service device based on the feature value of each detection item category and the detection result of each resource item.
It should be noted that, so that operation and maintenance personnel can view and analyze the detection results one by one and promptly discover possible problems and abnormalities in the cloud service device, a detection report for the cloud service device may be generated based on the feature value of each detection item category and the detection result of each resource item.
The detection report contains the feature value of each detection item category and the detection result of the corresponding resource item. Alternatively, the detection report may give the detection result for a group of resource items, for example stating that the network resource group is normal, with abnormal values displayed separately. In some implementations, the format of the feature values and detection results may be converted. Specifically, the feature value of each detection item category and the detection result of the corresponding resource item may be converted one by one into JSON format with a JSON library, such as the JSON library of Python. A visualization template is then selected according to the requirements of the detection report for the specific cloud service device, and the JSON-format feature values and detection results are filled into that template to generate a standalone HTML file containing the complete inspection report. Further, the HTML file may be converted into PDF or another format that is convenient for operation and maintenance personnel to view.
In some implementations, to make viewing and analysis easier for operation and maintenance personnel, the detection results and related data may be filled into a preset visualization template, so that the results are displayed to operation and maintenance personnel and abnormal data can be handled in time. For example, step 105 may include:
(105.1) converting the feature value of each detection item category and the detection result of each resource item into target format data;
(105.2) obtaining a preset visualization template, and filling the target format data corresponding to all resource items into the visualization template to obtain a filled target visualization template;
(105.3) displaying the detection report of the cloud service device based on the content of the filled target visualization template.
The target format data is normalized data that gives the feature values of the detection item categories and the detection results of the resource items a uniform format. Converting them into target format data allows the data to be presented in a uniform format in the visualization template, which enables dynamic display of the data. Exemplarily, the target format data may be JSON data: a JSON library converts the feature values and detection results into their corresponding JSON representation, which normalizes and standardizes the data and simplifies the subsequent generation of visualization pages.
The visualization template may be a preset structured template for displaying data. It may be a script that is run on the filled target format data to generate a visualization page. Exemplarily, the visualization template may be written in a structured language, such as Python. After the feature values and detection results have been converted into JSON format, the JSON data may be further converted into that structured language, for instance with a library for the language, and the converted structured data is then filled into the visualization template. Exemplarily, the target visualization template may be an HTML template, a chart library template, or a template provided by another visualization tool. The target visualization template needs placeholders or template variables into which the structured data is filled.
The detection report presents the filled target visualization template in visual form, to help users understand the status of the cloud service device and the individual detection results at a glance. Exemplarily, a server-side template engine or front-end JavaScript may render the filled target visualization template into the detection report, that is, a visualization page, which is then displayed to the user. The detection report may include basic information about the cloud service device, such as device model, status, and performance data, as well as other detection results, such as network connection status and security assessment. Further, the detection report may be accessed through a browser or embedded in other applications for display.
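Steps 105.1 to 105.3 can be sketched with the standard library alone: `json` produces the target format data and `string.Template` stands in for the visualization template. The record fields, the template markup, and the function `render_report` are assumptions for illustration; a real deployment would more likely use a dedicated template engine.

```python
import html
import json
from string import Template

# Hypothetical detection results; field names are assumptions for this sketch.
results = [
    {"item": "Network connection status", "value": "Normal", "status": "normal"},
    {"item": "Storage space", "value": "60% used", "status": "normal"},
    {"item": "CPU utilization", "value": "40%", "status": "normal"},
]

# Step 105.1: convert results into target format data (JSON text).
target_format_data = json.dumps(results, ensure_ascii=False)

# Step 105.2: a minimal HTML template with placeholders for the data.
REPORT_TEMPLATE = Template(
    "<html><body><h1>$device_name</h1><table>$rows</table></body></html>"
)
ROW_TEMPLATE = Template("<tr><td>$item</td><td>$value</td><td>$status</td></tr>")


def render_report(device_name, json_payload):
    """Fill the template with the JSON records and return the HTML text."""
    records = json.loads(json_payload)
    rows = "".join(
        # Escape every field so that data cannot inject markup into the page.
        ROW_TEMPLATE.substitute({k: html.escape(str(v)) for k, v in record.items()})
        for record in records
    )
    return REPORT_TEMPLATE.substitute(device_name=html.escape(device_name), rows=rows)


# Step 105.3: the rendered page can be written to a file or served to a browser.
report_html = render_report("CloudServer1", target_format_data)
print(report_html)
```

Passing the data through JSON between the detection step and the rendering step keeps the two decoupled, which is the normalization benefit the text describes.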
Exemplarily, the detection report of the cloud service device may be:
Device Information
Device name: CloudServer1
Device model: ABC-2000
Detection Report
Network connection status: Normal
Storage space: 60% used
CPU utilization: 40%
The detection report above is only an example; users can customize the detection report according to their own needs.
在本申请实施例中,一方面,当每个资源项的检测结果都为正常时,可以基于每个检测项类别的特征值以及每个资源项的检测结果,生成云服务设备的检测报告,检测报告可以包括具体的资源项、对应的特征值以及检测为正常的结论,也可以是整个云服务设备运行正常的结论,具体的展示内容、展示形式都可以个性化设定,对此不予限制。另一方面,当存在资源项的检测结果异常时,则需要对检测异常的特征作进一步分析,以提高对检测资源项的准确性。需要说明的是,该判定检测结果正常与否,可通过特征值与预设阈值或预设范围值进行对比,以确定是否存在检测异常。其中,该预设阈值和预设范围值可根据相应的待检测资源项或待检测类别对应的一个或多个历史特征值来确定,比如通过处于正常运行状态的最大和最小得历史特征值来设定该预设范围值,或者根据处于正常运行状态的最大的历史特征值确定该预设阈值。In the embodiment of the present application, on the one hand, when the detection result of each resource item is normal, a detection report of the cloud service device can be generated based on the characteristic value of each detection item category and the detection result of each resource item. The detection report can include specific resource items, corresponding characteristic values and the conclusion that the detection is normal, or it can be the conclusion that the entire cloud service device is running normally. The specific display content and display form can be personalized and are not limited to this. On the other hand, when there is an abnormal detection result of a resource item, it is necessary to further analyze the characteristics of the abnormal detection to improve the accuracy of the detection resource item. It should be noted that the determination of whether the detection result is normal or not can be compared with the characteristic value and the preset threshold value or the preset range value to determine whether there is a detection abnormality. Among them, the preset threshold value and the preset range value can be determined according to one or more historical characteristic values corresponding to the corresponding resource item to be detected or the category to be detected, such as setting the preset range value by the maximum and minimum historical characteristic values in the normal operation state, or determining the preset threshold value according to the maximum historical characteristic value in the normal operation state.
In some embodiments, when the detection result of a resource item is abnormal, that is, when its feature value falls outside the preset range or exceeds the preset threshold, the feature value of that resource item can be input into a pre-trained target model. The target model classifies the feature value and outputs an abnormal category label for it, from which the detection report is generated. The process of generating the detection report may therefore further include:
(A.1) if a detection result is abnormal, taking the feature value whose detection result is abnormal as the feature to be analyzed;
(A.2) inputting the feature to be analyzed into the target model, which classifies it based on each sub-feature of the feature to be analyzed and the relationships among the sub-features, to obtain an abnormal category label;
(A.3) determining each feature value whose detection result is normal as a target feature value;
(A.4) generating the detection report of the cloud service device from the target feature values, the detection results corresponding to the target feature values, the feature to be analyzed, and the abnormal category label corresponding to the feature to be analyzed.
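Steps (A.1) to (A.4) can be sketched as below. The sketch assumes a hypothetical dictionary layout for the per-item detection results, and `stub_classifier` is only a stand-in for the trained target model so that the example runs on its own.

```python
from typing import Callable, Dict, List


def build_report(results: Dict[str, dict],
                 classify: Callable[[List[float]], str]) -> dict:
    """Split results into normal and abnormal items and label the abnormal ones.

    `results` maps a resource item name to {"features": [...], "abnormal": bool}.
    """
    report = {"normal": [], "abnormal": []}
    for item, result in results.items():
        if result["abnormal"]:
            # (A.1) the abnormal feature value becomes the feature to be analyzed
            # (A.2) the model returns an abnormal category label for it
            label = classify(result["features"])
            report["abnormal"].append(
                {"item": item, "features": result["features"], "label": label})
        else:
            # (A.3) normal feature values are kept as target feature values
            report["normal"].append(
                {"item": item, "features": result["features"], "result": "normal"})
    # (A.4) both parts together form the content of the detection report
    return report


def stub_classifier(features: List[float]) -> str:
    """Hypothetical stand-in for the trained target model."""
    return "network_failure" if features[0] > 0.9 else "unknown"


results = {
    "cpu_utilization": {"features": [0.30], "abnormal": False},
    "network_traffic": {"features": [0.95], "abnormal": True},
}
print(build_report(results, stub_classifier))
```

Only abnormal items reach the classifier, which matches the description that normal feature values do not need to be input into the target model.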
An abnormal detection result indicates that the cloud service device may have a fault, a configuration error, or be affected by unexpected network activity. Such anomalies can lead to service interruption, performance degradation, security vulnerabilities, and similar problems in the cloud service device. Therefore, once an abnormal detection result is found, the abnormal feature value needs to be analyzed and handled promptly to avoid risks such as system failure or data leakage.
The feature to be analyzed is the feature value corresponding to a resource item whose detection result is abnormal; a feature value may include various indicators and attributes. Exemplarily, suppose abnormal network traffic is detected in the cloud service device. Abnormal network traffic can arise in two ways: a large amount of malicious traffic entering the system because of a DDoS attack, or a sudden surge in traffic caused by bursty user behavior or a system fault. The feature to be analyzed therefore needs to be classified by the target model to identify the specific cause of the abnormal traffic, so that targeted measures can be taken. For example, the former may require blocking the attack as soon as possible, while the latter may require further troubleshooting and timely handling. By inputting the feature value into the target model as the feature to be analyzed, the feature value can be classified and the related problem addressed according to the classification result, which improves the accuracy of the detection result.
The target model may be a trained Transformer model. Through its self-attention mechanism, the Transformer model can dynamically allocate attention according to the dependencies among sub-features at different positions in the input sequence, and thereby obtain the abnormal category label.
Exemplarily, before the feature to be analyzed is input into the target model, it can be preprocessed, for example normalized, and arranged into a format suitable as input to the Transformer model. When the feature to be analyzed has multiple sub-features, they can be organized into a single input tensor.
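As one possible form of this preprocessing, the sketch below applies column-wise min-max normalization to a small matrix. It assumes that all sub-features are already numeric (categorical sub-features would first need an encoding) and that each row is one time window; the sample values are hypothetical.

```python
from typing import List


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale each column (sub-feature) to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(value - low) / (high - low) if high > low else 0.0
         for value, low, high in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical input: one row per time window, columns are
# (bytes transferred, number of distinct destination ports).
windows = [
    [100.0, 2.0],
    [300.0, 4.0],
    [200.0, 4.0],
]
tensor = min_max_normalize(windows)
print(tensor)  # [[0.0, 0.0], [1.0, 1.0], [0.5, 1.0]]
```

The resulting list of rows has the shape (time windows, sub-features) and can be converted to whatever tensor type the model framework expects.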
Further, when the feature to be analyzed is input into the target model, the target model can classify it based on each sub-feature of the feature to be analyzed and the relationships among the sub-features, to obtain an abnormal category label. For example, when the feature to be analyzed is a network traffic feature value, its sub-features may be the source IP address, destination IP address, port number, protocol type, packet size, and so on. These sub-features can be represented as a matrix in which each row holds the network traffic data of one time window and each column holds one sub-feature, and the matrix is input into the target model. Inside the target model, multiple layers of self-attention and feed-forward neural network layers process the input, so that the target model learns the complex relationships among the sub-features of the network traffic feature value. Specifically, through the self-attention mechanism, the target model can capture the dependencies among sub-features across different time windows, for example an association between a particular source IP address and abnormal traffic, or a correlation between a particular protocol type and a network attack, and thereby identify and classify the feature to be analyzed accurately.
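To illustrate how self-attention relates the rows of such a matrix, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions: queries, keys, and values are taken to be the input rows themselves, whereas a real Transformer layer uses learned projection matrices, multiple heads, and feed-forward layers.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Matrix:
    """Each output row is an attention-weighted average of all input rows."""
    dim = len(x[0])
    output = []
    for query in x:
        # Similarity of this time window to every time window, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in x]
        weights = softmax(scores)
        output.append([sum(w * row[j] for w, row in zip(weights, x))
                       for j in range(dim)])
    return output


# Hypothetical normalized matrix: rows are time windows, columns are sub-features.
x = [[0.0, 0.0], [1.0, 1.0], [0.5, 1.0]]
for row in self_attention(x):
    print([round(v, 3) for v in row])
```

Rows that are similar to each other receive larger mutual weights, which is the mechanism the description relies on to relate sub-features across time windows.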
The detection results may include normal feature values and abnormal feature values. Normal feature values are, for example, feature values indicating that the login behavior of users or systems on the cloud service device conforms to normal usage patterns, feature values indicating normal data access and resource utilization patterns, feature values indicating normal system behavior, feature values indicating normal traffic patterns, and feature values indicating legitimate IP addresses and ports. Because normal feature values need neither classification nor further processing, they do not need to be input into the target model.
It can be understood that, to summarize the overall operating status of the cloud service device, an overall detection result covering both normal and abnormal detection results can be generated, which helps operation and maintenance personnel understand the current operating status of the system.
Exemplarily, the generated detection report of the cloud service device can take several forms, including text reports, chart reports, and visualization reports. Specifically, the target feature values, the detection results corresponding to the target feature values, the feature to be analyzed, and the abnormal category label corresponding to the feature to be analyzed can first be converted into JSON data (or another specific format). The JSON data is then filled into an HTML template, and a detection report of the corresponding form is generated from the HTML template.
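The JSON-to-HTML step can be sketched with the Python standard library as follows. The JSON field names (`device`, `entries`, `item`, `value`, `label`), the device name, and the template layout are all assumptions made for illustration; the source does not specify them.

```python
import json
from html import escape
from string import Template

# Hypothetical HTML template with two placeholders.
REPORT_TEMPLATE = Template(
    "<html><body><h1>Detection report: $device</h1>"
    "<table>$rows</table></body></html>"
)


def render_report(payload_json: str) -> str:
    """Fill the HTML template from JSON-formatted detection data."""
    data = json.loads(payload_json)
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            escape(str(entry["item"])),
            escape(str(entry["value"])),
            escape(str(entry["label"])),
        )
        for entry in data["entries"]
    )
    return REPORT_TEMPLATE.substitute(device=escape(data["device"]), rows=rows)


payload = json.dumps({
    "device": "cloud-host-01",
    "entries": [
        {"item": "cpu_utilization", "value": 30, "label": "normal"},
        {"item": "network_traffic", "value": 950, "label": "network_failure"},
    ],
})
html_report = render_report(payload)
print(html_report)
```

Values are HTML-escaped before substitution so that text coming from logs or labels cannot break the report markup.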
Specifically, a text report is a report containing textual descriptions that explain in detail the target feature values, the detection results corresponding to the target feature values, the feature to be analyzed, and the abnormal category label corresponding to the feature to be analyzed. For example: "Analysis of CPU utilization shows that system CPU utilization has continuously exceeded 80%; the detection result corresponding to this feature value is abnormal. The feature to be analyzed shows abnormal fluctuations in memory usage and network traffic, and the corresponding abnormal category label is network failure." In some embodiments, the handling measures corresponding to a network failure can be looked up in a preconfigured database and appended after the abnormal category label. Continuing the example above, "It is recommended to check the network equipment and analyze network traffic to resolve the problem" can be added after "the corresponding abnormal category label is network failure". For a target feature value and its corresponding detection result, a statement such as "Analysis of CPU utilization shows that system CPU utilization is 30%; the detection result corresponding to this feature value is normal" can be generated. The specific form of the text report can be adjusted as needed.
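The lookup of handling measures can be sketched as below. An in-memory dictionary stands in for the preconfigured database, and the function name and sentence wording are hypothetical.

```python
from typing import Dict

# Hypothetical stand-in for the preconfigured database of handling measures.
REMEDIATIONS: Dict[str, str] = {
    "network failure": (
        "It is recommended to check the network equipment and "
        "analyze network traffic."
    ),
}


def describe_abnormal(metric: str, observation: str, label: str) -> str:
    """Build one report sentence and append the handling measure, if one is known."""
    sentence = (
        f"{metric} {observation}; the detection result is abnormal and "
        f"the abnormal category label is {label}."
    )
    advice = REMEDIATIONS.get(label)
    return f"{sentence} {advice}" if advice else sentence


print(describe_abnormal(
    "System CPU utilization", "has continuously exceeded 80%", "network failure"))
```

Labels without a stored measure simply produce the sentence without advice, so the report remains well formed when the database is incomplete.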
In some embodiments, a detection report presented as charts, such as tables, bar charts, line charts, and pie charts, can also be generated from the target feature values, the detection results corresponding to the target feature values, the feature to be analyzed, and the abnormal category label corresponding to the feature to be analyzed, to show data changes and anomalies in the system's operating status intuitively. For example, a line chart of CPU utilization over time can be drawn, with normal points shown in one color and abnormal points highlighted with a different color or marker.
In some embodiments, a report presented as a data visualization, such as a heat map, scatter plot, or radar chart, can also be generated from the same data, to show the relationships among feature values and the distribution of anomalies intuitively. For example, a heat map can show the correlations among different feature values, and a scatter plot can show the distribution of abnormal feature values.
In the above manner, a detection report of the cloud service device can be generated, which makes it easier for operation and maintenance personnel to evaluate the overall operating status of the cloud service device and to analyze and handle the feature to be analyzed and its abnormal category label promptly, providing users with more reliable, secure, and efficient cloud services.
Further, the target model can classify the feature to be analyzed accurately, which automates the detection process. The target model can be trained through the following process:
(a) obtaining sample features to be analyzed and the sample abnormal category labels corresponding to them;
(b) inputting the sample features to be analyzed into a preset model to obtain predicted category labels;
(c) constructing a target loss function based on the difference between the sample abnormal category labels and the predicted category labels;
(d) iteratively training the preset model according to the target loss function until a preset condition is reached, to obtain the trained target model.
The sample features to be analyzed are the feature data used to train the preset model, helping it better understand and classify different types of anomalies. For example, the sample features to be analyzed may include data related to the cloud service device, such as system performance indicators, network traffic, and user behavior.
A sub-feature is one or more specific features within a sample feature to be analyzed, used as input data when training the preset model. A sub-feature may be a single feature or a combination of features selected from the data related to the cloud service device, such as system performance indicators, network traffic, and user behavior.
An abnormal category label is a problem classification label associated with a sample feature to be analyzed, indicating the type or degree of the anomaly to which that sample feature corresponds. In supervised learning, the abnormal category labels are used together with the sample features to be analyzed to train the preset model, so that the preset model learns the association between the two and becomes better at classifying abnormal data. Exemplarily, the abnormal category label may represent the problem classification of the sample feature to be analyzed; for network traffic, for example, the sample abnormal category label may be the type of attack, data leakage, and so on. In some embodiments, besides the specific abnormal category, the sample abnormal category label may also carry the degree of the anomaly, such as minor or severe.
The preset model may be a classification model that acquires the ability to classify abnormal data by learning the association between the sample features to be analyzed and the abnormal category labels. For example, the preset model may be a Transformer model.
The predicted category label is the classification label obtained after the preset model processes a sample feature to be analyzed, and it represents the model's predicted classification of the sample. Exemplarily, while predicting on the sample features to be analyzed, the preset model learns the relationships among the sub-features of each sample feature, predicts accordingly, and outputs the corresponding predicted category label. Further, during training the preset model can learn attention weights that determine how strongly each sub-feature is associated with the others, which lets it capture the complex relationships among sub-features, including their mutual dependencies and influences, and generate the predicted category label.
The target loss function is a function that measures the difference, error, or loss between the predicted category labels of the preset model and the sample abnormal category labels. During training, the parameters of the preset model can be optimized by minimizing the target loss function, so that the predicted category labels come closer to the actual sample abnormal category labels. Exemplarily, when the preset model is a Transformer model, the cross-entropy loss function can be used as the target loss function during training. For one sample, the cross-entropy loss can be computed by multiplying the true probability of each category (1 for the sample abnormal category label and 0 for every other category) by the probability that the preset model predicts for that category, summing the products, and taking the negative logarithm of the sum. Exemplarily, if the preset model predicts the category "abnormal login" for a sample feature to be analyzed with a probability of 0.8, and the sample abnormal category label of that sample for "abnormal login" is 1, the cross-entropy loss is -log(0.8), approximately 0.223 with the natural logarithm. The cross-entropy losses of all sample features to be analyzed are then added and averaged to give the target loss function of the preset model. Minimizing the target loss function lets the preset model output predicted category labels more accurately.
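The worked example above can be checked with a few lines of Python. The sketch uses the standard form of cross-entropy, the negative sum of true probability times log predicted probability, which for one-hot labels gives the same value as the description above; the two-category layout and the second sample are assumptions.

```python
import math
from typing import Sequence


def cross_entropy(true_one_hot: Sequence[float],
                  predicted: Sequence[float]) -> float:
    """Cross-entropy of one sample: -sum(true_i * log(predicted_i))."""
    return -sum(t * math.log(p)
                for t, p in zip(true_one_hot, predicted) if t > 0)


def mean_cross_entropy(batch_true: Sequence[Sequence[float]],
                       batch_pred: Sequence[Sequence[float]]) -> float:
    """Target loss: average of the per-sample cross-entropy losses."""
    losses = [cross_entropy(t, p) for t, p in zip(batch_true, batch_pred)]
    return sum(losses) / len(losses)


# Categories assumed for illustration: (abnormal login, other).
loss = cross_entropy([1.0, 0.0], [0.8, 0.2])
print(round(loss, 4))  # 0.2231, i.e. -log(0.8)

batch_loss = mean_cross_entropy(
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.8, 0.2], [0.4, 0.6]],
)
print(round(batch_loss, 4))
```

Only the predicted probability of the true category contributes to the loss of a sample, so a confident correct prediction drives the loss toward zero.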
It should be noted that, to keep improving the performance of the preset model, the preset model can be trained iteratively and its parameters updated continuously according to the target loss function, so that the preset model is optimized in the direction that reduces the target loss function and its classification accuracy keeps improving.
Exemplarily, the preset condition is a training condition or training metric set before or during the training of the preset model, used to judge whether training has succeeded and whether specific training requirements are met. The preset condition may be a preset number of training epochs, a loss convergence threshold, a performance evaluation metric, and so on. For example, the preset condition can be reaching 500 training epochs, at which point the trained target model is obtained; alternatively, a loss convergence threshold can be set, and once the loss value falls below it, training of the preset model stops and the trained target model is obtained.
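Steps (a) to (d) and the two stopping conditions can be sketched as follows. To keep the example self-contained, a one-feature logistic regression stands in for the preset model (the source names a Transformer); the sample data, learning rate, and threshold values are hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Tuple[float, int]],
          max_epochs: int = 500,
          loss_threshold: float = 0.05,
          learning_rate: float = 1.0) -> Tuple[float, float, int, float]:
    """Gradient descent on mean cross-entropy with two preset stop conditions.

    Returns (weight, bias, epochs_run, last_measured_loss).
    """
    weight, bias = 0.0, 0.0
    loss = float("inf")
    count = len(samples)
    for epoch in range(1, max_epochs + 1):
        loss = grad_w = grad_b = 0.0
        for x, y in samples:
            # (b) predicted probability of the abnormal category
            p = min(max(sigmoid(weight * x + bias), 1e-12), 1.0 - 1e-12)
            # (c) cross-entropy between the label and the prediction
            loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
            grad_w += (p - y) * x
            grad_b += (p - y)
        loss /= count
        if loss < loss_threshold:  # preset condition: loss convergence threshold
            return weight, bias, epoch, loss
        # (d) update the parameters in the direction that reduces the loss
        weight -= learning_rate * grad_w / count
        bias -= learning_rate * grad_b / count
    return weight, bias, max_epochs, loss  # preset condition: epoch limit


# (a) hypothetical samples: (normalized feature, label), where 1 means abnormal
samples = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
weight, bias, epochs, final_loss = train(samples)
print(epochs, round(final_loss, 4))
```

Whichever condition is met first ends training, so the epoch limit acts as a safeguard when the loss never reaches the convergence threshold.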
Training the preset model in the above manner can effectively improve the classification accuracy of the resulting target model, speed up the identification of abnormal detection results, and produce more accurate and detailed detection reports.
In some embodiments, the detection log corresponding to each resource item during execution of the device detection instance can be obtained and saved, so that the detection process of the cloud service device is recorded for later analysis or review. For example, after step 105, the method may further include:
(105.a.1) obtaining the detection log corresponding to each resource item during execution of the device detection instance;
(105.a.2) classifying and saving the detection log of each resource item according to the detection results.
The detection log is the log corresponding to each resource item during execution of the device detection instance, and it records the operating status, performance, and possible problems of that resource item. Obtaining the detection log of each resource item gives a complete picture of the cloud service device's resources during operation, which helps discover and resolve potential faults or performance bottlenecks and safeguards the stability, security, and reliability of the device.
In some embodiments, so that operation and maintenance personnel can view the detection logs at any time and organize and manage them better, the detection log of each resource item can be classified and saved. Exemplarily, when performance detection is carried out on a cloud service device, the detection logs of the resource items can be saved under the corresponding categories according to the detection results, for example by creating folders or database tables named "CPU usage", "memory usage", and "network traffic" to store the related detection log data, so that problems that may exist during operation of the cloud service device can be pinpointed.
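Classified saving of detection logs can be sketched as below. The log entry fields (`resource`, `result`, `message`) are assumptions, and an in-memory dictionary stands in for the folders or database tables mentioned above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_logs(entries: List[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Group log messages by (detection result, resource item category)."""
    store: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for entry in entries:
        store[(entry["result"], entry["resource"])].append(entry["message"])
    return dict(store)


# Hypothetical detection log entries collected while running a detection instance.
logs = [
    {"resource": "CPU usage", "result": "normal", "message": "cpu=30%"},
    {"resource": "network traffic", "result": "abnormal", "message": "rx=950MB/s"},
    {"resource": "CPU usage", "result": "normal", "message": "cpu=32%"},
]
grouped = group_logs(logs)
for key, messages in grouped.items():
    print(key, messages)
```

Keying on both the detection result and the resource category lets a later query fetch, for example, only the abnormal network traffic logs.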
It can be understood that the detection system of the cloud service device may further include a database (DB) module and a web (WEB) module. Specifically, the DB module provides access to and management of the database that stores the detection logs, including a user information table, a task information table, a task execution record table, a resource information table, a detection log table, detection reports, a detection permission table, and so on. The WEB module provides user interface operations for detection report management, detection task management, detection permission management, detection template management, detection analysis management, detection log management, and so on.
Obtaining the detection log corresponding to each resource item during execution of the device detection instance allows different resource items to be analyzed and queried independently, gives a better understanding of the operating condition of the cloud service device, allows anomalies to be found and handled promptly, and supports subsequent management and optimization work. It can be understood that the present application can also set a detection timer to detect the cloud service device periodically and ensure its normal operation. Specifically, a shorter detection interval, such as once per hour, can be set for resource items that fail frequently, and a longer detection interval, such as once every two days, for resource items that rarely fail, so that the cloud service device is detected effectively while device resources are saved.
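The per-item detection intervals can be sketched as follows. The resource item names and the `last_run` bookkeeping are hypothetical; only the one-hour and two-day intervals come from the example in the text.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Shorter interval for a failure-prone item, longer for a stable one.
INTERVALS: Dict[str, timedelta] = {
    "network_traffic": timedelta(hours=1),
    "disk_health": timedelta(days=2),
}


def due_items(last_run: Dict[str, datetime], now: datetime,
              intervals: Dict[str, timedelta] = INTERVALS) -> List[str]:
    """Return the resource items whose detection interval has elapsed."""
    return sorted(
        item for item, interval in intervals.items()
        if item not in last_run or now - last_run[item] >= interval
    )


now = datetime(2024, 1, 3, 12, 0)
last_run = {
    "network_traffic": datetime(2024, 1, 3, 11, 30),  # 30 minutes ago
    "disk_health": datetime(2024, 1, 1, 11, 0),       # 2 days and 1 hour ago
}
print(due_items(last_run, now))  # ['disk_health']
```

A timer can call `due_items` periodically and run detection only for the returned items, which avoids re-checking stable resources on every tick.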
In the embodiments of the present application, a checklist corresponding to the cloud service device is obtained, and a device detection instance corresponding to the cloud service device is configured according to the multiple resource items to be detected that the checklist contains; the device detection instance is executed to collect the resource parameters corresponding to each resource item from the cloud service device, yielding device detection data; the device detection data is structured according to multiple detection item categories to obtain a structured data set, which includes the detection item categories and the feature value of each detection item category; for the feature value of each detection item category, anomaly determination is performed on the resource items in the cloud service device to obtain the detection result of each resource item; and a detection report of the cloud service device is generated based on the feature value of each detection item category and the detection result of each resource item. In this way, the device detection instance can be configured accurately for the resource items the cloud service device contains, and executing it collects the resource parameters of each resource item from the cloud service device automatically, which effectively improves the efficiency of detecting the cloud service device. Furthermore, by structuring the device detection data and performing anomaly determination on the feature value of each detection item category in the structured data, the detection result of each resource item is obtained quickly and without manual intervention, and a detection report is generated from the detection results. This improves detection efficiency while also improving the stability and accuracy of detecting the cloud service device.
Referring to FIG. 3, an embodiment of the present application further provides a detection apparatus for a cloud service device, which can implement the above detection method for a cloud service device. The detection apparatus includes:
a configuration module 31, configured to obtain a checklist corresponding to the cloud service device and to configure a device detection instance corresponding to the cloud service device according to the multiple resource items to be detected that the checklist contains;
an execution module 32, configured to execute the device detection instance to collect the resource parameters corresponding to each resource item from the cloud service device, yielding device detection data;
a processing module 33, configured to structure the device detection data according to multiple detection item categories to obtain a structured data set, the structured data set including the detection item categories and the feature value of each detection item category;
a determination module 34, configured to perform, for the feature value of each detection item category, abnormal operation determination on the resource items in the cloud service device to obtain the detection result of each resource item;
a generation module 35, configured to generate a detection report of the cloud service device based on the feature value of each detection item category and the detection result of each resource item.
The specific implementation of the detection apparatus is essentially the same as the specific embodiments of the detection method described above and is not repeated here. Provided the requirements of the embodiments of the present application are met, the detection apparatus may also include other functional modules to implement the detection method of the above embodiments.
本申请实施例还提供了一种计算机设备,计算机设备包括存储器和处理器,存储器存储有计算机程序,处理器执行计算机程序时实现上述云服务设备的检测方法。该计算机设备可以为包括平板电脑、车载电脑等任意智能终端。The embodiment of the present application also provides a computer device, the computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the above-mentioned cloud service device detection method when executing the computer program. The computer device can be any intelligent terminal including a tablet computer, a car computer, etc.
Referring to FIG. 4, which schematically shows the hardware structure of a computer device according to another embodiment, the computer device includes:
a processor 41, which may be implemented as a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and which executes the relevant programs to implement the technical solutions provided in the embodiments of the present application;
a memory 42, which may be implemented as a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 42 may store an operating system and other application programs. When the technical solutions provided in the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory 42 and is called by the processor 41 to execute the cloud service device detection method of the embodiments of the present application;
an input/output interface 43, used for information input and output;
a communication interface 44, used for communication between this device and other devices, either in a wired manner (for example, USB or network cable) or in a wireless manner (for example, mobile network, Wi-Fi, or Bluetooth);
a bus 45, which transfers information between the components of the device (for example, the processor 41, the memory 42, the input/output interface 43, and the communication interface 44);
wherein the processor 41, the memory 42, the input/output interface 43, and the communication interface 44 are communicatively connected to one another inside the device via the bus 45.
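The relationship between the processor, the memory, and the bus can be illustrated with a minimal Python sketch. It is not taken from the application: the class names, the `fetch` and `execute` methods, and the placeholder `detect_cloud_service_device` function are all hypothetical, and the input/output interface 43 and communication interface 44 are omitted for brevity. The sketch only shows program code being stored in memory and then called by the processor over the bus.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Memory:
    """Stands in for memory 42: holds program code, keyed by name."""
    programs: Dict[str, Callable[[], str]] = field(default_factory=dict)


@dataclass
class Bus:
    """Stands in for bus 45: the path between processor and memory."""
    memory: Memory

    def fetch(self, name: str) -> Callable[[], str]:
        # Look up the stored program on behalf of the processor.
        return self.memory.programs[name]


@dataclass
class Processor:
    """Stands in for processor 41: fetches a stored program and runs it."""
    bus: Bus

    def execute(self, name: str) -> str:
        return self.bus.fetch(name)()


def detect_cloud_service_device() -> str:
    # Hypothetical placeholder; the actual detection steps are defined
    # in the method embodiments, not here.
    return "detection complete"


memory = Memory()
memory.programs["detect"] = detect_cloud_service_device
processor = Processor(Bus(memory))
result = processor.execute("detect")
print(result)
```

Routing every lookup through `Bus` mirrors the statement that the components communicate with one another only via the bus; a real device would of course implement this in hardware.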
An embodiment of the present application also provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the cloud service device detection method described above.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs. The memory may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory may optionally include memory located remotely from the processor, and such remote memory may be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The embodiments described herein are intended to illustrate the technical solutions of the embodiments of the present application more clearly and do not limit those technical solutions. Those skilled in the art will appreciate that, as technology evolves and new application scenarios emerge, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems.
Those skilled in the art will appreciate that the technical solutions shown in the figures do not limit the embodiments of the present application, which may include more or fewer steps than shown, combine certain steps, or use different steps.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules or units in the systems and devices, may be implemented as software, firmware, hardware, or a suitable combination thereof.
The terms "first", "second", "third", "fourth", and so on (if any) in the specification of the present application and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "including" and "having", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that, in the present application, "at least one (item)" and "several" mean one or more, and "a plurality of" means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean that only A exists, that only B exists, or that both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of those items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or multiple.
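The enumeration of "at least one of a, b, or c" given above can be reproduced with a short Python sketch. This is only an illustration of the definition, not part of the application; the helper name `at_least_one_of` is hypothetical.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of items, smallest first."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# "At least one of a, b, or c": seven possible combinations.
cases = at_least_one_of(["a", "b", "c"])
labels = [" and ".join(combo) for combo in cases]
print(labels)
# ['a', 'b', 'c', 'a and b', 'a and c', 'b and c', 'a and b and c']

# "A and/or B": only A, only B, or both A and B.
print(at_least_one_of(["A", "B"]))
# [('A',), ('B',), ('A', 'B')]
```

For n items the definition yields 2^n - 1 combinations, which is why three items give seven cases and "A and/or B" gives three.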
In the several embodiments provided in the present application, it should be understood that the disclosed systems and methods may be implemented in other ways. The system embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a plurality of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods of the embodiments of the present application. The storage medium includes any medium that can store a program, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and this description does not limit the scope of rights of the embodiments of the present application. Any modification, equivalent substitution, or improvement made by those skilled in the art without departing from the scope and essence of the embodiments of the present application shall fall within the scope of rights of the embodiments of the present application.
Claims (13)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410207997.1A CN118132303A (en) | 2024-02-22 | 2024-02-22 | Cloud service equipment detection method, device, equipment and readable storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410207997.1A CN118132303A (en) | 2024-02-22 | 2024-02-22 | Cloud service equipment detection method, device, equipment and readable storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118132303A (en) | 2024-06-04 |
Family
ID=91236902
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410207997.1A Pending CN118132303A (en) | 2024-02-22 | 2024-02-22 | Cloud service equipment detection method, device, equipment and readable storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118132303A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119376992A (en) * | 2024-12-23 | 2025-01-28 | 神州灵云(北京)科技有限公司 | A method, system, device and medium for self-detecting abnormality of a component |
- 2024-02-22: CN application CN202410207997.1A, published as CN118132303A, status: active, Pending
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN111343173B (en) | Data access abnormity monitoring method and device | |
| CN111835585B (en) | Inspection method and device for Internet of things equipment, computer equipment and storage medium | |
| CN115033876B (en) | Log processing methods, log processing devices, computer equipment, and storage media | |
| US9411673B2 (en) | Management server, management system, and management method | |
| CN105184886A (en) | Cloud data center intelligence inspection system and cloud data center intelligence inspection method | |
| CN113392426A (en) | Method and system for enhancing data privacy of an industrial or electrical power system | |
| US20170109639A1 (en) | General Model for Linking Between Nonconsecutively Performed Steps in Business Processes | |
| CN114124509A (en) | Spark-based network abnormal flow detection method and system | |
| CN114528457A (en) | Web fingerprint detection method and related equipment | |
| CN117240527A (en) | Network security risk prevention system and method | |
| CN111475380A (en) | Log analysis method and device | |
| CN119477021A (en) | A method and system for evaluating enterprise supply chain based on big data | |
| CN107430590B (en) | System and method for data comparison | |
| CN118132303A (en) | Cloud service equipment detection method, device, equipment and readable storage medium | |
| CN104461847B (en) | Data processor detection method and device | |
| US20170109637A1 (en) | Crowd-Based Model for Identifying Nonconsecutive Executions of a Business Process | |
| CN120011179B (en) | Inspection method for multi-system operation and maintenance using artificial intelligence-based RPA inspection robots | |
| CN120821509A (en) | Component configuration method, device, computer equipment and storage medium | |
| CN117056209B (en) | Software defect prediction model, interpretation method and quantitative evaluation method | |
| CN119127691A (en) | A regression testing method, device, equipment and medium based on SDK | |
| CN119046118A (en) | Big data-based computer network intelligent analysis platform | |
| Nurcahyo et al. | Classification of Simulated Fake Bandwidth Data Using LSTM | |
| Maia et al. | One class density estimation approach for fault detection and rootcause analysis in computer networks | |
| KR20220032706A (en) | Apparatus and method for visualizing causality of events | |
| CN119172257A (en) | A local area network website operation system and operation method based on all elements |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |