CN106254166A

CN106254166A - Cloud platform resource configuration method and system based on a disaster recovery center

Info

Publication number: CN106254166A
Application number: CN201610874942.1A
Authority: CN
Inventors: 李兴锋; 郝建明; 张炼; 宋泽锋; 伍福生; 简超; 韩笑; 潘星明
Original assignee: China Unionpay Co Ltd
Current assignee: China Unionpay Co Ltd
Priority date: 2016-09-30
Filing date: 2016-09-30
Publication date: 2016-12-21
Anticipated expiration: 2036-09-30
Also published as: CN106254166B

Abstract

The invention provides a cloud platform resource configuration method and system based on a disaster recovery center, and relates to the technical field of disaster recovery. The method includes: collecting load data of the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform; collecting business operation data from the production environment; classifying the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system under the different disaster recovery modes; and configuring resources on the cloud platform according to a preset policy configuration table and the load condition data. Virtual machines and physical machines are thereby migrated under different strategies corresponding to different disaster recovery modes, which saves resources and energy.

Description

A cloud platform resource configuration method and system based on a disaster recovery center

Technical field

The present invention relates to the technical field of disaster recovery, in particular to resource scheduling for disaster recovery centers, and specifically to a cloud platform resource configuration method and system based on a disaster recovery center.

Background

This section is intended to provide background or context for the embodiments of the invention recited in the claims. The description here is not admitted to be prior art merely because it is included in this section.

As business has grown, disaster recovery centers have deployed more and more business systems. These include core production systems with active-active and primary-secondary architectures, as well as a large number of business systems in hot-standby, warm-standby, and cold-standby disaster recovery modes, all used to keep the production systems running safely and stably. Most of these business systems will gradually be migrated to the cloud platform. At present, each physical machine in a cloud platform physical machine cluster runs a relatively large number of virtual machines on average. During business peaks, one physical machine can easily become overloaded while other physical machines in the same cluster remain lightly loaded. This imbalance in physical machine resource utilization reduces business operating efficiency and wastes resources.

At present, physical machines are generally monitored with Patrol. However, Patrol monitoring and the cloud platform are mutually independent platforms with no common interface, so the load of the physical machines cannot be combined with advanced cloud platform features such as live migration.

Therefore, how to develop a new solution that configures cloud platform resources for different disaster recovery modes is a technical problem urgently awaiting a solution in this field.

Summary of the invention

To overcome the above technical problems in the prior art, the present invention provides a cloud platform resource configuration method and system based on a disaster recovery center. Load data is obtained from the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform, business operation data is obtained from the production environment, both are classified according to preset disaster recovery mode levels, and resources are configured using a preset policy configuration table. Virtual machines and physical machines are thus migrated under different strategies corresponding to different disaster recovery modes, which saves resources and energy.

To achieve the above object, the present invention provides a cloud platform resource configuration method based on a disaster recovery center. The method includes: collecting load data of the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform; collecting business operation data from the production environment; classifying the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system under the different disaster recovery modes; and configuring resources on the cloud platform according to a preset policy configuration table and the load condition data.

In a preferred embodiment of the present invention, scripts are used to collect the load data of the servers, and the servers include virtual machines and physical machines.

In a preferred embodiment of the present invention, the load data includes CPU usage, memory usage, and storage capacity usage, and the business operation data includes the daily average transaction volume and the daily peak transaction volume.

In a preferred embodiment of the present invention, the disaster recovery modes include an active-active disaster recovery mode, a primary-secondary disaster recovery mode, a warm-standby disaster recovery mode, and a cold-standby disaster recovery mode.

In a preferred embodiment of the present invention, when the business system is in the active-active or primary-secondary disaster recovery mode, configuring resources on the cloud platform according to the policy configuration table and the load condition data includes: collecting prediction information from the production environment; determining the optimization strategy of the business system according to the prediction information and the policy configuration table; and configuring resources for the business system according to the optimization strategy and the load condition data.

In a preferred embodiment of the present invention, when the business system is in the warm-standby or cold-standby disaster recovery mode, configuring resources on the cloud platform according to the policy configuration table and the load condition data includes: determining the optimization strategy of the business system according to the policy configuration table; and configuring resources for the business system according to the optimization strategy and the load condition data.

In a preferred embodiment of the present invention, configuring resources for the business system according to the optimization strategy and the load condition data includes: obtaining setting information from the optimization strategy; when the resource configuration of the business system does not satisfy the setting information, reading the virtual machines corresponding to a physical machine from a preset database according to the name of the physical machine corresponding to the business system; obtaining the CPU usage and memory usage of the virtual machines from the load condition data; determining a coefficient for each virtual machine from its CPU usage and memory usage; selecting the virtual machines to be migrated according to the coefficients and the optimization strategy; selecting the physical machines to be migrated according to the virtual machines to be migrated; and migrating the physical machines to be migrated and the virtual machines to be migrated, so that the business system after migration satisfies the setting information.

One object of the present invention is to provide a cloud platform resource configuration system based on a disaster recovery center. The system includes: a load data collection device for collecting load data of the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform; an operation data collection device for collecting business operation data from the production environment; a data classification device for classifying the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system under the different disaster recovery modes; and a resource configuration device for configuring resources on the cloud platform according to a preset policy configuration table and the load condition data.

In a preferred embodiment of the present invention, when the business system is in the warm-standby or cold-standby disaster recovery mode, the resource configuration device includes: a first optimization strategy determination module for determining the optimization strategy of the business system according to the policy configuration table; and a resource configuration module for configuring resources for the business system according to the optimization strategy and the load condition data.

In a preferred embodiment of the present invention, when the business system is in the active-active or primary-secondary disaster recovery mode, the resource configuration device further includes: a prediction information collection module for collecting prediction information from the production environment; and

a second optimization strategy determination module for determining the optimization strategy of the business system according to the prediction information and the policy configuration table.

In a preferred embodiment of the present invention, the resource configuration module includes: an acquisition unit for obtaining setting information from the optimization strategy;

a reading unit for reading, when the resource configuration of the business system does not satisfy the setting information, the virtual machines corresponding to a physical machine from a preset database according to the name of the physical machine corresponding to the business system; a usage acquisition unit for obtaining the CPU usage and memory usage of the virtual machines from the load condition data; a coefficient determination unit for determining a coefficient for each virtual machine from its CPU usage and memory usage; a first determination unit for selecting the virtual machines to be migrated according to the coefficients and the optimization strategy; a second determination unit for selecting the physical machines to be migrated according to the virtual machines to be migrated; and a migration unit for migrating the physical machines to be migrated and the virtual machines to be migrated, so that the business system after migration satisfies the setting information.

The beneficial effect of the present invention is that it provides a cloud platform resource configuration method and system based on a disaster recovery center. Load data is obtained from the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform, business operation data is obtained from the production environment, both are classified according to preset disaster recovery mode levels, and resources are configured using a preset policy configuration table. Virtual machines and physical machines are thus migrated under different strategies corresponding to different disaster recovery modes, which saves resources and energy.

To make the above and other objects, features, and advantages of the present invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Description of drawings

To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a cloud platform resource configuration method based on a disaster recovery center provided by an embodiment of the present invention;

Fig. 2 is a flowchart of Embodiment 1 of step S104 in Fig. 1;

Fig. 3 is a flowchart of Embodiment 2 of step S104 in Fig. 1;

Fig. 4 is a flowchart of step S202 in Fig. 2 and step S303 in Fig. 3;

Fig. 5 is a structural block diagram of a cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention;

Fig. 6 is a structural block diagram of Embodiment 1 of the resource configuration device in a cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention;

Fig. 7 is a structural block diagram of Embodiment 2 of the resource configuration device in a cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention;

Fig. 8 is a structural block diagram of the resource configuration module in a cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

Those skilled in the art will appreciate that embodiments of the present invention can be implemented as a system, apparatus, device, method, or computer program product. The present disclosure may therefore take the form of complete hardware, complete software (including firmware, resident software, microcode, and so on), or a combination of hardware and software.

The principle and spirit of the present invention are explained in detail below with reference to several representative embodiments.

The cloud platform management tools currently in use can monitor the distribution of virtual machines, but they do not yet monitor physical machine resource usage. Physical machines are monitored with the commercial product Patrol. However, Patrol monitoring and the cloud resource management platform are mutually independent platforms with no common interface, so the load of the physical machines cannot be combined with advanced cloud platform features such as live migration.

The virtual machine migration functions of current cloud platform management tools target high-availability migration after a physical machine failure, which is the most basic high-availability feature of a general-purpose cloud computing platform. To achieve automated resource optimization suited to the business characteristics of a disaster recovery center, functions such as data collection, data analysis, process control, optimization, and decision-making must be added to the cloud platform.

To address the above technical problems, the present invention proposes a cloud platform resource configuration method and system based on a disaster recovery center.

Fig. 1 is a flowchart of the cloud platform resource configuration method based on a disaster recovery center proposed by the present invention. Referring to Fig. 1, the method includes:

S101: Collect load data of the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform.

In a specific embodiment, the load data of the servers may be collected by scripts or similar means. The servers include virtual machines and physical machines, and the load data includes CPU usage, memory usage, and storage capacity usage.

S102: Collect business operation data from the production environment. In the present invention, the production environment refers to the IT system environment that processes actual financial transaction information in a typical financial system. In a specific embodiment, the business operation data includes the daily average transaction volume and the daily peak transaction volume. In other words, step S102 connects the transaction monitoring interfaces of the disaster recovery center and the production center.

S103: Classify the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system under the different disaster recovery modes. In a specific embodiment, the preset disaster recovery mode levels include the active-active, primary-secondary, warm-standby, and cold-standby disaster recovery modes. The disaster recovery mode levels are configurable. Classifying the load data and the business operation data produces load condition data organized per business system, on which the subsequent steps base their decisions.

S104: Configure resources on the cloud platform according to a preset policy configuration table and the load condition data.
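Steps S101 to S103 can be sketched as a small data model plus a grouping function. This is only an illustrative sketch: the names `DRMode`, `LoadSample`, `BusinessStats`, and `classify`, and the shape of the resulting dictionary, are assumptions made for the example and do not appear in the patent.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class DRMode(Enum):
    """The four disaster recovery modes named in the text."""
    ACTIVE_ACTIVE = "active-active"
    PRIMARY_SECONDARY = "primary-secondary"
    WARM_STANDBY = "warm-standby"
    COLD_STANDBY = "cold-standby"


@dataclass
class LoadSample:
    """One load reading for a server (S101). Field names are hypothetical."""
    system: str      # business system the server belongs to
    server: str      # virtual machine or physical machine name
    cpu: float       # CPU usage, 0..1
    mem: float       # memory usage, 0..1
    storage: float   # storage capacity usage, 0..1


@dataclass
class BusinessStats:
    """Business operation data for one business system (S102)."""
    system: str
    daily_avg_tps: float
    daily_peak_tps: float


def classify(samples, stats, mode_of):
    """Group load and business data by DR mode, then by business system (S103).

    mode_of maps a business system name to its configured DRMode.
    """
    stats_by_system = {s.system: s for s in stats}
    result = defaultdict(dict)
    for sample in samples:
        mode = mode_of[sample.system]
        entry = result[mode].setdefault(
            sample.system,
            {"servers": [], "stats": stats_by_system.get(sample.system)},
        )
        entry["servers"].append(sample)
    return dict(result)
```

Keeping the mode assignment in a separate mapping (`mode_of`) reflects the statement that the disaster recovery mode levels are configurable; a business system without business operation data simply gets `None` for its statistics.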

Fig. 2 is a flowchart of Embodiment 1 of step S104. Referring to Fig. 2, in Embodiment 1, when the business system is in the warm-standby or cold-standby disaster recovery mode, step S104 includes:

S201: Determine the optimization strategy of the business system according to the policy configuration table;

S202: Configure resources for the business system according to the optimization strategy and the load condition data.

In Embodiment 1, the policy configuration table is as shown in Table 1. It gives the optimization strategy for each of three scenarios (workdays, weekends, and holidays) when the business system is in the warm-standby or cold-standby disaster recovery mode. Taking the warm-standby disaster recovery mode as an example, its workday optimization strategy is "mean of historical peaks, using the previous week's peaks as reference"; that is, in this case the mean of the previous week's historical peaks is used as the reference.

Table 1
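A policy configuration table of this kind can be modelled as a lookup keyed by disaster recovery mode and scenario. The sketch below is an assumption about representation, not the patent's implementation: the key strings, the policy label, and the function `reference_tps` are hypothetical, and only the one cell quoted in the text (warm standby on workdays) is filled in, because the other cells of Table 1 are not reproduced here.

```python
from statistics import mean

# Only the entry quoted in the text is included. The other cells of Table 1
# are not available here and are deliberately left out rather than guessed.
POLICY_TABLE = {
    ("warm-standby", "workday"): "mean_of_previous_week_peaks",
}


def reference_tps(mode, scenario, previous_week_peaks):
    """Return the reference value for a (mode, scenario) pair.

    previous_week_peaks: daily peak transaction volumes of the previous week.
    Raises KeyError when the table has no entry for the pair.
    """
    policy = POLICY_TABLE[(mode, scenario)]
    if policy == "mean_of_previous_week_peaks":
        return mean(previous_week_peaks)
    raise ValueError(f"unknown policy: {policy}")
```

For example, `reference_tps("warm-standby", "workday", [100, 200, 300])` averages the three peaks.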

Fig. 3 is a flowchart of Embodiment 2 of step S104. Referring to Fig. 3, in Embodiment 2, when the business system is in the active-active or primary-secondary disaster recovery mode, step S104 includes:

S301: Collect prediction information from the production environment. The prediction information comes from the prediction platform of the production environment. Typically the input is a specified future time period and the output is the predicted peak business volume for that period. Obtaining prediction information in these two modes ensures that capacity is sufficient both for the historical peak business volume and for the peak business volume predicted by the prediction platform.

S302: Determine the optimization strategy of the business system according to the policy configuration table;

S303: Configure resources for the business system according to the optimization strategy and the load condition data.

In Embodiment 2, the policy configuration table is as shown in Table 2. It gives the optimization strategy for each of three scenarios (workdays, weekends, and holidays) when the business system is in the active-active or primary-secondary disaster recovery mode. Taking the active-active disaster recovery mode as an example, its workday optimization strategy is "sum of the historical peaks of the two centers, using the previous month's peaks as reference"; that is, in this case the sum of the two centers' historical peaks over the previous month is used as the reference. The two centers here are the Beijing center and the Shanghai center.

Table 2
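The active-active workday strategy and the role of the prediction information can be sketched as follows. The function names are hypothetical, and taking the larger of the historical reference and the predicted peak is one possible reading of the statement that capacity must cover both; the patent does not spell out how the two values are combined.

```python
def dual_center_reference(center_peaks_tps):
    """Sum of each center's previous-month historical peak (active-active, workday)."""
    return sum(center_peaks_tps.values())


def required_capacity(reference_tps, predicted_peak_tps):
    """Capacity that covers both the historical reference and the predicted peak.

    Assumption: covering both values means provisioning for the larger one.
    """
    return max(reference_tps, predicted_peak_tps)


# Illustrative peaks for the two centers, in transactions per second.
peaks = {"Shanghai": 1500, "Beijing": 1000}
reference = dual_center_reference(peaks)
```

With these illustrative peaks the reference is 2500 tps; a predicted peak above that value would raise the required capacity, while a lower prediction would leave it at 2500 tps.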

Fig. 4 is a flowchart of steps S202 and S303. Referring to Fig. 4, configuring resources for the business system according to the optimization strategy and the load condition data includes:

S401: Obtain setting information from the optimization strategy. Taking the active-active disaster recovery mode as an example, step S302 determines that its workday optimization strategy is "sum of the historical peaks of the two centers, using the previous month's peaks as reference"; that is, the sum of the two centers' historical peaks over the previous month is used as the reference. For example, if the historical peak of the Shanghai center over the previous month is 1500 tps (1500 transactions per second) and that of the Beijing center is 1000 tps, the setting information determined by the workday optimization strategy in the active-active disaster recovery mode is 2500 tps.

S402: When the resource configuration of the business system does not satisfy the setting information, read the virtual machines corresponding to a physical machine from a preset database according to the name of the physical machine corresponding to the business system. If the setting information is already satisfied, no resource adjustment is needed. The resource configuration referred to here is the computing capacity of the virtual machines' CPU and memory.

S403: Obtain the CPU usage and memory usage of the virtual machines from the load condition data;

S404: Determine a coefficient for each virtual machine from its CPU usage and memory usage. In a specific embodiment, the coefficient is calculated by dividing the virtual machine's CPU usage by its memory usage. The present invention uses this coefficient because high CPU usage directly affects the load of the physical machine, while memory size affects migration time. A large coefficient indicates high CPU usage or low memory usage, so migrating the virtual machine has a marked effect on the load of the physical machine.

S405: Select the virtual machines to be migrated according to the coefficients and the optimization strategy;

S406: Select the physical machines to be migrated according to the virtual machines to be migrated.

S407: Migrate the physical machines to be migrated and the virtual machines to be migrated, so that the resource configuration of the business system after migration satisfies the setting information.
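The coefficient of step S404 is simply CPU usage divided by memory usage. The sketch below computes it and ranks virtual machines by it; ranking in descending order is an assumption for illustration, since the text only says that selection depends on the coefficient and the optimization strategy. The function name and the sample figures are hypothetical.

```python
def migration_coefficient(cpu_usage, mem_usage):
    """Coefficient from S404: CPU usage divided by memory usage.

    Both arguments use the same unit (here, percent). A larger value means
    more load is moved per unit of memory that has to be copied.
    """
    if mem_usage <= 0:
        raise ValueError("memory usage must be positive")
    return cpu_usage / mem_usage


# Hypothetical readings: name -> (CPU usage %, memory usage %).
vm_usage = {"vm-a": (80, 20), "vm-b": (30, 60), "vm-c": (50, 50)}

# Assumed preference: migrate the VM with the largest coefficient first.
ranked = sorted(
    vm_usage,
    key=lambda name: migration_coefficient(*vm_usage[name]),
    reverse=True,
)
```

Here `vm-a` has the largest coefficient (high CPU usage, little memory to copy), so under this assumed ordering it would be considered first.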

In a specific embodiment, it is also possible to first select the virtual machine to be migrated, then select a physical machine suitable as the migration target, and then calculate whether the resource configuration after migration would satisfy the setting information. If it would not, the selection is repeated; if it would, the migration is performed.
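The select, check, and reselect variant described above can be sketched as a search over candidate pairs. Everything here is hypothetical: `plan_migration` and the `meets_setting` predicate are names invented for the example, and how the setting information is actually evaluated against a projected configuration is not specified in the text.

```python
def plan_migration(candidate_vms, candidate_hosts, meets_setting):
    """Return the first (vm, host) pair that passes the check, or None.

    candidate_vms:   VM names in order of preference (e.g. by coefficient).
    candidate_hosts: physical machine names to try as migration targets.
    meets_setting:   callable (vm, host) -> bool that projects the state
                     after migration and compares it with the setting
                     information. Its logic is an assumption of this sketch.
    """
    for vm in candidate_vms:
        for host in candidate_hosts:
            if meets_setting(vm, host):
                return vm, host  # condition met: this migration can be performed
    return None  # no acceptable pair: nothing is migrated


# Illustrative check: projected host CPU load must stay within 80 percent.
host_load = {"pm-1": 70, "pm-2": 40}
vm_load = {"vm-a": 35, "vm-b": 20}


def fits(vm, host):
    return host_load[host] + vm_load[vm] <= 80


plan = plan_migration(["vm-a", "vm-b"], ["pm-1", "pm-2"], fits)
```

In this example `vm-a` does not fit on `pm-1` (projected load 105), so the selection is repeated and `pm-2` (projected load 75) is chosen.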

The above is the cloud platform resource configuration method based on a disaster recovery center provided by the present invention. It obtains load data from the servers corresponding to each business system deployed by the disaster recovery center on the cloud platform, obtains business operation data from the production environment, classifies both according to preset disaster recovery mode levels, and configures resources using a preset policy configuration table. Different strategies are thus triggered automatically for different disaster recovery modes to migrate virtual machines. For example, during business troughs the virtual machines can be migrated onto a small number of physical machines and the remaining physical machines shut down, which saves resources and energy.
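The trough-time consolidation mentioned above can be illustrated with a simple packing heuristic. The patent does not say how virtual machines are packed onto physical machines; the sketch below swaps in first-fit decreasing, a standard bin-packing heuristic, purely as an example, and assumes a single resource dimension (CPU demand) and identical hosts.

```python
def consolidate(vm_demand, host_capacity):
    """Pack VMs onto as few identical hosts as first-fit decreasing finds.

    vm_demand:     mapping of VM name -> CPU demand (same unit as capacity).
    host_capacity: CPU capacity of each physical machine.
    Returns a list of hosts, each a list of the VM names placed on it.
    Hosts beyond the returned list are candidates for shutdown.
    """
    hosts = []  # each item: [remaining_capacity, [vm names]]
    by_demand = sorted(vm_demand.items(), key=lambda kv: kv[1], reverse=True)
    for name, demand in by_demand:
        if demand > host_capacity:
            raise ValueError(f"{name} does not fit on any host")
        for host in hosts:
            if host[0] >= demand:
                host[0] -= demand
                host[1].append(name)
                break
        else:
            # No existing host has room: bring one more host into use.
            hosts.append([host_capacity - demand, [name]])
    return [names for _, names in hosts]


placement = consolidate({"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}, host_capacity=10)
```

With these illustrative demands the five virtual machines fit on two hosts, so any additional physical machines in the cluster could be powered down during the trough.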

It should be noted that although the operations of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the illustrated operations must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one, and/or one step may be split into several.

Having described the method of the exemplary embodiment of the present invention, the cloud platform resource configuration system of the exemplary embodiment is described next with reference to Fig. 5. For the implementation of the system, refer to the implementation of the method above; repeated details are omitted. The terms "module" and "unit" used below may refer to software and/or hardware that implements a predetermined function. Although the modules described in the following embodiments are preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

图5为本发明实施例提供的一种基于灾备中心的云平台资源配置系统的结构框图，请参阅图5，所述系统包括：Fig. 5 is a structural block diagram of a cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention, please refer to Fig. 5, the system includes:

The load data collection device 101 is configured to collect the load data of the servers corresponding to each business system deployed in the disaster recovery center on the cloud platform.

The operation data collection device 102 is configured to collect business operation data from the production environment. In a specific embodiment, the business operation data includes the daily average transaction volume and the daily peak transaction volume. In other words, the operation data collection device 102 connects the disaster recovery center to the transaction monitoring interface of the production center.

The data classification device 103 is configured to classify the load data and the business operation data according to preset disaster recovery mode levels, so as to obtain the load condition data of each business system under the different disaster recovery modes. In a specific embodiment, the preset disaster recovery mode levels include an active-active mode, a primary-secondary mode, a warm-standby mode, and a cold-standby mode. The disaster recovery mode levels are configurable. Classifying the load data and business operation data yields load condition data organized by business system, which the subsequent steps use for decision-making.

The resource configuration device 104 is configured to configure resources on the cloud platform according to the preset policy configuration table and the load condition data.
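As an illustration of the classification performed by the data classification device 103, the following Python sketch groups per-system load data and business operation data under each system's disaster recovery mode. It is only a sketch under stated assumptions: the record types `LoadSample` and `BusinessSample`, the mode labels in `MODES`, and the `mode_of` mapping are hypothetical names chosen for the example and do not appear in the source, which names the four modes but not how they are encoded.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels for the four modes named in the description.
MODES = ("active-active", "primary-secondary", "warm-standby", "cold-standby")


@dataclass
class LoadSample:
    system: str        # business system name
    cpu_usage: float   # fraction in use, 0..1
    mem_usage: float   # fraction in use, 0..1


@dataclass
class BusinessSample:
    system: str
    daily_avg_tps: float
    daily_peak_tps: float


def classify(load, business, mode_of):
    """Group load and business data by disaster recovery mode, then by system."""
    business_by_system = {b.system: b for b in business}
    result = defaultdict(dict)
    for sample in load:
        mode = mode_of[sample.system]
        if mode not in MODES:
            raise ValueError(f"unknown disaster recovery mode: {mode}")
        result[mode][sample.system] = {
            "load": sample,
            # None when no business data was collected for this system.
            "business": business_by_system.get(sample.system),
        }
    return dict(result)
```

The result is keyed first by mode and then by business system, which matches the description's statement that the load condition data is organized by business system for later decisions.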

Fig. 6 is a structural block diagram of a first implementation of the resource configuration device 104 in the cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention. Referring to Fig. 6, in the first implementation, when the business system is in warm-standby or cold-standby disaster recovery mode, the resource configuration device 104 includes:

The first optimization strategy determination module 201 is configured to determine the optimization strategy of the business system according to the policy configuration table;

The resource configuration module 202 is configured to configure resources for the business system according to the optimization strategy and the load condition data.

Fig. 7 is a structural block diagram of a second implementation of the resource configuration device in the cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention. Referring to Fig. 7, in the second implementation, when the business system is in active-active or primary-secondary disaster recovery mode, the resource configuration device 104 further includes:

The prediction information collection module 203 is configured to collect prediction information from the production environment.

Fig. 8 is a structural block diagram of the resource configuration module 202 in the cloud platform resource configuration system based on a disaster recovery center provided by an embodiment of the present invention. Referring to Fig. 8, the resource configuration module 202 includes:

The obtaining unit 301 is configured to obtain setting information from the optimization strategy. Taking the active-active disaster recovery mode as an example, step S303 determines that the optimization strategy for working days is "the sum of the historical peaks of the two centers, with the previous month's peak as the reference"; that is, in this case the sum of the two centers' historical peaks for the previous month is used as the reference. For example, if the previous month's historical peak was 1500 tps at the Shanghai center and 1000 tps at the Beijing center, the setting information determined by the working-day optimization strategy in active-active mode is 2500 tps.

The reading unit 302 is configured to, when the resource configuration of the business system does not satisfy the setting information, read the virtual machines corresponding to the physical machine from a preset database according to the name of the physical machine corresponding to the business system. If the setting information is already satisfied, no resource adjustment is needed.

The usage obtaining unit 303 is configured to obtain the CPU usage and the memory usage of the virtual machine from the load condition data;

The coefficient determination unit 304 is configured to determine the coefficient of the virtual machine from the CPU usage and the memory usage. In a specific embodiment, the coefficient is calculated by dividing the virtual machine's CPU usage by its memory usage. The present invention uses this coefficient because high CPU usage directly affects the load of the physical machine, while memory size affects migration time. A large coefficient indicates high CPU usage or low memory usage, so migrating such a virtual machine has a pronounced effect on physical machine load.

The first determining unit 305 is configured to select the virtual machine to be migrated according to the coefficient of the virtual machine and the optimization strategy;

The second determining unit 306 is configured to select the physical machine to be migrated according to the virtual machine to be migrated.

The migration unit 307 is configured to migrate the physical machine to be migrated and the virtual machine to be migrated, so that the resource configuration of the business system after migration satisfies the setting information.
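The units 301 to 306 can be read as a small decision pipeline, sketched below in Python. The setting information follows the active-active working-day example (sum of the two centers' previous-month peaks), and the coefficient is CPU usage divided by memory usage as stated for unit 304. Everything else is an assumption made for illustration: the names `VM`, `setting_info`, `coefficient` and `plan_migration`, the `capacity_tps` parameter, and in particular the selection rule (largest coefficient first, least-loaded physical machine as the destination). The description says only that selection depends on the coefficient and the optimization strategy.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    cpu_usage: float   # fraction of allocated CPU in use, 0..1
    mem_usage: float   # fraction of allocated memory in use, 0..1


def setting_info(previous_month_peaks_tps):
    """Active-active working-day policy from the text: sum of the centers' peaks."""
    return sum(previous_month_peaks_tps)


def coefficient(vm):
    """CPU usage divided by memory usage (coefficient determination unit 304)."""
    if vm.mem_usage <= 0:
        raise ValueError(f"memory usage must be positive for {vm.name}")
    return vm.cpu_usage / vm.mem_usage


def plan_migration(vms, host_loads, capacity_tps, target_tps):
    """Return (vm_name, host_name), or None when the target is already met.

    Assumed rule: take the VM with the largest coefficient and the
    physical machine with the lowest load.
    """
    if capacity_tps >= target_tps:
        return None  # setting information satisfied, nothing to adjust
    vm = max(vms, key=coefficient)
    host = min(host_loads, key=host_loads.get)
    return vm.name, host
```

With the figures from the description, `setting_info([1500, 1000])` gives 2500 tps, and a business system whose current capacity is below that target would receive a migration plan.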

In addition, although several units and modules of the cloud platform resource configuration system based on a disaster recovery center are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more units described above may be embodied in a single unit. Likewise, the features and functions of one unit described above may be further divided among multiple units.

The following specific embodiment takes the cloud platform resource configuration of the disaster recovery center for a warm-standby system A on October 1, 2016 as an example, and explains in detail how the method and system of the present invention achieve load balancing of the cloud platform under different disaster recovery modes.

1. The load data collection device shows that system A in the production environment uses one virtual machine, VM1, configured as 8C16G, with 20% CPU usage and 20% memory usage. System A in the disaster recovery center uses one VM1, configured as 2C4G.

2. The operation data collection device shows that the current business volume of system A in the production environment is 50 tps, the average of the historical peaks for this date is 80 tps, and the peak on October 1, 2015 was 100 tps.

3. The policy configuration table is queried, which shows that system A is in warm-standby disaster recovery mode, and the policy is initialized.

4. The resource configuration device takes the data from steps 1 and 2 and, together with the policy configuration table, makes the target decision. In this example, the goal is to automatically adjust the disaster recovery virtual machine, currently 2C4G with a current throughput of 50 tps, to a configuration that can handle 100 tps. The target configuration is calculated as ((8C * 20% / 50 tps) * 100 tps) / 70% ≈ 4.57C and ((16G * 20% / 50 tps) * 100 tps) / 70% ≈ 9.14G (note: 70% is the resource tolerance of the disaster recovery environment and is a fixed parameter). The decision target is therefore to adjust VM1 in the disaster recovery center from 2C4G to 5C10G.

5. The resource configuration device sends the decision target from step 4 and the related decision information from steps 1 and 2 (the server load information and business load information of the production environment) to the disaster recovery system.

6. Based on the decision target, the disaster recovery system determines that a resource adjustment is required.

7. The ratio of CPU usage to memory in use is calculated for every virtual machine on the physical machine, and the ratios are sorted in descending order; the load of every physical machine is also calculated and sorted in ascending order.

8. A suitable target physical machine is selected, and VM1 is migrated and its resources are adjusted.
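The sizing arithmetic in step 4 can be checked with a few lines of Python. This is a sketch of the formula as written in the example; the function name `target_size` and its parameter names are hypothetical, and rounding up to whole cores and gigabytes is an assumption that happens to reproduce the 5C10G result.

```python
import math


def target_size(prod_alloc, prod_usage, current_tps, target_tps, tolerance=0.70):
    """Resource needed at target_tps, scaled from what production uses now.

    prod_alloc * prod_usage is the resource actually consumed in production;
    dividing by current_tps gives the cost per tps, and dividing the scaled
    result by the tolerance leaves headroom in the disaster recovery environment.
    """
    per_tps = prod_alloc * prod_usage / current_tps
    return per_tps * target_tps / tolerance


cpu = target_size(8, 0.20, 50, 100)    # about 4.57 cores
mem = target_size(16, 0.20, 50, 100)   # about 9.14 GB

# Assumed rounding rule: round up to whole units.
print(math.ceil(cpu), math.ceil(mem))  # prints "5 10"
```

Rounding up rather than to the nearest integer keeps the configured capacity at or above the computed requirement, which is consistent with the 10G memory figure in the example.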

The above describes the dynamic resource adjustment process on October 1, 2016. The example shows resources being expanded automatically as a holiday arrives. By design, the system automatically shrinks resources on ordinary working days, which improves the resource utilization of the disaster recovery center.

In summary, the cloud platform resource configuration method and system based on a disaster recovery center proposed by the present invention can balance the load of the cloud computing platform under different disaster recovery modes. The scheme is flexible and can be configured independently for each business system, so applicability is not a concern. Manual operation steps can be omitted, which reduces workload, and migration information can be written to logs or saved to a database for later review. The scheme takes the perspective of the disaster recovery data center: while guaranteeing the disaster recovery indicators of every business system in the disaster recovery center, it makes effective use of the computing resources in the cloud platform resource pool and allocates resources automatically and reasonably, reducing the workload of operators.

The present application takes the perspective of the entire disaster recovery data center and addresses how the cloud IT system of the disaster recovery center can better meet the operating requirements of the business systems within it (for example, how to meet the disaster recovery indicators of active-active, warm-standby, or cold-standby systems while maximizing resource utilization). At its core, it analyzes production business volume in combination with the business disaster recovery indicators of the disaster recovery center and, through decision analysis and calculation, automatically adjusts the resource capacity of the cloud disaster recovery center at any time, so that the cloud disaster recovery center as a whole always has the real-time operating capability of the production system or the takeover capability of the disaster recovery system.

An improvement to a technology can be clearly distinguished as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement to a method flow). With the development of technology, however, many improvements to method flows can now be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with a hardware module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program a PLD themselves to "integrate" a digital system onto a single device, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of producing integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development. The source code to be compiled must be written in a specific programming language, called a hardware description language (HDL). There are many HDLs rather than just one, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art will also appreciate that a hardware circuit implementing the logical method flow can readily be obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by that (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory.

Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the devices it contains for implementing various functions may be regarded as structures within that hardware component. Alternatively, the devices for implementing various functions may be regarded both as software modules that implement the method and as structures within the hardware component.

The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function.

For convenience of description, the above devices are described as being divided into units by function. When implementing the present application, of course, the functions of the units may be implemented in one or more pieces of software and/or hardware.

From the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by software together with the necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in certain parts of the embodiments, of the present application.

The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for related details, refer to the corresponding description of the method embodiment.

The present application can be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.

The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

Although the present application has been described by way of embodiments, those of ordinary skill in the art know that the present application admits many variations and changes without departing from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the present application.

Claims

1. A cloud platform resource configuration method based on a disaster recovery center, characterized in that the method comprises:

collecting load data of the servers corresponding to each business system deployed in the disaster recovery center on the cloud platform;

collecting business operation data from the production environment;

classifying the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system in different disaster recovery modes;

performing resource configuration on the cloud platform according to a preset policy configuration table and the load condition data.

2. The method according to claim 1, wherein the load data of the server is collected by using a script, and the server includes a virtual machine and a physical machine.

3. The method according to claim 2, wherein the load data includes CPU usage, memory usage, and storage capacity usage, and the business operation data includes a daily average value of transaction volume and a daily peak value of transaction volume.

4. The method according to claim 3, wherein the disaster recovery modes include an active-active disaster recovery mode, a primary-secondary disaster recovery mode, a warm-standby disaster recovery mode, and a cold-standby disaster recovery mode.

5. The method according to claim 4, wherein when the business system is in active-active disaster recovery mode or primary-secondary disaster recovery mode, performing resource configuration on the cloud platform according to the policy configuration table and the load condition data includes:

collecting prediction information from the production environment;

determining the optimization strategy of the business system according to the prediction information and the policy configuration table;

performing resource configuration on the business system according to the optimization strategy and the load condition data.

6. The method according to claim 4, wherein when the business system is in warm-standby disaster recovery mode or cold-standby disaster recovery mode, performing resource configuration on the cloud platform according to the policy configuration table and the load condition data includes:

determining the optimization strategy of the business system according to the policy configuration table;

7. The method according to claim 5 or 6, wherein performing resource configuration on the business system according to the optimization strategy and the load condition data includes:

obtaining setting information from the optimization strategy;

when the resource configuration of the business system does not satisfy the setting information, reading the virtual machines corresponding to the physical machine from a preset database according to the name of the physical machine corresponding to the business system;

obtaining the CPU usage and the memory usage of the virtual machine from the load condition data;

determining the coefficient of the virtual machine according to the CPU usage and the memory usage;

selecting a virtual machine to be migrated according to the coefficient of the virtual machine and the optimization strategy;

selecting a physical machine to be migrated according to the virtual machine to be migrated;

migrating the physical machine to be migrated and the virtual machine to be migrated, so that the resource configuration of the business system after migration satisfies the setting information.

8. A cloud platform resource configuration system based on a disaster recovery center, characterized in that the system includes:

a load data collection device, configured to collect load data of the servers corresponding to each business system deployed in the disaster recovery center on the cloud platform;

an operation data collection device, configured to collect business operation data from the production environment;

a data classification device, configured to classify the load data and the business operation data according to preset disaster recovery mode levels to obtain load condition data of each business system in different disaster recovery modes;

a resource configuration device, configured to configure resources on the cloud platform according to a preset policy configuration table and the load condition data.

9. The system according to claim 8, wherein the load data collection device uses a script to collect the load data of the servers, and the servers include virtual machines and physical machines.

10. The system according to claim 9, wherein the load data includes CPU usage, memory usage, and storage capacity usage, and the business operation data includes a daily average value of transaction volume and a daily peak value of transaction volume.

11. The system according to claim 10, wherein the disaster recovery modes include an active-active disaster recovery mode, a primary-secondary disaster recovery mode, a warm-standby disaster recovery mode, and a cold-standby disaster recovery mode.

12. The system according to claim 11, wherein when the business system is in warm-standby disaster recovery mode or cold-standby disaster recovery mode, the resource configuration device comprises:

a first optimization strategy determination module, configured to determine the optimization strategy of the business system according to the policy configuration table;

a resource configuration module, configured to configure resources for the business system according to the optimization strategy and the load condition data.

13. The system according to claim 12, wherein when the business system is in active-active disaster recovery mode or primary-secondary disaster recovery mode, the resource configuration device further comprises:

a prediction information collection module, configured to collect prediction information from the production environment;

a second optimization strategy determination module, configured to determine the optimization strategy of the business system according to the prediction information and the policy configuration table.

14. The system according to claim 12 or 13, wherein the resource configuration module comprises:

an obtaining unit, configured to obtain setting information from the optimization strategy;

a reading unit, configured to read the virtual machines corresponding to the physical machine from a preset database according to the name of the physical machine corresponding to the business system when the resource configuration of the business system does not satisfy the setting information;

a usage obtaining unit, configured to obtain the CPU usage and the memory usage of the virtual machine from the load condition data;

a coefficient determination unit, configured to determine the coefficient of the virtual machine according to the CPU usage and the memory usage;

a first determining unit, configured to select a virtual machine to be migrated according to the coefficient of the virtual machine and the optimization strategy;

a second determining unit, configured to select a physical machine to be migrated according to the virtual machine to be migrated;

a migration unit, configured to migrate the physical machine to be migrated and the virtual machine to be migrated, so that the business system after migration satisfies the setting information.