Detailed Description
The application will be described in detail hereinafter with reference to the drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
In this embodiment, a method for rebuilding a container unit of a containerized cluster is provided, as shown in FIG. 1, where the containerized cluster includes a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes, and the method includes:
Step 101, after receiving a request for reconstructing a container unit, the cluster scheduler obtains information about the need for reconstructing the container unit, determines a target node from a plurality of nodes included in the containerized cluster according to the information about the need for reconstructing the container unit, and sends a command for reconstructing the container unit to the target node.
The embodiment of the application provides a container unit rebuild method which can be applied to a containerized cluster (for example, a Kubernetes cluster) and aims to solve the problem of IP address allocation failure during container unit rebuilding in the prior art and improve the success rate of container unit rebuilding. The container unit involved in the method may be a Pod. The containerized cluster is composed of a plurality of key components, including a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes. These components work cooperatively to achieve efficient IP address allocation during the container unit rebuild process. Here, the cluster scheduler is the core component in the containerized cluster responsible for resource allocation and task scheduling. When there is a container unit rebuild request, the cluster scheduler may receive the request. This request may be triggered automatically by the system (e.g., when abnormal operation of the container unit is detected) or may be initiated manually by an administrator. After receiving the request for rebuilding the container unit, the cluster scheduler can acquire the container unit rebuild demand information. Such information may include the resource requirements (e.g., CPU, memory, storage, etc.) of the rebuilt container unit, the specific requirements for the node (e.g., specific hardware configuration, software environment, geographic location, etc.), the namespace to which the container unit belongs, and so on. Then, according to the acquired container unit rebuild demand information, the cluster scheduler can screen out a target node meeting the requirements from the plurality of nodes in the containerized cluster.
The screening process can comprehensively consider factors such as the resource usage, load condition, and health state of each node, so as to ensure that the target node can meet the operating requirements of the rebuilt container unit. After determining the target node, the cluster scheduler may send a container unit rebuild instruction to the target node. The instruction contains the various information required to rebuild the container unit, such as the configuration file and image information of the container unit.
For example, in a financial transaction system, a certain container unit responsible for handling high-frequency transactions needs to be rebuilt due to a failure. The container unit has extremely high requirements on computing resources: it needs a high-performance CPU and a large-capacity memory, and also needs a specific financial data processing software environment. After receiving the rebuild request, the cluster scheduler screens out a node with the corresponding hardware configuration and software environment from the plurality of nodes as the target node according to these requirements and sends a rebuild instruction, so as to ensure that the new container unit can rapidly and stably process high-frequency transaction data.
For another example, in a medical image processing system, a container unit for processing a large amount of medical image data fails. The container unit needs strong graphics processing capability (GPU acceleration) and enough memory to store the image data. After the cluster scheduler obtains the rebuild requirement, a node equipped with a high-performance GPU and large-capacity storage is selected from the cluster as the target node, and a rebuild instruction is sent to ensure that the new container unit can efficiently process medical image data.
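The screening performed by the cluster scheduler can be sketched, for illustration only, as a filter-then-rank routine. The node fields (`free_cpu`, `free_mem`, `load`, `labels`, `healthy`) and demand keys below are hypothetical names, not part of any real scheduler API.

```python
# Illustrative sketch of target-node screening: filter out nodes that
# cannot satisfy the rebuild demand, then pick the least-loaded one.
def select_target_node(nodes, demand):
    candidates = [
        n for n in nodes
        if n["free_cpu"] >= demand["cpu"]                              # resource fit
        and n["free_mem"] >= demand["mem"]
        and demand.get("labels", {}).items() <= n.get("labels", {}).items()  # required attrs
        and n.get("healthy", True)                                     # health state
    ]
    if not candidates:
        return None          # no node satisfies the rebuild demand
    return min(candidates, key=lambda n: n["load"])  # least-loaded wins
```

A scheduler implementing this would then send the rebuild instruction to the returned node; real schedulers typically use a richer scoring phase than a single load metric.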
Step 102, after receiving the container unit rebuilding instruction, the target node creates a target container unit and starts the target container unit.
In this embodiment, after receiving the container unit rebuilding instruction sent by the cluster scheduler, the target node creates a target container unit according to information in the container unit rebuilding instruction. The creation process includes allocating necessary resources (e.g., CPU, memory, storage, etc.), initializing the container environment, downloading the required image, etc. After the container unit is created, the target node may start the target container unit. After the start-up, the target container unit enters a ready state and waits for subsequent IP address allocation and other operations. At this point, the target container unit already has basic operational capabilities, but is not yet able to communicate with other components in the network because it has not yet been assigned an IP address.
Step 103, after the address allocation component detects that the target container unit has started, judging whether a target IP address matching the namespace where the target container unit is located exists in a cache space.
In this embodiment, the address allocation component is responsible for allocating IP addresses to target container units. It continuously listens for start events of individual target container units in the system. When it detects that the target container unit has started, the address allocation component starts to execute the IP address allocation flow. In particular, the address allocation component may check whether there is a target IP address in the cache space that matches the namespace in which the target container unit is located. The cache space is a mechanism for storing commonly used or recently used IP addresses, which can increase the efficiency of IP address allocation. Namespaces are a mechanism for logically isolating resources in a containerized cluster.
Step 104, when the target IP address exists in the cache space, the address allocation component associates the target IP address with the target container unit, and stores the association information into a target object for managing identity information.
In this embodiment, when there is a target IP address in the cache space that matches the namespace in which the target container unit resides, the address allocation component may associate the target IP address with the target container unit. That is, the target container unit will use the target IP address for network communication. In addition, the address allocation component may store the association information of the target container unit and the target IP address into a target object that manages identity information. The target object may be a database or a specific data structure for recording the correspondence between the target container unit and the target IP address. For example, if there is a target IP address in the cache space that matches the namespace in which a high-frequency transaction processing container unit is located, the address allocation component may assign that target IP address (e.g., 192.168.1.100) to the target container unit, and store the identification of the target container unit (e.g., container unit name, namespace, etc.) together with the target IP address in the target object. In this way, during subsequent transaction data processing, other components can acquire the IP address of the target container unit through the target object, so as to realize network communication.
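Steps 103–104 can be sketched, purely for illustration, with two dictionaries standing in for the cache space and the target object; the structures and names are hypothetical, not a real CNI or IPAM interface.

```python
cache_space = {}      # namespace -> list of reusable IP addresses (the "cache space")
target_object = {}    # container unit id -> associated IP (the identity-managing "target object")

def try_assign_from_cache(unit_id, namespace):
    """Step 103: look for a cached IP matching the unit's namespace.
    Step 104: if found, associate it and record the association."""
    ips = cache_space.get(namespace)
    if not ips:
        return None                # no match: fall through to the step-105 path
    ip = ips.pop(0)                # take a cached IP for this namespace
    target_object[unit_id] = ip    # persist the association for other components
    return ip
```

Other components could then resolve a unit's address by reading `target_object`, mirroring how the text describes subsequent lookups.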
Step 105, when the target IP address does not exist in the cache space, the address allocation component stores the unique identifier corresponding to the target container unit into a queue to be processed; when a preset sending condition is met, it reads a preset number of unique identifiers from the queue to be processed, generates a data acquisition request based on the preset number of unique identifiers, and sends the data acquisition request to the cluster service interface; based on the target data corresponding to each unique identifier returned by the cluster service interface, it acquires the target IP address corresponding to each unique identifier from the IP address management module, associates each target IP address with the matched target container unit respectively, and stores the association information into the cache space and a target object for managing identity information.
In this embodiment, when there is no target IP address in the cache space that matches the namespace in which the target container unit is located, the address allocation component may store the unique identifier corresponding to the target container unit in the queue to be processed. The unique identifier may be the name, ID, etc. of the target container unit, used to uniquely identify it. The queue to be processed is a temporary storage mechanism for holding the unique identifiers of target container units that need further processing. When a preset sending condition is met (for example, a certain number of container unit identifiers have accumulated in the queue to be processed, or a certain time interval has passed), the address allocation component reads a preset number of unique identifiers from the queue to be processed (for example, in a first-in-first-out manner). The preset number can be adjusted according to the actual condition of the system so as to balance the efficiency of IP address allocation against the load on the system. The address allocation component then generates a data acquisition request based on the read preset number of unique identifiers and sends the data acquisition request to the cluster service interface. The cluster service interface is an interface in the containerized cluster that provides various services, through which information related to the target container units can be obtained. The data acquisition request may include information such as the namespaces and identifiers of the target container units, and is used to acquire target data (e.g., tag data, annotation data, etc.) associated with the target container units. After receiving the target data returned by the cluster service interface, the address allocation component may obtain, from the IP address management module, the target IP address corresponding to each unique identifier according to the target data.
The IP address management module is responsible for managing and distributing the IP address resources of the whole containerized cluster, and can allocate appropriate IP addresses to container units according to certain rules and policies. After the target IP addresses are acquired, the address allocation component associates each target IP address with the matched target container unit respectively, and stores the association information of each target IP address and target container unit into the cache space and the target object for managing identity information. Therefore, when a container unit is subsequently rebuilt, the cache space can be searched first, which improves the efficiency of IP address allocation.
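The batching path of step 105 can be sketched as a queue with two flush conditions (batch size or age). This is a minimal illustration under assumed thresholds; `fetch_target_data` and `allocate_ip` are hypothetical stand-ins for the cluster service interface and the IP address management module.

```python
from collections import deque
import time

BATCH_SIZE = 3          # preset number of identifiers per batch request
MAX_WAIT_SECONDS = 5.0  # alternative preset sending condition (queue age)

pending = deque()       # the queue to be processed
first_enqueued_at = None

def enqueue(uid):
    """Store a unit's unique identifier when no cached IP matched."""
    global first_enqueued_at
    if not pending:
        first_enqueued_at = time.monotonic()
    pending.append(uid)

def should_flush():
    """Preset sending condition: enough identifiers, or oldest waited too long."""
    if len(pending) >= BATCH_SIZE:
        return True
    return bool(pending) and time.monotonic() - first_enqueued_at >= MAX_WAIT_SECONDS

def flush(fetch_target_data, allocate_ip):
    """Read up to BATCH_SIZE identifiers FIFO, fetch their target data in
    ONE request, then obtain an IP for each identifier from that data."""
    batch = [pending.popleft() for _ in range(min(BATCH_SIZE, len(pending)))]
    data = fetch_target_data(batch)   # single call to the cluster service interface
    return {uid: allocate_ip(uid, data[uid]) for uid in batch}
```

Merging the per-unit lookups into one `fetch_target_data(batch)` call is what reduces request frequency against the cluster service interface, as the embodiment describes.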
In one embodiment in the medical health field, when there is no target IP address in the cache space that matches the medical image processing container unit, the address allocation component may store the unique identifier of the container unit (e.g., container unit-image processing-department A-001) in the queue to be processed, awaiting further processing. When the unique identifiers of 3 medical image processing container units exist in the queue to be processed, the preset sending condition is met. The address allocation component reads the 3 unique identifiers, generates a data acquisition request, and sends it to the cluster service interface to acquire the target data required for subsequent allocation of the target IP addresses. The address allocation component then obtains the corresponding target IP addresses, such as 172.16.0.201 and 172.16.0.202, from the IP address management module based on the unique identifiers of the 3 medical image processing container units and the target data. The address allocation component associates these IP addresses with the corresponding container units, ensuring that each target container unit has an available target IP address, and stores the association information of each medical image processing container unit and its target IP address into the cache space and the target object. For example, the target object records that the target IP address corresponding to container unit-image processing-department A-001 is 172.16.0.201. Other medical system components can then acquire the target IP address of a target container unit by querying the target object, facilitating data transmission and processing.
After successful allocation of the target IP address to the target container unit, the target container unit is successfully rebuilt.
By applying the technical scheme of this embodiment, a cache mechanism is introduced so that target IP addresses are acquired from the cache space when IP addresses are allocated, which can greatly improve IP address acquisition efficiency. When no target IP address is in the cache space, the target IP addresses of a batch of target container units are acquired from the IP address management module each time, merging a plurality of requests into one batch request. This reduces the request frequency between the cluster service interface and the IP address management module, effectively prevents blocking of the cluster service interface and the IP address management module caused by frequent requests, and improves the success rate of IP address allocation.
In the embodiment of the application, optionally, the target data comprises tag data and annotation data. After the target data corresponding to each unique identifier returned by the cluster service interface is received, the method further comprises the following steps: the address allocation component parses the tag data and the annotation data and determines a plurality of network policies matching the target container unit according to the parsing result; it invokes a network policy analysis model and performs policy analysis on each network policy through the network policy analysis model to obtain the access object corresponding to each network policy and the access action of each network policy with respect to its access object; it judges, according to the access object and access action corresponding to each network policy, whether access conflicts exist among the network policies; and when an access conflict exists, it triggers an automatic repair mechanism, so as to obtain the final network policy corresponding to the target container unit based on the automatic repair mechanism.
In this embodiment, during the reconstruction of the container unit, the address allocation component is not only responsible for allocating an IP address to the target container unit, but may also generate a network policy for the target container unit that meets the security requirements. Wherein the target data obtained from the cluster service interface may include tag data and annotation data. Such data is typically associated with the configuration and policy of the target container unit and is stored in a containerized platform such as Kubernetes. The tag data is used to identify attributes (e.g., environment, application, etc.) of the target container unit, and the annotation data is used to store additional information (e.g., version, configuration details, etc.). After receiving the target data corresponding to each unique identifier returned by the cluster service interface, the address allocation component can analyze the tag data and the annotation data in the target data to extract information related to the network policy, so that the network policy matched with the target container unit is determined according to the information. Such information may include attributes of the application, environment, security group, etc. to which the target container unit belongs, which are used to determine applicable network policies. Next, based on the parsing result, the address allocation component determines a plurality of network policies from the plurality of existing network policies that match the target container unit. The network policy defines which resources the target container unit can access and how to access those resources.
The address allocation component then invokes a network policy analysis model that is used to parse the specifics of each network policy. The network policy analysis model may be a rule-based parsing engine or a machine-learning-based model for understanding and enforcing policy rules. Through the network policy analysis model, the address allocation component parses each network policy respectively to obtain the access object corresponding to each network policy (i.e., the target resource to which access is allowed or denied) and the access action corresponding to that access object (e.g., allow, deny, restrict, etc.). According to the access object and access action corresponding to each network policy, the address allocation component judges whether an access conflict exists among the network policies. For example, one policy may allow access to a resource while another policy denies access to the same resource. If there is an access conflict, the address allocation component may trigger an automatic repair mechanism to resolve the conflict and determine the final network policy.
Here, the automatic repair mechanism may resolve the conflict based on predefined rules or policy priorities. For example, conflicts may be resolved based on policy priority (e.g., a high-priority policy overrides a low-priority policy) or policy strictness (e.g., denying access takes precedence over allowing access). Through the automatic repair mechanism, the network policy applicable to the target container unit is finally determined, ensuring that the network access of the target container unit meets security requirements while avoiding access problems caused by policy conflicts.
In one specific embodiment in the financial field, the tag data may include "app=online-banking", "env=production", etc., and the annotation data may include "version=2.0", "security-level=high", etc. The address allocation component parses the tag and annotation data to determine a plurality of network policies matching the transaction container unit. For example: policy 1 allows container units with "app=online-banking" and "env=production" to access the core transaction database; policy 2 denies container units with "security-level=low" access to sensitive data (the transaction container unit has "security-level=high", so this policy does not directly conflict); policy 3 restricts access from external networks, allowing only specific IP ranges (e.g., the bank's internal network). The address allocation component calls the network policy analysis model to parse each network policy and obtain the access object and access action corresponding to each policy. During parsing, the address allocation component discovers that policy 1 and policy 3 have a potential access conflict: policy 1 allows internal container units to access the core transaction database, but policy 3 restricts external network access, and although the transaction container unit is an internal container unit, it is necessary to ensure that the restrictions of policy 3 do not inadvertently block legitimate internal access. To resolve the conflict, the address allocation component triggers the automatic repair mechanism. The mechanism may include checking policy priority (assuming policy 1 has a higher priority than policy 3, the allow rule of policy 1 overrides the restriction of policy 3, but only for legitimate internal access) and refining access rules (adding an exception rule to policy 3 that explicitly allows access from container units with "app=online-banking" and "env=production" on the internal network).
Through the automatic repair mechanism, the address allocation component determines the final network policy applicable to the transaction container unit, ensuring that it can access the core transaction database and meet the security requirements of the bank.
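The conflict check and priority-based repair described above can be sketched as follows. Each policy is reduced to a `(resource, action, priority)` record; this is an illustration of the mechanism only, not Kubernetes NetworkPolicy semantics, and the field names are assumptions.

```python
def find_conflicts(policies):
    """Return the resources targeted by at least two policies with
    opposing access actions (e.g. one allows, another denies)."""
    actions = {}
    for p in policies:
        actions.setdefault(p["resource"], set()).add(p["action"])
    return {res for res, acts in actions.items() if len(acts) > 1}

def auto_repair(policies):
    """Priority-based automatic repair: for each resource, the action
    of the highest-priority policy becomes the final rule."""
    final = {}
    for p in sorted(policies, key=lambda p: p["priority"]):
        final[p["resource"]] = p["action"]  # higher priority written last, so it wins
    return final
```

A strictness-based variant, as the text also mentions, would instead prefer "deny" whenever the actions tie; the repair rule is a policy decision of the cluster operator.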
The embodiment of the application determines the applicable network policies by parsing the tag data and the annotation data, and uses the network policy analysis model to parse the policies so as to detect and resolve potential conflicts. Through the automatic repair mechanism, the network access policy of the target container unit can be reasonably determined even in the case of conflicts, ensuring the security and stability of the target container unit's network access within the containerized cluster.
In the embodiment of the application, the containerized cluster further comprises a cluster monitoring component. The cluster monitoring component acquires load usage data of each node in the containerized cluster at a preset frequency, calculates the current total load of the containerized cluster based on the load usage data of each node, and adjusts the sub-resource quota for each container unit in the containerized cluster according to the current total load and the total resources of the containerized cluster. If the adjusted sub-resource quota is smaller than a preset quota threshold, the cluster monitoring component monitors for container unit rebuilds; when a container unit rebuild is monitored, it determines a first resource occupied by the rebuilt container unit and a second resource occupied by the created container units, releases the created container units' occupation of the first resource when the second resource comprises the first resource, and restores the created container units' occupation of the first resource after the rebuild of the container unit is finished.
In this embodiment, the containerized cluster further includes a cluster monitoring component configured to dynamically adjust the sub-resource quota of each container unit in the cluster and to perform resource management during container unit rebuilds, so as to ensure reasonable utilization of cluster resources and smooth implementation of container unit rebuilding. Specifically, the cluster monitoring component obtains load usage data for each node in the containerized cluster at a preset frequency (e.g., every minute or every second). Such data includes CPU usage, memory usage, storage usage, network bandwidth usage, and the like. The cluster monitoring component then calculates the current total load of the containerized cluster based on the obtained load usage data. The total load may be obtained by a weighted average or a sum of the individual node loads. The cluster monitoring component adjusts the sub-resource quota for each container unit within the containerized cluster based on the current total load and the total resources of the containerized cluster. For example, if the current total load is high, the cluster monitoring component may decrease the sub-resource quota of the container units to free up resources for use by other services or applications; if the current total load is low, the cluster monitoring component may increase the sub-resource quota of the container units. Here, the sub-resource quota refers to the resource quota for all container units, including created container units and not-yet-created container units. If the adjusted sub-resource quota is smaller than the preset quota threshold, the sub-resource quota may be unable to cover the resource occupation of the container units in the containerized cluster, so container unit rebuild operations are monitored.
When the cluster monitoring component monitors that a container unit is to be rebuilt, it may begin executing a resource management flow in which it determines the first resource (e.g., particular CPU cores, memory blocks, etc.) occupied by the rebuilt container unit and the second resource occupied by the created container units. If the first resource is included in the second resource, there is a conflict between the resources needed by the rebuilt container unit and those held by the created container units. At this point, the cluster monitoring component releases the created container units' occupation of the first resource to ensure that the rebuilt container unit can obtain the required resources. After the container unit rebuild is completed, the cluster monitoring component restores the created container units' occupation of the first resource to resume normal resource allocation.
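The release/restore flow can be sketched with resource sets, for illustration only: when the rebuilt unit's resources (first resource) overlap the created units' resources (second resource), the overlap is released for the duration of the rebuild and restored afterwards. The set-based model of "resources" is an assumption.

```python
def rebuild_with_resource_handover(first, second, do_rebuild):
    """first/second: sets of resource ids (e.g. {"cpu0", "mem1"}).
    Temporarily releases the overlap from the created units, runs the
    rebuild, then restores occupation even if the rebuild fails."""
    overlap = first & second
    second -= overlap            # created units release the first resource
    try:
        do_rebuild()             # rebuild runs with the freed resources
    finally:
        second |= overlap        # restore occupation after the rebuild ends
    return second
```

The `try/finally` mirrors the requirement that occupation is restored after the rebuild finishes, so a failed rebuild does not leave the created units permanently deprived of resources.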
According to the embodiment of the application, by introducing the cluster monitoring component, dynamic monitoring and adjustment of the containerized cluster's resources are realized, ensuring efficient resource utilization and smooth implementation of container unit rebuilding. When resources are tight, normal operation of the container unit rebuild is ensured by the mechanism of releasing and restoring resource occupation, which improves the reliability and stability of the system.
In the embodiment of the application, the method further comprises the following steps: the cluster scheduler obtains historical container unit rebuild data at a first preset time interval, predicts container units to be created based on the historical container unit rebuild data, determines the predicted deployment nodes corresponding to the container units to be created, and sends a base image preload instruction to the predicted deployment nodes; each predicted deployment node performs a preload operation on its stored base images after receiving the base image preload instruction. Correspondingly, in step 102, the target node creating a target container unit after receiving the container unit rebuild instruction comprises: after receiving the container unit rebuild instruction, the target node determines the target container unit type according to the container unit rebuild instruction, pulls a target base image corresponding to the target container unit type from a plurality of preloaded base images, and creates the target container unit on the target node based on the target base image.
In this embodiment, predictive scheduling and base image preloading mechanisms are introduced into the containerized cluster to improve the efficiency and response speed of container unit rebuilding. Specifically, the cluster scheduler obtains historical container unit rebuild data at a first preset time interval (e.g., every hour or every day). These data may be stored in log files or databases. Through analysis of the historical container unit rebuild data, the cluster scheduler identifies common patterns and trends in container unit rebuilding. For example, certain applications may be rebuilt frequently within a certain period, or may require rebuilding after a system update. Next, based on this analysis, the cluster scheduler predicts the container units that may need to be created over a future period. These container units may need to be rebuilt for reasons such as increased load, failure recovery, or system updates. In addition, the cluster scheduler may determine the likely deployment nodes for these container units to be created. These nodes are typically selected based on factors such as resource availability, geographic location, and historical load. The cluster scheduler then sends a base image preload instruction to each predicted deployment node, informing the node to perform a preload operation on its already stored base images. After receiving the instruction, the predicted deployment node can preload all of its cached base images, so that the preloaded base images can be used directly when a container unit is subsequently rebuilt, saving loading time.
Subsequently, when the target node actually receives a container unit rebuild instruction, it may determine the target container unit type from the instruction. The target node pulls a target base image corresponding to the target container unit type from the preloaded plurality of base images. This process can be done quickly since the target base image has already been preloaded. Further, the target node creates a target container unit locally based on the target base image. Since the target base image is already preloaded, the creation speed of the container unit is significantly increased, reducing the delay of the reconstruction.
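The fast path at rebuild time can be sketched as follows: a node holds a set of preloaded base images, and unit creation uses the preloaded image when available, falling back to a remote pull otherwise. The `Node` class and image names are illustrative assumptions, not a container runtime API.

```python
class Node:
    def __init__(self):
        self.preloaded = {}                 # image name -> image handle

    def preload(self, images):
        """Preload operation: load stored base images into the local cache."""
        for name, handle in images.items():
            self.preloaded[name] = handle

    def create_unit(self, unit_type, type_to_image, remote_pull):
        """Determine the target base image from the unit type; use the
        preloaded copy when present (fast path), else pull remotely."""
        image = type_to_image[unit_type]
        if image in self.preloaded:
            return ("created", image, "preloaded")   # no pull delay
        return ("created", image, remote_pull(image))
```

The preloaded branch is what shortens the rebuild delay the embodiment describes; the remote pull remains as the fallback for unpredicted unit types.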
The embodiment of the application can significantly improve the efficiency and response speed of container unit rebuilding in the containerized cluster by introducing the predictive scheduling and base image preloading mechanisms. Predictive scheduling identifies in advance, by analyzing historical data, the container units that may need to be rebuilt, and the base image preloading mechanism reduces the image loading time when a container unit is created.
In the embodiment of the application, the method further comprises the following steps: the address allocation component counts, at a second preset time interval, the monitored number of container unit starts, the IP address allocation success rate, and its current resource occupancy; if the monitored number of container unit starts is greater than a first preset number and/or the IP address allocation success rate is smaller than a preset success rate threshold, the relationship between the current resource occupancy and a preset occupancy threshold is judged; if the current resource occupancy is greater than the preset occupancy threshold, an IP address pre-allocation mechanism is triggered, so that IP addresses are pre-generated through the IP address pre-allocation mechanism and stored in an IP address resource pool. Correspondingly, after the IP address pre-allocation mechanism pre-generates the IP addresses and stores them in the IP address resource pool, when the address allocation component detects that a new target container unit has started and no target IP address for the new target container unit exists in the cache space, an IP address is taken from the IP address resource pool and used as the target IP address of the new target container unit.
In this embodiment, by introducing an IP address pre-allocation mechanism, the IP address allocation process of container units in the containerized cluster is optimized, so as to improve the efficiency and reliability of address allocation. Specifically, the address allocation component counts the number of container unit starts at a second preset time interval (e.g., every minute or every hour), recording the number of container units newly started (newly rebuilt) within the counting period; calculates the IP address allocation success rate, i.e., the ratio of the number of container units successfully allocated an IP address within the counting period to the total number of container units started; and evaluates the current resource occupancy, i.e., the current resource usage of the address allocation component itself.
The address allocation component then makes judgments based on these statistics. If the number of container unit starts exceeds the first preset number (e.g., more than 100 starts per minute), the address allocation component is under high load and more pre-generated IP addresses may be needed to relieve the pressure. If the IP address allocation success rate is below the preset success rate threshold (e.g., below 95%), the current IP address allocation procedure is problematic and may require optimization. If the current resource occupancy rate of the address allocation component is greater than the preset occupancy rate threshold (e.g., greater than 80%), the component's resource usage is near saturation.
Specifically, if the monitored number of container unit starts is greater than the first preset number and/or the IP address allocation success rate is less than the preset success rate threshold, the relationship between the current resource occupancy rate and the preset occupancy rate threshold is further judged. If the current resource occupancy rate is greater than the preset occupancy rate threshold, the IP address pre-allocation mechanism may be triggered, by which the address allocation component pre-generates a number of IP addresses and stores them in an IP address resource pool. The pre-generated IP addresses may then be assigned directly to container units, reducing the latency and resource contention of real-time allocation. For example, in the financial field, during peak transaction periods a large number of new transaction container units must be started quickly and allocated IP addresses to ensure timely processing of transactions; pre-generated IP addresses help avoid transaction delays or failures. As another example, medical health systems must process large amounts of real-time data, such as patient vital sign monitoring and medical image analysis, and place extremely high demands on response speed and data processing capacity; in emergency situations, such as emergency treatment or surgery, new medical container units must be started quickly to process data or provide services, and pre-generated IP addresses help keep the IP address allocation process fast and reliable.
After the IP address resource pool is established, the address allocation component continues to monitor for start events for new target container units. When a new target container unit is started and its target IP address does not exist in the cache space, the address allocation component may obtain a pre-generated IP address from the IP address resource pool and allocate it to the container unit. After allocation, the address allocation component updates the IP address resource pool, ensuring that there are always enough IP addresses in the resource pool to allocate.
The embodiment of the application can effectively meet the IP address allocation requirement of the address allocation assembly under the condition of high load by introducing an IP address pre-allocation mechanism, and reduce the delay and the resource competition of real-time allocation.
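The pre-allocation trigger and resource pool described above may be sketched roughly as follows. All concrete values (the thresholds, the sequential address scheme, and the class and method names) are illustrative assumptions, not details given by the embodiment:

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PreallocationPolicy:
    max_starts_per_interval: int = 100   # "first preset number" (assumed value)
    min_success_rate: float = 0.95       # "preset success rate threshold" (assumed)
    max_occupancy: float = 0.80          # "preset occupancy rate threshold" (assumed)

@dataclass
class AddressAllocator:
    policy: PreallocationPolicy = field(default_factory=PreallocationPolicy)
    pool: deque = field(default_factory=deque)  # IP address resource pool

    def should_preallocate(self, starts: int, success_rate: float,
                           occupancy: float) -> bool:
        # Trigger only when (start count high OR success rate low) AND the
        # component's resource occupancy exceeds its threshold.
        overloaded = (starts > self.policy.max_starts_per_interval
                      or success_rate < self.policy.min_success_rate)
        return overloaded and occupancy > self.policy.max_occupancy

    def preallocate(self, subnet: str, count: int) -> None:
        # Pre-generate addresses into the pool (toy sequential scheme).
        base = len(self.pool) + 2
        for i in range(count):
            self.pool.append(f"{subnet}.{base + i}")

    def assign(self) -> Optional[str]:
        # Hand a pre-generated address to a newly started container unit.
        return self.pool.popleft() if self.pool else None
```

In this sketch, a newly started container unit whose target IP address is absent from the cache space would call `assign()` and receive a pre-generated address without waiting for real-time allocation.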
In the embodiment of the application, optionally, after receiving the target data corresponding to each unique identifier returned by the cluster service interface, the method further comprises: if any unique identifier fails to obtain a target IP address from the IP address management module, triggering a retry mechanism, and re-acquiring the target IP address from the IP address management module through the retry mechanism based on the target data corresponding to that unique identifier, until the target IP address corresponding to that unique identifier is successfully acquired or the number of retries reaches a second preset number; correspondingly, if the number of retries reaches the second preset number and the target IP address corresponding to that unique identifier has not been successfully acquired, adding that unique identifier to a processing failure queue.
In this embodiment, by introducing a retry mechanism and a processing failure queue, the reliability and fault tolerance of the IP address allocation process are enhanced. Specifically, when acquiring target IP addresses from the IP address management module, if the target container unit of a certain unique identifier fails to acquire its target IP address, a retry mechanism may be triggered. The retry mechanism may retry the acquisition according to a preset retry strategy (e.g., retry interval, number of retries, etc.). In addition, a maximum number of retries (i.e., the second preset number) may be set to prevent the resource waste and performance degradation caused by unbounded retries. After each retry, whether the current number of retries has reached the second preset number may be checked. If the target IP address is successfully acquired before the second preset number is reached, the retry process terminates immediately and the task continues normally; if the number of retries reaches the second preset number without the target IP address being acquired, the unique identifier may be added to the processing failure queue. Here, the processing failure queue records the unique identifiers for which IP address allocation was unsuccessful, for subsequent manual inspection, repair, or re-attempted allocation.
By introducing a retry mechanism and a processing failure queue, the embodiment of the application significantly improves the reliability and fault tolerance of the IP address allocation process, ensures that failed allocations are either automatically recovered or recorded for subsequent handling, and improves overall stability and user experience.
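The retry-with-failure-queue flow above may be sketched as follows. The IP address management module lookup is modelled as a plain callable, and the function and parameter names are illustrative assumptions:

```python
from typing import Callable, List, Optional

def fetch_with_retry(uid: str,
                     target_data: dict,
                     ipam_lookup: Callable[[str, dict], Optional[str]],
                     max_retries: int,
                     failure_queue: List[str]) -> Optional[str]:
    """Try to obtain the target IP for one unique identifier, retrying up to
    max_retries times (the "second preset number"); on final failure, record
    the identifier in the processing failure queue for later repair."""
    for _attempt in range(max_retries):
        ip = ipam_lookup(uid, target_data)
        if ip is not None:
            return ip  # success: terminate the retry process immediately
    failure_queue.append(uid)  # retries exhausted: queue for manual handling
    return None
```

A real implementation would also apply the preset retry strategy (e.g., a backoff interval between attempts), which is omitted here for brevity.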
In the embodiment of the application, optionally, the step of determining the target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information comprises: obtaining the CPU usage rate, available memory amount, node health state, and node schedulable state corresponding to each node in the containerized cluster; determining the storage space requirement and CPU usage requirement corresponding to the target container unit to be created according to the container unit reconstruction requirement information; screening out, from the nodes, a plurality of first nodes that are in a healthy state and a schedulable state according to the node health state and node schedulable state corresponding to each node; and determining the target node from the plurality of first nodes according to the storage space requirement and CPU usage requirement corresponding to the target container unit and the CPU usage rate and available memory amount corresponding to each first node.
In this embodiment, the cluster scheduler may obtain the CPU usage rate, the amount of available memory, the node health state, and the node schedulable state of each node in the containerized cluster. This information is used to evaluate each node's current state and resource availability. Specifically, the CPU usage rate indicates the node's current CPU occupancy and is typically expressed as a percentage; a high CPU usage rate may mean the node is heavily loaded. The available memory indicates the node's currently available memory resources, typically expressed in GB or MB; insufficient available memory may result in memory starvation. The node health state indicates the node's operational status, such as healthy, unhealthy, or unknown, and is typically determined by heartbeat detection or health check mechanisms. The node schedulable state indicates whether the node can accept new container unit scheduling requests; a node may be unschedulable due to maintenance, resource shortage, or other reasons.
In addition, the cluster scheduler may also determine the storage space requirement and CPU usage requirement of the target container unit to be created based on the container unit reconstruction requirement information. These requirements may be determined from the container unit's configuration file or historical usage data. The storage space requirement indicates the storage space needed by the target container unit, typically for persistent data or temporary storage, and the CPU usage requirement indicates the CPU resources needed by the target container unit, typically expressed as a number of CPU cores or millicores.
The cluster scheduler then screens out, from among the nodes, all nodes that are healthy and schedulable (i.e., the first nodes) based on each node's health state and schedulable state. This step excludes nodes that are unavailable or unsuitable for scheduling. Further, among the selected first nodes, the most suitable node is chosen as the target node according to the storage space requirement and CPU usage requirement of the target container unit and the CPU usage rate and available memory amount of each first node. Storage space matching ensures that the node has enough storage space to meet the target container unit's requirement, and CPU matching ensures that the node has enough CPU capacity to support the target container unit's operation.
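The two-stage selection described above may be sketched as follows. The field names, the use of available memory as the capacity check, and the tie-breaking rule (prefer the least-loaded feasible node) are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    name: str
    cpu_usage: float          # fraction of CPU in use, 0.0-1.0
    available_memory_mb: int
    healthy: bool
    schedulable: bool

def select_target_node(nodes: List[Node],
                       cpu_request: float,
                       memory_request_mb: int) -> Optional[Node]:
    # Stage 1: keep only healthy, schedulable nodes (the "first nodes").
    first_nodes = [n for n in nodes if n.healthy and n.schedulable]
    # Stage 2: keep nodes with enough spare CPU and memory for the target
    # container unit, then pick the least-loaded one as the target node.
    feasible = [n for n in first_nodes
                if (1.0 - n.cpu_usage) >= cpu_request
                and n.available_memory_mb >= memory_request_mb]
    return min(feasible, key=lambda n: n.cpu_usage) if feasible else None
```

Returning `None` when no node is feasible models the case where scheduling must be deferred or rejected.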
Further, as a specific implementation of the method of fig. 1, an embodiment of the present application provides a container unit reconstruction system for a containerized cluster, as shown in fig. 2, where the system includes:
the cluster scheduler is used for acquiring the container unit reconstruction demand information after receiving the container unit reconstruction request, determining a target node from a plurality of nodes included in the containerized cluster according to the container unit reconstruction demand information, and sending a container unit reconstruction instruction to the target node;
the target node is used for creating a target container unit and starting after receiving the container unit reconstruction instruction;
The system further comprises an address allocation component, an IP address management module, and a cluster service interface. The address allocation component is used for: after monitoring that the target container unit is started, judging whether a target IP address matching the namespace in which the target container unit is located exists in a cache space; when the target IP address exists in the cache space, associating the target IP address with the target container unit and storing the association information in a target object for managing identity information; and when the target IP address does not exist in the cache space, storing the unique identifier corresponding to the target container unit in a queue to be processed, and, when a preset sending condition is met, reading a preset number of unique identifiers from the queue to be processed, generating a data acquisition request based on the preset number of unique identifiers, and sending the data acquisition request to the cluster service interface, so as to acquire, based on the target data corresponding to each unique identifier returned by the cluster service interface, the target IP address corresponding to each unique identifier from the IP address management module, associate each target IP address with the matching target container unit, and store the association information in the cache space and in the target object for managing identity information.
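The cache-first allocation path and the batched pending queue of the address allocation component may be sketched as follows. Keying the cache by namespace, the batch size, and all class and method names are illustrative assumptions; the cluster-service-interface call itself is omitted:

```python
from collections import deque
from typing import Dict, List, Optional

class AllocationComponent:
    def __init__(self, batch_size: int = 10):
        self.cache: Dict[str, str] = {}     # namespace -> cached target IP
        self.pending: deque = deque()       # unique identifiers awaiting lookup
        self.batch_size = batch_size        # "preset number" read per request
        self.bindings: Dict[str, str] = {}  # unit uid -> associated target IP

    def on_unit_started(self, uid: str, namespace: str) -> Optional[str]:
        ip = self.cache.pop(namespace, None)
        if ip is not None:
            # Cache hit: associate the IP with the container unit immediately.
            self.bindings[uid] = ip
            return ip
        # Cache miss: defer to the batched cluster-service-interface path.
        self.pending.append(uid)
        return None

    def drain_batch(self) -> List[str]:
        # When the preset sending condition is met, read up to batch_size
        # identifiers to build a single data acquisition request.
        batch: List[str] = []
        while self.pending and len(batch) < self.batch_size:
            batch.append(self.pending.popleft())
        return batch
```

Batching the cache misses into one request, rather than querying the cluster service interface per unit, is what reduces interface pressure during mass reconstruction.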
Optionally, the target data comprises tag data and annotation data, and the address allocation component is further configured to:
After receiving the target data corresponding to each unique identifier returned by the cluster service interface, parsing the tag data and the annotation data; determining, according to the parsing results, a plurality of network policies matching the target container unit; invoking a network policy analysis model and performing policy analysis on each network policy through the model to obtain the access object corresponding to each network policy; judging, according to the access object and access action corresponding to each network policy, whether access conflicts exist between the network policies; and, when an access conflict exists, triggering an automatic repair mechanism and obtaining the final network policy corresponding to the target container unit based on the automatic repair mechanism.
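The conflict check above may be sketched minimally as follows: each analyzed policy is reduced to an (access object, access action) pair, two policies conflict when they target the same object with opposite actions, and the repair rule shown (deny overrides allow) is an assumed strategy, since the embodiment does not specify one:

```python
from typing import Dict, List, Tuple

def detect_conflicts(policies: List[Tuple[str, str]]) -> List[str]:
    """policies: (access_object, action) pairs, action in {'allow', 'deny'}.
    Returns the access objects that appear with both actions (conflicts)."""
    seen: Dict[str, set] = {}
    for obj, action in policies:
        seen.setdefault(obj, set()).add(action)
    return sorted(obj for obj, actions in seen.items() if len(actions) > 1)

def auto_repair(policies: List[Tuple[str, str]]) -> Dict[str, str]:
    """Resolve conflicts with an assumed deny-overrides rule, yielding the
    final per-object policy for the target container unit."""
    final: Dict[str, str] = {}
    for obj, action in policies:
        if final.get(obj) != "deny":
            final[obj] = action
    return final
```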
Optionally, the containerized cluster further comprises a cluster monitoring component, wherein the cluster monitoring component is further configured to:
If the adjusted sub-resource quota is less than a preset quota threshold, then, when container unit reconstruction is monitored, determining a first resource occupied by the reconstructed container unit and a second resource occupied by the created container unit; when the second resource comprises the first resource, releasing the first resource for occupation by the created container unit, and reclaiming the first resource occupied by the created container unit after the reconstruction of the container unit is finished.
Optionally, the cluster scheduler is further configured to obtain historical container unit reconstruction data at a first preset time interval, predict a container unit to be created based on the historical container unit reconstruction data, determine a predicted deployment node corresponding to the container unit to be created, and send a base image preloading instruction to the predicted deployment node;
the predicted deployment node is used for performing a preloading operation on the stored base image after receiving the base image preloading instruction;
Correspondingly, the target node is further configured to:
After receiving the container unit reconstruction instruction, determining a target container unit type according to the container unit reconstruction instruction, pulling a target base image corresponding to the target container unit type from a plurality of preloaded base images, and creating the target container unit on the target node based on the target base image.
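The prediction-and-preload step may be sketched as follows. The "prediction" here is just a frequency count over historical rebuild records; a real system might use a time-series model. All names, the fallback image, and the top-k rule are illustrative assumptions:

```python
from collections import Counter
from typing import Dict, List, Tuple

def predict_preloads(history: List[Tuple[str, str]], top_k: int = 2) -> List[str]:
    """history: (container_unit_type, node) records from past reconstructions.
    Returns the most frequently rebuilt unit types as preload candidates."""
    counts = Counter(unit_type for unit_type, _node in history)
    return [t for t, _count in counts.most_common(top_k)]

def pick_base_image(unit_type: str, preloaded: Dict[str, str]) -> str:
    # On rebuild, the target node pulls the matching preloaded base image,
    # falling back to a generic image when no preload matches (an assumption).
    return preloaded.get(unit_type, "registry.local/base:latest")
```

Preloading the predicted base images onto the predicted deployment nodes means the image pull is already done when the reconstruction instruction arrives, shortening container unit startup.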
Optionally, the address allocation component is further configured to:
counting, at a second preset time interval, the monitored number of container unit starts, the IP address allocation success rate, and the current resource occupancy rate; if the monitored number of container unit starts is greater than a first preset number and/or the IP address allocation success rate is less than a preset success rate threshold, judging the relationship between the current resource occupancy rate and a preset occupancy rate threshold; and, if the current resource occupancy rate is greater than the preset occupancy rate threshold, triggering an IP address pre-allocation mechanism so as to pre-generate IP addresses through the IP address pre-allocation mechanism and store them in an IP address resource pool;
Accordingly, the address allocation component is further configured to:
After the IP address is pre-generated through the IP address pre-allocation mechanism and stored in an IP address resource pool, when the start of a new target container unit is monitored and the target IP address of the new target container unit does not exist in the cache space, the pre-generated IP address is obtained from the IP address resource pool and is used as the target IP address of the new target container unit.
Optionally, the address allocation component is further configured to:
After receiving the target data corresponding to each unique identifier returned by the cluster service interface, if any unique identifier fails to obtain a target IP address from the IP address management module, triggering a retry mechanism, and re-acquiring the target IP address from the IP address management module through the retry mechanism based on the target data corresponding to that unique identifier, until the target IP address corresponding to that unique identifier is successfully acquired or the number of retries reaches a second preset number;
Accordingly, the address allocation component is further configured to:
If the number of retries reaches the second preset number and the target IP address corresponding to that unique identifier has not been successfully acquired, adding that unique identifier to the processing failure queue.
Optionally, the cluster scheduler is further configured to:
acquiring CPU utilization rate, available memory amount, node health state and node schedulable state corresponding to each node in the containerized cluster, and determining storage space requirements and CPU use requirements corresponding to target container units to be created according to the container unit reconstruction requirement information;
And screening a plurality of first nodes in a healthy state and a schedulable state from the nodes based on the node healthy state and the node schedulable state corresponding to each node, and determining a target node from the plurality of first nodes according to the storage space requirement and the CPU use requirement corresponding to the target container unit, and the CPU use rate and the available memory quantity corresponding to each first node.
It should be noted that, in the embodiment of the present application, other corresponding descriptions of each functional unit related to the container unit reconstruction system for a containerized cluster may refer to corresponding descriptions in the method of fig. 1, which are not described herein again.
The embodiment of the application also provides a computer device, which can be a personal computer, a server, a network device and the like, and as shown in fig. 3, the computer device comprises a bus, a processor, a memory and a communication interface, and can also comprise an input/output interface and a display device. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing location information. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement the steps in the method embodiments.
It will be appreciated by those skilled in the art that the structure shown in FIG. 3 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided, which may be non-volatile or volatile, and on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
The user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which, when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded nonvolatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take various forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The foregoing examples illustrate only a few embodiments of the application, which are described in detail without thereby limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of the application should be determined by the appended claims.