CN120812029A - Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment - Google Patents

Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment

Info

Publication number
CN120812029A
Authority
CN
China
Prior art keywords
target
container unit
address
cluster
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510913115.8A
Other languages
Chinese (zh)
Inventor
叶新林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Pay Electronic Payment Co ltd
Original Assignee
Ping An Pay Electronic Payment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Pay Electronic Payment Co ltd
Priority to CN202510913115.8A
Publication of CN120812029A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of computer technology and discloses a container unit reconstruction method and system for a containerized cluster, a storage medium, and a computer device. The method includes: after receiving a container unit reconstruction request, the cluster scheduler obtains reconstruction requirement information, determines a target node based on that information, and sends a container unit reconstruction instruction to the target node; after receiving the instruction, the target node creates and starts a target container unit; after detecting that the target container unit has started, the address allocation component determines whether a matching target IP address exists in the cache space; if it exists, the target IP address is associated with the target container unit and the association information is stored in the target object; if it does not exist, the unique identifier of the target container unit is placed in a pending queue, a data acquisition request is subsequently generated each time from a preset number of unique identifiers, and the target IP address is determined according to the data acquisition request. The technical solution of the present application can be applied in the fields of financial technology and healthcare.

Description

Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and a system for reconstructing a container unit for a containerized cluster, a storage medium, and a computer device.
Background
In today's digital age, containerization technology is widely applied in many fields, especially finance and healthcare, by virtue of its efficiency, flexibility, and portability. In the financial field, institutions such as banks, securities firms, and insurers need to deploy and update financial applications rapidly to cope with market changes and customer demands, and containerized clusters provide the rapid deployment and scaling capabilities that keep financial services running efficiently. For example, a securities trading system must process a large number of trade requests in real time, and the containerized cluster can quickly start and stop the Pods related to trading to ensure the response speed and stability of the system. In the healthcare field, hospital information systems, telemedicine platforms, and similar systems also rely on containerization for rapid iteration and efficient management. For example, a hospital's electronic medical record system needs to be continuously updated and optimized, and the containerized cluster can conveniently start and stop application Pods to upgrade and maintain the application while ensuring the security and reliability of medical data.
In a containerized cluster, the Pod is the smallest deployable unit, and its stable operation is critical. However, a Pod may fail for various reasons, such as node failure or an application crash, and then needs to be rebuilt. Pod reconstruction is key to guaranteeing cluster stability and application availability, because it quickly recovers the failed Pod and ensures service continuity. For example, in a financial transaction system, if a Pod handling transactions crashes, rebuilding it promptly avoids transaction interruption and reduces economic loss. In the healthcare field, if a Pod of a telemedicine platform fails, timely reconstruction ensures that patients can continue to obtain medical services.
In the existing Pod reconstruction process of a containerized cluster, IP address allocation is a key step. Taking a Cilium-based Kubernetes cluster as an example, the Cilium Agent (the address allocation component) is responsible for allocating IP addresses to rebuilt Pods. However, each time the Cilium Agent allocates an IP address to a rebuilt Pod, it must first acquire the related data of that Pod from the cluster service interface and then acquire the IP address from the IP address management module according to that data. This increases the IP address allocation time, and if the cluster service interface and the IP address management module have too many requests to process, they are prone to blocking, which causes IP address allocation to fail.
Disclosure of Invention
In view of this, the present application provides a container unit reconstruction method and system for a containerized cluster, a storage medium, and a computer device. By introducing a caching mechanism, the target IP address is first looked up in a cache space when an IP address is allocated, which can greatly improve the efficiency of obtaining IP addresses. When the target IP address is not in the cache space, the target IP addresses of a batch of target container units are obtained from an IP address management module at a time, so that multiple requests are merged into one batch request. This reduces the frequency of requests to the cluster service interface and the IP address management module, effectively prevents those two components from being blocked by frequent requests, and improves the success rate of IP address allocation.
According to one aspect of the present application, there is provided a container unit reconstruction method for a containerized cluster, the containerized cluster comprising a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes, the method comprising:
after receiving a container unit reconstruction request, the cluster scheduler acquires container unit reconstruction requirement information, determines a target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information, and sends a container unit reconstruction instruction to the target node;
after receiving the container unit reconstruction instruction, the target node creates a target container unit and starts the target container unit;
after detecting that the target container unit has started, the address allocation component determines whether a target IP address matching the namespace in which the target container unit is located exists in a cache space;
when the target IP address exists in the cache space, the address allocation component associates the target IP address with the target container unit and stores the association information in a target object for managing identity information;
when the target IP address does not exist in the cache space, the address allocation component stores the unique identifier corresponding to the target container unit in a pending queue, reads a preset number of unique identifiers from the pending queue when a preset sending condition is met, generates a data acquisition request based on the preset number of unique identifiers, and sends the data acquisition request to the cluster service interface, so as to acquire the target IP address corresponding to each unique identifier from the IP address management module based on the target data corresponding to each unique identifier returned by the cluster service interface, associates each target IP address with its matching target container unit, and stores the association information in the cache space and in the target object for managing identity information.
According to another aspect of the present application, there is provided a container unit reconstruction system for a containerized cluster, comprising:
a cluster scheduler, configured to acquire container unit reconstruction requirement information after receiving a container unit reconstruction request, determine a target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information, and send a container unit reconstruction instruction to the target node;
the target node, configured to create a target container unit and start it after receiving the container unit reconstruction instruction;
an address allocation component, configured to: determine, after detecting that the target container unit has started, whether a target IP address matching the namespace in which the target container unit is located exists in a cache space; when the target IP address exists in the cache space, associate the target IP address with the target container unit and store the association information in a target object for managing identity information; and when the target IP address does not exist in the cache space, store the unique identifier corresponding to the target container unit in a pending queue, read a preset number of unique identifiers from the pending queue when a preset sending condition is met, generate a data acquisition request based on the preset number of unique identifiers, and send the data acquisition request to the cluster service interface, so as to acquire the target IP address corresponding to each unique identifier from the IP address management module based on the target data corresponding to each unique identifier returned by the cluster service interface, associate each target IP address with its matching target container unit, and store the association information in the cache space and in the target object for managing identity information.
According to a further aspect of the present application, there is provided a storage medium having stored thereon a computer program which when executed by a processor implements the above-described container unit reconstruction method for containerized clusters.
According to a further aspect of the present application, there is provided a computer device comprising a storage medium, a processor and a computer program stored on the storage medium and executable on the processor, said processor implementing the above-described container unit reconstruction method for containerized clusters when executing said program.
With the above technical solution, the container unit reconstruction method and system for a containerized cluster, the storage medium, and the computer device provided by the present application introduce a caching mechanism so that the target IP address is first obtained from the cache space when an IP address is allocated, which can greatly improve the efficiency of obtaining IP addresses. When the target IP address is not in the cache space, the target IP addresses of a batch of target container units are obtained from the IP address management module at a time, so that multiple requests are merged into one batch request. This reduces the frequency of requests to the cluster service interface and the IP address management module, effectively prevents those two components from being blocked by frequent requests, and improves the success rate of IP address allocation.
The foregoing is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are set out below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
Fig. 1 is a schematic flow chart of a container unit reconstruction method for a containerized cluster according to an embodiment of the present application;
Fig. 2 is a schematic structural diagram of a container unit reconstruction system for a containerized cluster according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The application will be described in detail hereinafter with reference to the drawings in conjunction with embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
In this embodiment, a container unit reconstruction method for a containerized cluster is provided, as shown in Fig. 1, where the containerized cluster includes a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes, and the method includes:
Step 101, after receiving a request for reconstructing a container unit, the cluster scheduler obtains information about the need for reconstructing the container unit, determines a target node from a plurality of nodes included in the containerized cluster according to the information about the need for reconstructing the container unit, and sends a command for reconstructing the container unit to the target node.
The embodiment of the present application provides a container unit reconstruction method that can be applied to a containerized cluster (for example, a Kubernetes cluster). It aims to solve the prior-art problem of IP address allocation failure during container unit reconstruction and to improve the success rate of container unit reconstruction. The container unit involved in the method may be a Pod. The containerized cluster is composed of several key components, including a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes. These components work together to achieve efficient IP address allocation during container unit reconstruction. Here, the cluster scheduler is the core component of the containerized cluster responsible for resource allocation and task scheduling. When a container unit reconstruction request is issued, the cluster scheduler receives it. The request may be triggered automatically by the system (e.g., when abnormal operation of a container unit is detected) or initiated manually by an administrator. After receiving the container unit reconstruction request, the cluster scheduler acquires the container unit reconstruction requirement information. This information may include the resource requirements of the container unit to be rebuilt (e.g., CPU, memory, storage), specific requirements on the node (e.g., hardware configuration, software environment, geographic location), the namespace to which the container unit belongs, and so on. Then, based on the acquired container unit reconstruction requirement information, the cluster scheduler screens out a target node that meets the requirements from the plurality of nodes in the containerized cluster. 
The screening process can comprehensively consider factors such as the resource usage, load, and health state of each node, to ensure that the target node can meet the operating requirements of the rebuilt container unit. After determining the target node, the cluster scheduler sends a container unit reconstruction instruction to the target node. The instruction contains the information required to rebuild the container unit, such as the configuration file and image information of the container unit.
For example, in a financial transaction system, a container unit responsible for handling high-frequency transactions needs to be rebuilt because of a failure. The container unit has extremely high computing requirements: it needs a high-performance CPU and a large amount of memory, as well as a specific financial data processing software environment. After receiving the reconstruction request, the cluster scheduler screens out, from the plurality of nodes, a node with the corresponding hardware configuration and software environment as the target node according to these requirements and sends a reconstruction instruction, so that the new container unit can process high-frequency transaction data quickly and stably.
For another example, in a medical image processing system, one container unit for processing a large amount of medical image data fails. The container unit needs to have powerful graphics processing power (GPU acceleration) and enough memory to store the image data. After the cluster scheduler obtains the reconstruction requirement, a node provided with a high-performance GPU and a large-capacity storage is selected from the cluster as a target node, and a reconstruction instruction is sent to ensure that a new container unit can efficiently process medical image data.
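The node screening described in Step 101 can be illustrated with a minimal Python sketch. It is not the scheduler's actual algorithm: the `Node` and `RebuildRequirement` types, their field names, and the "least-loaded feasible node" rule are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node description; field names are illustrative only.
    name: str
    free_cpu: float           # free CPU cores
    free_memory_gib: float    # free memory in GiB
    healthy: bool = True
    load: float = 0.0         # 0.0 (idle) to 1.0 (saturated)
    labels: dict = field(default_factory=dict)


@dataclass
class RebuildRequirement:
    # Hypothetical reconstruction requirement information.
    namespace: str
    cpu: float
    memory_gib: float
    required_labels: dict = field(default_factory=dict)


def select_target_node(nodes, req):
    """Return the least-loaded healthy node that satisfies req, or None."""
    candidates = [
        n for n in nodes
        if n.healthy
        and n.free_cpu >= req.cpu
        and n.free_memory_gib >= req.memory_gib
        and all(n.labels.get(k) == v for k, v in req.required_labels.items())
    ]
    return min(candidates, key=lambda n: n.load, default=None)


nodes = [
    Node("node-a", free_cpu=4, free_memory_gib=16, load=0.7, labels={"gpu": "true"}),
    Node("node-b", free_cpu=16, free_memory_gib=64, load=0.2, labels={"gpu": "true"}),
    Node("node-c", free_cpu=32, free_memory_gib=128, load=0.1),
    Node("node-d", free_cpu=32, free_memory_gib=128, healthy=False, labels={"gpu": "true"}),
]
req = RebuildRequirement("imaging", cpu=8, memory_gib=32, required_labels={"gpu": "true"})
target = select_target_node(nodes, req)
print(target.name if target else "no feasible node")
```

In this example node-a lacks CPU, node-c lacks the GPU label, and node-d is unhealthy, so only node-b remains as a candidate.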
Step 102, after receiving the container unit rebuilding instruction, the target node creates a target container unit and starts the target container unit.
In this embodiment, after receiving the container unit rebuilding instruction sent by the cluster scheduler, the target node creates a target container unit according to information in the container unit rebuilding instruction. The creation process includes allocating necessary resources (e.g., CPU, memory, storage, etc.), initializing the container environment, downloading the required image, etc. After the container unit is created, the target node may start the target container unit. After the start-up, the target container unit enters a ready state and waits for subsequent IP address allocation and other operations. At this point, the target container unit already has basic operational capabilities, but is not yet able to communicate with other components in the network because it has not yet been assigned an IP address.
Step 103, after detecting that the target container unit has started, the address allocation component determines whether a target IP address matching the namespace in which the target container unit is located exists in a cache space.
In this embodiment, the address assignment component is responsible for assigning IP addresses to target container units. It continuously listens in the system for start events of individual target container units. When the target container unit is monitored to start, the address allocation component starts to execute the IP address allocation flow. In particular, the address assignment component may check if there is a target IP address in the cache space that matches the namespace in which the target container unit is located. Cache space is a mechanism for storing commonly used or recently used IP addresses that may increase the efficiency of IP address allocation. Namespaces are one mechanism for logically isolating resources in a containerized cluster.
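A namespace-keyed cache lookup of this kind might look like the following Python sketch. The `IpCache` class and its methods are hypothetical, and the sketch assumes that a cached address is handed out at most once; the application does not specify how the cache space is organized.

```python
from collections import defaultdict, deque


class IpCache:
    """Per-namespace pool of cached IP addresses (hypothetical structure)."""

    def __init__(self):
        self._by_namespace = defaultdict(deque)

    def put(self, namespace, ip):
        """Cache an IP address under the given namespace."""
        self._by_namespace[namespace].append(ip)

    def take(self, namespace):
        """Remove and return a cached IP for the namespace, or None on a miss."""
        pool = self._by_namespace.get(namespace)
        if not pool:
            return None
        return pool.popleft()


cache = IpCache()
cache.put("trading", "192.168.1.100")

hit = cache.take("trading")    # cache hit: continue with Step 104
miss = cache.take("imaging")   # cache miss: continue with Step 105
print(hit, miss)
```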
Step 104, when the target IP address exists in the cache space, the address allocation component associates the target IP address with the target container unit, and stores the association information into a target object for managing identity information.
In this embodiment, when a target IP address matching the namespace in which the target container unit is located exists in the cache space, the address allocation component associates the target IP address with the target container unit; that is, the target container unit will use the target IP address for network communication. In addition, the address allocation component stores the association information of the target container unit and the target IP address in a target object that manages identity information. The target object may be a database or a specific data structure that records the correspondence between the target container unit and the target IP address. For example, if the cache space contains a target IP address matching the namespace in which a high-frequency transaction processing container unit is located, the address allocation component assigns that address, e.g., 192.168.1.100, to the target container unit, and stores the identification of the target container unit (e.g., container unit name, namespace) together with the target IP address as association information in the target object. In this way, during subsequent transaction data processing, other components can obtain the IP address of the target container unit through the target object and thereby communicate with it over the network.
Step 105, when the target IP address does not exist in the cache space, the address allocation component stores the unique identifier corresponding to the target container unit in a pending queue, reads a preset number of unique identifiers from the pending queue when a preset sending condition is met, generates a data acquisition request based on the preset number of unique identifiers, sends the data acquisition request to the cluster service interface, acquires the target IP address corresponding to each unique identifier from the IP address management module based on the target data corresponding to each unique identifier returned by the cluster service interface, associates each target IP address with its matching target container unit, and stores the association information in the cache space and in the target object for managing identity information.
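As a rough sketch of the association step, the target object can be modeled as a small in-memory registry keyed by namespace and container unit name. The `IdentityStore` class below is an assumption for illustration only; the application leaves the concrete form of the target object open (a database or a specific data structure).

```python
import ipaddress


class IdentityStore:
    """Hypothetical stand-in for the target object that manages identity information."""

    def __init__(self):
        self._records = {}

    def associate(self, namespace, unit_name, ip):
        """Record that the container unit (namespace, unit_name) uses ip."""
        ipaddress.ip_address(ip)  # raises ValueError if ip is not a valid address
        self._records[(namespace, unit_name)] = ip

    def lookup(self, namespace, unit_name):
        """Return the recorded IP for the container unit, or None if unknown."""
        return self._records.get((namespace, unit_name))


store = IdentityStore()
store.associate("trading", "hft-processor-0", "192.168.1.100")
print(store.lookup("trading", "hft-processor-0"))
```

Other components would then call `lookup` to find the address of a rebuilt container unit before communicating with it.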
In this embodiment, when no target IP address matching the namespace in which the target container unit is located exists in the cache space, the address allocation component stores the unique identifier corresponding to the target container unit in the pending queue. The unique identifier may be the name, ID, or similar attribute of the target container unit and uniquely identifies it. The pending queue is a temporary storage mechanism that holds the unique identifiers of target container units requiring further processing. When a preset sending condition is met (for example, the pending queue has accumulated a certain number of container unit identifiers, or a certain time interval has elapsed), the address allocation component reads a preset number of unique identifiers from the pending queue, for example in first-in, first-out order. The preset number can be adjusted according to the actual condition of the system to balance IP address allocation efficiency against system load. The address allocation component then generates a data acquisition request based on the unique identifiers it has read and sends the request to the cluster service interface. The cluster service interface is the interface of the containerized cluster that provides various services, and information related to the target container units can be obtained through it. The data acquisition request may include information such as the namespace and identifier of each target container unit, and is used to acquire the target data (e.g., tag data, annotation data) associated with the target container units. After receiving the target data returned by the cluster service interface, the address allocation component obtains, from the IP address management module, the target IP address corresponding to each unique identifier according to the target data. 
The IP address management module is responsible for managing and allocating the IP address resources of the whole containerized cluster, and allocates suitable IP addresses to container units according to defined rules and policies. After the target IP addresses are acquired, the address allocation component associates each target IP address with its matching target container unit and stores the association information in the cache space and in the target object for managing identity information. Thus, when container units are rebuilt later, the cache space can be searched first, which improves the efficiency of IP address allocation.
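The batching behaviour of Step 105 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `PendingQueue`, `process_batch`, the `fetch_target_data` and `allocate_ip` callables, and the use of plain dictionaries for the cache space and the target object are hypothetical names chosen here, not interfaces defined by the application or by any real cluster component.

```python
import time
from collections import deque


class PendingQueue:
    """FIFO queue of unique identifiers with a count-or-timeout sending condition."""

    def __init__(self, batch_size=3, max_wait_seconds=5.0, clock=time.monotonic):
        self.batch_size = batch_size
        self.max_wait_seconds = max_wait_seconds
        self._clock = clock
        self._items = deque()
        self._oldest_enqueued_at = None

    def add(self, unit_id):
        if not self._items:
            self._oldest_enqueued_at = self._clock()
        self._items.append(unit_id)

    def should_send(self):
        """True when enough identifiers are queued or the oldest has waited long enough."""
        if not self._items:
            return False
        if len(self._items) >= self.batch_size:
            return True
        return self._clock() - self._oldest_enqueued_at >= self.max_wait_seconds

    def next_batch(self):
        """Pop up to batch_size identifiers in first-in, first-out order."""
        count = min(self.batch_size, len(self._items))
        batch = [self._items.popleft() for _ in range(count)]
        self._oldest_enqueued_at = self._clock() if self._items else None
        return batch


def process_batch(queue, fetch_target_data, allocate_ip, cache, target_object):
    """Issue one data acquisition request for a whole batch, then allocate IPs."""
    if not queue.should_send():
        return {}
    batch = queue.next_batch()
    target_data = fetch_target_data(batch)  # single request covering the batch
    assigned = {}
    for unit_id in batch:
        ip = allocate_ip(unit_id, target_data[unit_id])
        cache[unit_id] = ip           # stored in the cache space
        target_object[unit_id] = ip   # stored in the target object
        assigned[unit_id] = ip
    return assigned


# Stand-ins for the cluster service interface and the IP address management module.
calls = []


def fake_fetch(unit_ids):
    calls.append(list(unit_ids))
    return {uid: {"labels": {}, "annotations": {}} for uid in unit_ids}


ip_pool = iter(["172.16.0.201", "172.16.0.202", "172.16.0.203"])


def fake_allocate(unit_id, data):
    return next(ip_pool)


queue = PendingQueue(batch_size=3)
cache, target_object = {}, {}
for unit_id in ["img-001", "img-002", "img-003"]:
    queue.add(unit_id)
assigned = process_batch(queue, fake_fetch, fake_allocate, cache, target_object)
print(assigned)
```

Three queued identifiers lead to one call to the cluster service interface stand-in rather than three, which is the request-frequency reduction the embodiment describes.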
In one embodiment in the healthcare field, when no target IP address matching a medical image processing container unit exists in the cache space, the address allocation component stores the unique identifier of the container unit (e.g., container unit-image processing-department a-001) in the pending queue to await further processing. When the unique identifiers of 3 medical image processing container units are present in the pending queue, the preset sending condition is met. The address allocation component reads the 3 unique identifiers, generates a data acquisition request, and sends it to the cluster service interface to acquire the target data required for the subsequent allocation of target IP addresses. Based on the unique identifiers of the 3 medical image processing container units and the target data, the address allocation component obtains the corresponding target IP addresses, such as 172.16.0.201, 172.16.0.202, etc., from the IP address management module. The address allocation component associates these IP addresses with the corresponding container units, ensuring that each target container unit has an available target IP address, and stores the association information of each medical image processing container unit and its target IP address in the cache space and the target object. For example, the target object records that the target IP address corresponding to container unit-image processing-department a-001 is 172.16.0.201. Other medical system components can obtain the target IP address of a target container unit by querying the target object, which facilitates data transmission and processing.
After successful allocation of the target IP address to the target container unit, the target container unit is successfully rebuilt.
By applying the technical solution of this embodiment, a caching mechanism is introduced so that the target IP address is first obtained from the cache space when an IP address is allocated, which can greatly improve the efficiency of obtaining IP addresses. When the target IP address is not in the cache space, the target IP addresses of a batch of target container units are obtained from the IP address management module at a time, so that multiple requests are merged into one batch request. This reduces the frequency of requests to the cluster service interface and the IP address management module, effectively prevents those two components from being blocked by frequent requests, and improves the success rate of IP address allocation.
In the embodiment of the present application, optionally, the target data includes tag data and annotation data. After the target data corresponding to each unique identifier returned by the cluster service interface is received, the method further includes: the address allocation component parses the tag data and the annotation data and determines, according to the parsing result, a plurality of network policies matching the target container unit; invokes a network policy analysis model and analyzes each network policy with the model to obtain the access object corresponding to that network policy and the access action with respect to that access object; determines, according to the access object and access action of each network policy, whether an access conflict exists among the network policies; and, when an access conflict exists, triggers an automatic repair mechanism to obtain the final network policy corresponding to the target container unit.
In this embodiment, during the reconstruction of the container unit, the address allocation component is not only responsible for allocating an IP address to the target container unit, but may also generate a network policy for the target container unit that meets the security requirements. Wherein the target data obtained from the cluster service interface may include tag data and annotation data. Such data is typically associated with the configuration and policy of the target container unit and is stored in a containerized platform such as Kubernetes. The tag data is used to identify attributes (e.g., environment, application, etc.) of the target container unit, and the annotation data is used to store additional information (e.g., version, configuration details, etc.). After receiving the target data corresponding to each unique identifier returned by the cluster service interface, the address allocation component can analyze the tag data and the annotation data in the target data to extract information related to the network policy, so that the network policy matched with the target container unit is determined according to the information. Such information may include attributes of the application, environment, security group, etc. to which the target container unit belongs, which are used to determine applicable network policies. Next, based on the parsing result, the address allocation component determines a plurality of network policies from the plurality of existing network policies that match the target container unit. The network policy defines which resources the target container unit can access and how to access those resources.
The address allocation component then invokes a network policy analysis model, which is used to analyze the specifics of each network policy. The network policy analysis model may be a rule-based analysis engine or a machine-learning-based model for understanding and enforcing policy rules. Through the network policy analysis model, the address allocation component analyzes each network policy separately to obtain the access object corresponding to each network policy (i.e., the target resource to which access is allowed or denied) and the access action corresponding to that access object (e.g., allow, deny, limit, etc.). According to the access object and the access action corresponding to each network policy, the address allocation component judges whether an access conflict exists among the network policies. For example, one policy may allow access to a resource, while another policy may deny access to the same resource. If there is an access conflict, the address allocation component may trigger an automatic repair mechanism to resolve the conflict and determine the final network policy.
Here, the automatic repair mechanism may resolve the conflict based on predefined rules or policy priorities. For example, conflicts may be resolved based on the priority of the policy (e.g., a high-priority policy overrides a low-priority policy) or the strictness of the policy (e.g., a deny rule takes precedence over an allow rule). Through the automatic repair mechanism, the network policy applicable to the target container unit is finally determined, which ensures that the network access of the target container unit meets the security requirements while avoiding access problems caused by policy conflicts.
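The conflict check and the priority-based repair described above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the `PolicyRule` structure, the numeric `priority` field (larger means higher priority) and the tie-break in which a deny rule wins over an allow rule of equal priority are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """Hypothetical parsed form of one network policy."""
    name: str
    target: str    # access object, e.g. a database or service name
    action: str    # access action: "allow" or "deny"
    priority: int  # assumption: larger value = higher priority


def find_conflicts(rules):
    """Return pairs of rules that name the same access object with different actions."""
    conflicts = []
    for i, first in enumerate(rules):
        for second in rules[i + 1:]:
            if first.target == second.target and first.action != second.action:
                conflicts.append((first, second))
    return conflicts


def _rank(rule):
    # Higher priority wins; at equal priority a deny rule outranks an allow rule.
    return (rule.priority, 1 if rule.action == "deny" else 0)


def resolve(rules):
    """Keep one final rule per access object according to _rank."""
    final = {}
    for rule in rules:
        current = final.get(rule.target)
        if current is None or _rank(rule) > _rank(current):
            final[rule.target] = rule
    return final
```

With this sketch, `resolve` would only need to be called when `find_conflicts` returns a non-empty list; a real repair mechanism could equally refine the rules (for example by adding exceptions) instead of discarding the losing rule.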
In one specific embodiment in the financial field, the tag data may include "app=online-banking", "env=production", etc., and the annotation data may include "version=2.0", "security-level=high", etc. The address allocation component parses the tag data and the annotation data to determine a plurality of network policies that match the transaction container unit. For example: Policy 1 allows container units with "app=online-banking" and "env=production" to access the core transaction database. Policy 2 denies container units with "security-level=low" access to sensitive data (the transaction container unit has "security-level=high", so this policy does not directly conflict). Policy 3 restricts access from external networks and only allows access from specific IP ranges (e.g., the bank's internal network). The address allocation component invokes the network policy analysis model to analyze each network policy and obtain the access object and the access action corresponding to each policy. During the analysis, the address allocation component finds that policy 1 and policy 3 have a potential access conflict: policy 1 allows internal container units to access the core transaction database, while policy 3 restricts access from external networks. Although the transaction container unit is an internal container unit, it is necessary to ensure that the restrictions of policy 3 do not inadvertently block legitimate internal access. To resolve the conflict, the address allocation component triggers the automatic repair mechanism. The mechanism may include: checking policy priority, where, assuming policy 1 has a higher priority than policy 3, the allow rule of policy 1 overrides the restriction rule of policy 3 (only for legitimate internal access); and refining access rules, where an exception rule is added to policy 3 that explicitly allows access from container units with "app=online-banking" and "env=production" on the internal network.
Through the automatic repair mechanism, the address allocation component determines the final network policy applicable to the transaction container unit, ensuring that it can access the core transaction database and meet the security requirements of the bank.
The embodiment of the application determines the applicable network policies by parsing the tag data and the annotation data, and uses the network policy analysis model to analyze the policies so as to detect conflicts among them. Through the automatic repair mechanism, a conflict among the network access policies of the target container unit can be resolved in a reasonable way, so that the security and stability of the network access of the target container unit in the containerized cluster are ensured.
In the embodiment of the application, the containerized cluster further comprises a cluster monitoring component, and the method further comprises: the cluster monitoring component acquires load usage data of each node in the containerized cluster according to a preset frequency, and calculates the current total load of the containerized cluster based on the load usage data of each node; adjusts the sub-resource quota for each container unit in the containerized cluster according to the current total load and the total resources of the containerized cluster; if the adjusted sub-resource quota is smaller than a preset quota threshold, determines, when the reconstruction of a container unit is monitored, a first resource occupied by the reconstructed container unit and a second resource occupied by the created container unit; when the second resource comprises the first resource, releases the occupation of the first resource by the created container unit; and, after the reconstruction of the container unit is finished, restores the occupation of the first resource by the created container unit.
In this embodiment, the containerized cluster further includes a cluster monitoring component, configured to dynamically adjust the sub-resource quota of each container unit in the cluster and to perform resource management during the reconstruction of a container unit, so as to ensure reasonable utilization of cluster resources and smooth reconstruction of the container unit. Specifically, the cluster monitoring component obtains load usage data for each node in the containerized cluster at a preset frequency (e.g., every minute or every second). Such data includes CPU usage, memory usage, storage usage, network bandwidth usage, and the like. The cluster monitoring component then calculates the current total load of the containerized cluster based on the obtained load usage data. The total load may be obtained by weighted averaging or summing the individual node loads. The cluster monitoring component adjusts the sub-resource quota for each container unit within the containerized cluster based on the current total load and the total resources of the containerized cluster. For example, if the current total load is high, the cluster monitoring component may decrease the sub-resource quota of the container units to free up resources for the remaining services or applications, and if the current total load is low, the cluster monitoring component may increase the sub-resource quota of the container units. Here, the sub-resource quota refers to the resource quota for all container units, including container units that have been created and container units that have not yet been created. If the adjusted sub-resource quota is smaller than the preset quota threshold, the sub-resource quota cannot satisfy the resource occupation of the container units in the containerized cluster, and the cluster monitoring component therefore monitors for container unit reconstruction operations.
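The load aggregation and quota adjustment just described might look like the following sketch. The text only states that the total load may be a weighted average or a sum and that the quota shrinks as load rises; the concrete formula here (splitting the unused share of the cluster evenly across container units) and all function names are assumptions made for illustration.

```python
def total_load(node_loads, weights=None):
    """Weighted average of per-node load fractions in the range 0.0-1.0.

    node_loads: dict mapping node name -> load fraction.
    weights: optional dict mapping node name -> weight (defaults to equal weights).
    """
    if not node_loads:
        raise ValueError("no node load samples")
    if weights is None:
        weights = {node: 1.0 for node in node_loads}
    weight_sum = sum(weights[node] for node in node_loads)
    return sum(load * weights[node] for node, load in node_loads.items()) / weight_sum


def adjust_quota(total_resources, load, unit_count):
    """Assumed rule: share the currently unused resources evenly among container units."""
    headroom = total_resources * (1.0 - load)
    return headroom / unit_count


def rebuild_monitoring_needed(quota, quota_threshold):
    """True when the adjusted quota has fallen below the preset quota threshold."""
    return quota < quota_threshold
```

Under these assumptions a higher total load yields a smaller quota, which matches the behaviour described in the text; any other monotonically decreasing rule would serve equally well.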
When the cluster monitoring component detects that there is a container unit to be rebuilt, it may begin executing a resource management flow, in which the cluster monitoring component determines a first resource (e.g., a particular CPU core, memory block, etc.) occupied by the rebuilt container unit and a second resource occupied by the created container unit. If the first resource is included in the second resource, there is a conflict between the resource needed by the rebuilt container unit and the resource held by the created container unit. At this point, the cluster monitoring component releases the occupation of the first resource by the created container unit, to ensure that the rebuilt container unit can obtain the resource it requires. After the reconstruction of the container unit is finished, the cluster monitoring component restores the occupation of the first resource by the created container unit, so that normal resource allocation resumes.
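The release-and-restore flow can be illustrated with a small context manager. This is a sketch under stated assumptions: resources are modelled as sets of string identifiers, the occupation table is a plain dictionary, and the overlap is tracked per created container unit; none of these structures come from the source.

```python
from contextlib import contextmanager


@contextmanager
def rebuild_window(rebuild_needs, occupied_by_created):
    """Temporarily release resources that a rebuilding container unit needs.

    rebuild_needs: set of resource ids needed by the rebuilt unit (the first resource).
    occupied_by_created: dict mapping unit name -> set of resource ids it holds
        (the second resource); the sets are modified in place.
    """
    released = {}
    for unit, held in occupied_by_created.items():
        overlap = held & rebuild_needs
        if overlap:
            released[unit] = overlap
            held.difference_update(overlap)  # release the occupation
    try:
        yield released
    finally:
        # Restore the occupation once the rebuild has finished (or failed).
        for unit, overlap in released.items():
            occupied_by_created[unit].update(overlap)
```

A caller would wrap the rebuild in `with rebuild_window(needs, table): ...`; using `finally` means the original occupation is restored even if the rebuild raises an error, which is a design choice of this sketch rather than something the text specifies.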
According to the embodiment of the application, a cluster monitoring component is introduced, so that the resources of the containerized cluster are dynamically monitored and adjusted, ensuring efficient utilization of resources and smooth reconstruction of container units. When resources are scarce, the mechanism of releasing and restoring resource occupation ensures that the reconstruction of the container unit proceeds normally, which improves the reliability and stability of the system.
In the embodiment of the application, the method further comprises: the cluster scheduler obtains historical container unit reconstruction data according to a first preset time interval, predicts a container unit to be created based on the historical container unit reconstruction data, determines a predicted deployment node corresponding to the container unit to be created, and sends a base image preloading instruction to the predicted deployment node; and the predicted deployment node, after receiving the base image preloading instruction, performs a preloading operation on its stored base images. Correspondingly, in step 102, the target node creating a target container unit after receiving the container unit reconstruction instruction comprises: after receiving the container unit reconstruction instruction, the target node determines a target container unit type according to the container unit reconstruction instruction, pulls a target base image corresponding to the target container unit type from the plurality of preloaded base images, and creates the target container unit on the target node based on the target base image.
In this embodiment, predictive scheduling and a base image preloading mechanism are introduced in the containerized cluster to improve the efficiency and response speed of container unit reconstruction. Specifically, the cluster scheduler obtains historical container unit reconstruction data at a first preset time interval (e.g., every hour or every day). These data may be stored in log files or databases. By analyzing the historical container unit reconstruction data, the cluster scheduler identifies common patterns and trends of container unit reconstruction. For example, certain applications may be rebuilt frequently within a certain period of time, or may require rebuilding after a system update. Next, based on the analysis of the historical container unit reconstruction data, the cluster scheduler predicts the container units that may need to be created over a future period of time. These container units may need to be rebuilt for reasons such as increased load, failure recovery, or system updates. In addition, the cluster scheduler may also determine the possible deployment nodes for these container units to be created. These nodes are typically selected based on factors such as resource availability, geographic location, historical load, etc. The cluster scheduler sends a base image preloading instruction to the predicted deployment node. The instruction tells the node to perform a preloading operation on the base images it has already stored. After the predicted deployment node receives the instruction, it may preload all of its cached base images, so that the preloaded base images can be used directly when a container unit is subsequently rebuilt, saving the image loading time at that point.
Subsequently, when the target node actually receives a container unit rebuild instruction, it may determine the target container unit type from the instruction. The target node pulls a target base image corresponding to the target container unit type from the preloaded plurality of base images. This process can be done quickly since the target base image has already been preloaded. Further, the target node creates a target container unit locally based on the target base image. Since the target base image is already preloaded, the creation speed of the container unit is significantly increased, reducing the delay of the reconstruction.
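A toy version of the prediction and preloading steps is sketched below. The source does not fix a prediction method, so simple frequency counting over past rebuild records stands in for it here; the `Node` class, its fields and the `image_source` marker are hypothetical names used only to make the flow observable.

```python
from collections import Counter


def predict_deployment_nodes(history, top_k=1):
    """Pick the nodes that hosted the most rebuilds in the history.

    history: iterable of (unit_type, node_name) rebuild records.
    Frequency counting is an assumed stand-in for the prediction step.
    """
    counts = Counter(node for _unit_type, node in history)
    return [node for node, _count in counts.most_common(top_k)]


class Node:
    """Hypothetical node holding stored base images and a preloaded set."""

    def __init__(self, name, stored_images):
        self.name = name
        self.stored_images = dict(stored_images)  # unit type -> stored base image
        self.preloaded = {}                       # unit type -> preloaded base image

    def preload(self):
        """Handle a base image preloading instruction: preload every stored image."""
        self.preloaded.update(self.stored_images)

    def create_unit(self, unit_type):
        """Create a container unit, preferring an already preloaded base image."""
        if unit_type in self.preloaded:
            image, source = self.preloaded[unit_type], "preloaded"
        else:
            image, source = self.stored_images[unit_type], "loaded-on-demand"
        return {"type": unit_type, "image": image, "image_source": source}
```

In this sketch the scheduler would call `preload()` on each node returned by `predict_deployment_nodes`, so that a later `create_unit` call on that node finds its base image already preloaded.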
By introducing predictive scheduling and a base image preloading mechanism, the embodiment of the application can significantly improve the efficiency and response speed of container unit reconstruction in the containerized cluster. Predictive scheduling analyzes historical data to identify in advance the container units that may need to be rebuilt, and the base image preloading mechanism reduces the image loading time when a container unit is created.
In the embodiment of the application, the method further comprises: the address allocation component counts the monitored number of container unit starts, the IP address allocation success rate and the current resource occupancy rate according to a second preset time interval; if the monitored number of container unit starts is greater than a first preset number and/or the IP address allocation success rate is smaller than a preset success rate threshold, judges the relation between the current resource occupancy rate and a preset occupancy rate threshold; and if the current resource occupancy rate is greater than the preset occupancy rate threshold, triggers an IP address pre-allocation mechanism, so as to pre-generate IP addresses through the IP address pre-allocation mechanism and store them in an IP address resource pool. Correspondingly, after the IP addresses are pre-generated through the IP address pre-allocation mechanism and stored in the IP address resource pool, when the address allocation component monitors that a new target container unit is started and the target IP address of the new target container unit does not exist in the cache space, the address allocation component obtains a pre-generated IP address from the IP address resource pool as the target IP address of the new target container unit.
In this embodiment, by introducing an IP address pre-allocation mechanism, the IP address allocation process of container units in the containerized cluster is optimized, so as to improve the efficiency and reliability of address allocation. Specifically, at a second preset time interval (e.g., every minute or every hour), the address allocation component collects three statistics. It counts the number of container unit starts, i.e., the number of container units newly started (newly rebuilt) in the statistical period. It calculates the IP address allocation success rate, i.e., the ratio of the number of container units successfully allocated an IP address in the statistical period to the total number of container units started. It also evaluates the current resource occupancy rate, i.e., the current resource usage of the address allocation component.
The address allocation component then makes a judgment based on these statistics. If the number of container unit starts exceeds the first preset number (e.g., more than 100 per minute), the address allocation component is under high load, and more pre-generated IP addresses may be needed to relieve the pressure. If the IP address allocation success rate is below the preset success rate threshold (e.g., below 95%), the current IP address allocation procedure is experiencing problems and may need to be optimized. If the current resource occupancy rate of the address allocation component is greater than the preset occupancy rate threshold (e.g., greater than 80%), the resource occupancy of the address allocation component is close to saturation.
Specifically, if the monitored number of container unit starts is greater than the first preset number and/or the IP address allocation success rate is smaller than the preset success rate threshold, the relation between the current resource occupancy rate and the preset occupancy rate threshold is further judged. If the current resource occupancy rate is greater than the preset occupancy rate threshold, the IP address pre-allocation mechanism may be triggered, by which the address allocation component pre-generates a number of IP addresses and stores them in an IP address resource pool. The pre-generated IP addresses may then be assigned directly to container units, reducing the latency and resource contention of real-time allocation. For example, in the financial field, during peak transaction periods a large number of new transaction container units need to be started quickly and allocated IP addresses to ensure timely processing of transactions, and pre-generated IP addresses help avoid transaction delays or failures. As another example, a medical health system needs to process a large amount of real-time data, such as patient vital sign monitoring and medical image analysis, and places very high demands on response speed and data processing capacity. In urgent situations such as emergency treatment or surgery, new medical container units need to be started quickly to process data or provide services, so the IP address allocation process must be fast and reliable, which pre-generated IP addresses help to ensure.
After the IP address resource pool is established, the address allocation component continues to monitor for start events for new target container units. When a new target container unit is started and its target IP address does not exist in the cache space, the address allocation component may obtain a pre-generated IP address from the IP address resource pool and allocate it to the container unit. After allocation, the address allocation component updates the IP address resource pool, ensuring that there are always enough IP addresses in the resource pool to allocate.
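The trigger condition and the pool lookup can be sketched as follows. The threshold defaults reuse the example values from the text (100 starts, 95%, 80%); the `IPPool` class, the CIDR-based address source and the cache keyed by a unit identifier are assumptions for illustration (the embodiment matches cached addresses by namespace).

```python
import ipaddress
from collections import deque


def should_preallocate(starts, success_rate, occupancy,
                       max_starts=100, min_success_rate=0.95, max_occupancy=0.80):
    """Two-stage check: start count and/or success rate first, then occupancy."""
    under_pressure = starts > max_starts or success_rate < min_success_rate
    return under_pressure and occupancy > max_occupancy


class IPPool:
    """Hypothetical IP address resource pool fed from one CIDR block."""

    def __init__(self, cidr):
        # iter() keeps this working whether hosts() returns a generator or a list.
        self._source = iter(ipaddress.ip_network(cidr).hosts())
        self._pool = deque()

    def pregenerate(self, count):
        """Pre-generate up to `count` addresses; return the pool size afterwards."""
        for _ in range(count):
            address = next(self._source, None)
            if address is None:
                break  # the CIDR block is exhausted
            self._pool.append(str(address))
        return len(self._pool)

    def take(self):
        """Return one pre-generated address, or None if the pool is empty."""
        return self._pool.popleft() if self._pool else None


def address_for(unit_key, cache, pool):
    """Use the cached address if present, otherwise draw one from the pool."""
    if unit_key in cache:
        return cache[unit_key]
    address = pool.take()
    if address is not None:
        cache[unit_key] = address
    return address
```

When `take()` returns `None`, a fuller implementation would fall back to the regular allocation path through the IP address management module and replenish the pool; that part is omitted here.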
By introducing an IP address pre-allocation mechanism, the embodiment of the application can effectively meet the IP address allocation demand placed on the address allocation component under high load, and reduce the latency and resource contention of real-time allocation.
In the embodiment of the application, optionally, after receiving the target data corresponding to each unique identifier returned by the cluster service interface, the method further comprises: if any unique identifier does not successfully acquire a target IP address from the IP address management module, triggering a retry mechanism, and acquiring the target IP address from the IP address management module again through the retry mechanism based on the target data corresponding to that unique identifier, until the target IP address corresponding to that unique identifier is successfully acquired or the number of retries reaches a second preset number. Correspondingly, if the number of retries reaches the second preset number and the target IP address corresponding to that unique identifier has still not been successfully acquired, that unique identifier is added to a processing failure queue.
In this embodiment, by introducing a retry mechanism and a processing failure queue, the reliability and fault tolerance of the IP address allocation process are enhanced. Specifically, when target IP addresses are acquired from the IP address management module, if the target container unit corresponding to a certain unique identifier fails to acquire its target IP address, a retry mechanism may be triggered. The retry mechanism may retry acquiring the target IP address according to a preset retry strategy (e.g., interval time, number of retries, etc.). In addition, a maximum number of retries (i.e., the second preset number) may be set to prevent the resource waste and system performance degradation caused by unlimited retries. After each retry, it may be checked whether the current number of retries has reached the second preset number. If the target IP address is successfully acquired before the second preset number is reached, the retry process terminates immediately and the task continues to execute normally. If the number of retries reaches the second preset number and the target IP address has still not been acquired, the unique identifier may be added to the processing failure queue. Here, the processing failure queue records the unique identifiers for which IP address allocation was unsuccessful, for subsequent manual inspection, repair, or another allocation attempt.
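The bounded retry with a failure queue can be sketched as below. The `fetch_ip` callable stands in for the call to the IP address management module and is hypothetical, as are the parameter names; the sketch assumes one initial attempt followed by at most `max_retries` retries, with a fixed optional delay between attempts.

```python
import time


def acquire_with_retry(unit_id, target_data, fetch_ip, failure_queue,
                       max_retries=3, delay_seconds=0.0):
    """Acquire a target IP address, retrying a bounded number of times.

    fetch_ip(unit_id, target_data) returns an address string or None on failure.
    max_retries plays the role of the second preset number.
    """
    for attempt in range(max_retries + 1):  # first attempt plus the retries
        address = fetch_ip(unit_id, target_data)
        if address is not None:
            return address  # success ends the retry process immediately
        if attempt < max_retries and delay_seconds > 0:
            time.sleep(delay_seconds)  # fixed interval between attempts
    # Retries exhausted: record the identifier for later inspection or reallocation.
    failure_queue.append(unit_id)
    return None
```

A fixed delay is the simplest retry strategy; exponential backoff would fit the same structure by growing `delay_seconds` on each pass.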
By introducing a retry mechanism and a processing failure queue, the embodiment of the application can significantly improve the reliability and fault tolerance of the IP address allocation process. When IP address allocation fails, the system can automatically recover or record the failure information for subsequent processing, which improves overall stability and user experience.
In the embodiment of the application, optionally, the step of determining the target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information comprises: obtaining the CPU usage rate, available memory amount, node health state and node schedulable state corresponding to each node in the containerized cluster; determining the storage space requirement and the CPU usage requirement corresponding to the target container unit to be created according to the container unit reconstruction requirement information; screening, from the nodes, a plurality of first nodes that are in a healthy state and a schedulable state according to the node health state and the node schedulable state corresponding to each node; and determining the target node from the plurality of first nodes according to the storage space requirement and the CPU usage requirement corresponding to the target container unit, and the CPU usage rate and the available memory amount corresponding to each first node.
In this embodiment, the cluster scheduler may obtain the CPU usage rate, the available memory amount, the node health state, and the node schedulable state of each node in the containerized cluster. This information is used to evaluate the current state of each node and the availability of its resources. Specifically, the CPU usage rate indicates the current CPU resource occupancy of a node and is typically expressed as a percentage; a high CPU usage rate may mean that the node is heavily loaded. The available memory amount indicates the memory resources currently available on the node and is typically expressed in GB or MB; a low available memory amount may lead to memory shortage. The node health state indicates the operational status of the node, such as healthy, unhealthy or unknown, and is typically determined by heartbeat detection or a health check mechanism. The node schedulable state indicates whether the node can accept new container unit scheduling requests; a node may be unschedulable because of maintenance, resource shortage or other reasons.
In addition, the cluster scheduler may also determine the storage space requirement and the CPU usage requirement of the target container unit to be created based on the container unit reconstruction requirement information. These requirements may be determined from the configuration file or the historical usage data of the container unit. The storage space requirement indicates the storage space required by the target container unit, which is usually used for persistent data or temporary storage. The CPU usage requirement indicates the CPU resources required by the target container unit, usually expressed as a number of CPU cores or millicores.
The cluster scheduler then selects, from among the nodes, all nodes that are both healthy and schedulable (i.e., the first nodes) based on the health state and the schedulable state of each node. This step excludes nodes that are unavailable or unsuitable for scheduling. Further, among the selected first nodes, the most suitable node is chosen as the target node according to the storage space requirement and the CPU usage requirement of the target container unit, and the CPU usage rate and the available memory amount of each first node. Storage space matching ensures that the node has enough storage space to meet the requirement of the target container unit, and CPU matching ensures that the node has enough CPU to support the operation of the target container unit.
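The two-stage selection (filter on health and schedulability, then match resources) can be sketched as follows. Several details are assumptions for illustration: the `NodeState` fields, the `cpu_cores` capacity used to turn a usage rate into free cores, and the final ranking that prefers the least-loaded node. Following the text, the storage space requirement is compared against the node's available memory amount.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical snapshot of one node as seen by the cluster scheduler."""
    name: str
    cpu_usage: float          # CPU usage rate as a fraction 0.0-1.0
    cpu_cores: float          # assumed total CPU capacity in cores
    available_memory_mb: int  # available memory amount
    healthy: bool
    schedulable: bool


def pick_target_node(nodes, cpu_request_cores, storage_request_mb):
    """Return the chosen target node, or None if no node fits."""
    # Stage 1: keep only nodes that are healthy and schedulable (the first nodes).
    first_nodes = [n for n in nodes if n.healthy and n.schedulable]
    # Stage 2: keep the first nodes whose free CPU and available memory cover the request.
    fits = [
        n for n in first_nodes
        if n.cpu_cores * (1.0 - n.cpu_usage) >= cpu_request_cores
        and n.available_memory_mb >= storage_request_mb
    ]
    if not fits:
        return None
    # Assumed ranking: lowest CPU usage first, then most available memory.
    return min(fits, key=lambda n: (n.cpu_usage, -n.available_memory_mb))
```

Other rankings (for example bin-packing onto the fullest node that still fits) would plug into the last line without changing the filtering stages.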
Further, as a specific implementation of the method of fig. 1, an embodiment of the present application provides a container unit reconstruction system for a containerized cluster, as shown in fig. 2, where the system includes:
the cluster scheduler is used for acquiring the container unit reconstruction demand information after receiving the container unit reconstruction request, determining a target node from a plurality of nodes included in the containerized cluster according to the container unit reconstruction demand information, and sending a container unit reconstruction instruction to the target node;
the target node is used for creating a target container unit and starting after receiving the container unit reconstruction instruction;
the address allocation component is used for: after monitoring that the target container unit is started, judging whether a target IP address matched with the namespace where the target container unit is located exists in a cache space; when the target IP address exists in the cache space, associating the target IP address with the target container unit, and storing the association information in a target object for managing identity information; when the target IP address does not exist in the cache space, storing the unique identifier corresponding to the target container unit in a queue to be processed; when a preset sending condition is met, reading a preset number of unique identifiers from the queue to be processed, generating a data acquisition request based on the preset number of unique identifiers, and sending the data acquisition request to the cluster service interface; acquiring, from the IP address management module, the target IP address corresponding to each unique identifier based on the target data corresponding to each unique identifier returned by the cluster service interface; and associating each target IP address with the matched target container unit, and storing the association information in the cache space and in the target object for managing identity information.
Optionally, the target data comprises tag data and annotation data, and the address allocation component is further configured to:
After receiving the target data corresponding to each unique identifier returned by the cluster service interface, parsing the tag data and the annotation data, and determining a plurality of network policies matched with the target container unit according to the parsing result; invoking a network policy analysis model, and performing policy analysis on each network policy through the network policy analysis model to obtain the access object corresponding to the network policy and the access action with respect to that access object; judging whether an access conflict exists among the network policies according to the access object and the access action corresponding to each network policy; and, when an access conflict exists, triggering an automatic repair mechanism, and obtaining a final network policy corresponding to the target container unit based on the automatic repair mechanism.
Optionally, the containerized cluster further comprises a cluster monitoring component, wherein the cluster monitoring component is further configured to:
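Acquiring load usage data of each node in the containerized cluster according to a preset frequency, and calculating the current total load of the containerized cluster based on the load usage data of each node; adjusting the sub-resource quota for each container unit in the containerized cluster according to the current total load and the total resources of the containerized cluster; if the adjusted sub-resource quota is smaller than a preset quota threshold, determining, when the reconstruction of a container unit is monitored, a first resource occupied by the reconstructed container unit and a second resource occupied by the created container unit; when the second resource comprises the first resource, releasing the occupation of the first resource by the created container unit; and, after the reconstruction of the container unit is finished, restoring the occupation of the first resource by the created container unit.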
Optionally, the cluster scheduler is further configured to obtain historical container unit reconstruction data according to a first preset time interval, predict a container unit to be created based on the historical container unit reconstruction data, determine a predicted deployment node corresponding to the container unit to be created, and send a base image preloading instruction to the predicted deployment node;
the predicted deployment node is used for performing a preloading operation on its stored base images after receiving the base image preloading instruction;
Correspondingly, the target node is further configured to:
After receiving the container unit reconstruction instruction, determining a target container unit type according to the container unit reconstruction instruction, pulling a target base image corresponding to the target container unit type from the plurality of preloaded base images, and creating a target container unit on the target node based on the target base image.
Optionally, the address allocation component is further configured to:
Counting the monitored container unit starting times, IP address allocation success rate and current resource occupancy rate according to a second preset time interval, judging the relation between the current resource occupancy rate and a preset occupancy rate threshold value if the monitored container unit starting times are greater than a first preset times and/or the IP address allocation success rate is smaller than the preset success rate threshold value, and triggering an IP address pre-allocation mechanism if the current resource occupancy rate is greater than the preset occupancy rate threshold value so as to pre-generate an IP address through the IP address pre-allocation mechanism and store the IP address in an IP address resource pool;
Accordingly, the address allocation component is further configured to:
After the IP address is pre-generated through the IP address pre-allocation mechanism and stored in an IP address resource pool, when the start of a new target container unit is monitored and the target IP address of the new target container unit does not exist in the cache space, the pre-generated IP address is obtained from the IP address resource pool and is used as the target IP address of the new target container unit.
Optionally, the address allocation component is further configured to:
After receiving the target data corresponding to each unique identifier returned by the cluster service interface, if any unique identifier does not successfully acquire a target IP address from the IP address management module, triggering a retry mechanism, acquiring the target IP address from the IP address management module again through the retry mechanism based on the target data corresponding to any unique identifier until the target IP address corresponding to any unique identifier is successfully acquired, or ending when the retry times reach a second preset times;
Accordingly, the address allocation component is further configured to:
If the number of retries reaches the second preset number and the target IP address corresponding to that unique identifier has still not been successfully obtained, adding that unique identifier to a processing failure queue.
Optionally, the cluster scheduler is further configured to:
Acquire the CPU utilization rate, available memory amount, node health state, and node schedulable state corresponding to each node in the containerized cluster, and determine, according to the container unit reconstruction requirement information, the storage space requirement and the CPU usage requirement corresponding to the target container unit to be created;
Screen out, based on the node health state and node schedulable state corresponding to each node, a plurality of first nodes that are both healthy and schedulable, and determine a target node from the plurality of first nodes according to the storage space requirement and CPU usage requirement corresponding to the target container unit and the CPU utilization rate and available memory amount corresponding to each first node.
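The two-stage node selection above (filter, then choose) might look like the following sketch. The text does not state how the remaining candidates are ranked, so the ranking used here (lowest CPU utilization, then most available memory) is an assumption, as are the names `Node`, `Demand` and `pick_target_node`, the expression of the CPU requirement as a fraction of one node's CPU, and the comparison of the storage space requirement against available memory.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    name: str
    cpu_usage: float           # current CPU utilization, 0.0 .. 1.0
    available_memory_mb: int
    healthy: bool
    schedulable: bool


@dataclass(frozen=True)
class Demand:
    """Requirements of the target container unit (units are assumptions)."""
    cpu_fraction: float        # share of one node's CPU the unit needs
    memory_mb: int             # space requirement, checked against free memory


def pick_target_node(nodes: Iterable[Node], demand: Demand) -> Optional[Node]:
    # Stage 1: keep only the "first nodes" that are healthy and schedulable.
    first_nodes = [n for n in nodes if n.healthy and n.schedulable]
    # Stage 2: keep the first nodes that can actually hold the new unit.
    fitting = [
        n for n in first_nodes
        if n.cpu_usage + demand.cpu_fraction <= 1.0
        and n.available_memory_mb >= demand.memory_mb
    ]
    if not fitting:
        return None
    # Assumed ranking: least loaded CPU first, then most available memory.
    return min(fitting, key=lambda n: (n.cpu_usage, -n.available_memory_mb))
```

Returning `None` when no node fits leaves it to the caller to decide whether to queue the reconstruction request or report a scheduling failure.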
It should be noted that, for other descriptions of the functional units of the container unit reconstruction system for a containerized cluster in the embodiment of the present application, reference may be made to the corresponding descriptions of the method of FIG. 1, which are not repeated here.
The embodiment of the application also provides a computer device, which may be a personal computer, a server, a network device, or the like. As shown in FIG. 3, the computer device comprises a bus, a processor, a memory, and a communication interface, and may also comprise an input/output interface and a display device. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for running the operating system and the computer programs stored in the non-volatile storage medium. The database of the computer device is used for storing location information. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by the processor to implement the steps in the method embodiments.
It will be appreciated by those skilled in the art that the structure shown in FIG. 3 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, a computer readable storage medium is provided, which may be non-volatile or volatile, and on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
The user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, and that the program, when executed, may carry out the steps of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. The volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database and the like. The processor referred to in the embodiments provided in the present application may be, but is not limited to, a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of the application should be assessed as that of the appended claims.

Claims (10)

1. A method for reconstructing a container unit in a containerized cluster, wherein the containerized cluster comprises a cluster scheduler, an address allocation component, a cluster service interface, an IP address management module, and a plurality of nodes; the method comprises:
after receiving a container unit reconstruction request, the cluster scheduler obtains container unit reconstruction requirement information, determines a target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information, and sends a container unit reconstruction instruction to the target node;
after receiving the container unit reconstruction instruction, the target node creates and starts a target container unit;
after monitoring the startup of the target container unit, the address allocation component determines whether a target IP address matching the namespace in which the target container unit is located exists in a cache space;
when the target IP address exists in the cache space, the address allocation component associates the target IP address with the target container unit and stores the association information in a target object that manages identity information;
when the target IP address does not exist in the cache space, the address allocation component stores the unique identifier corresponding to the target container unit in a pending queue; when a preset sending condition is met, reads a preset number of unique identifiers from the pending queue, generates a data acquisition request based on the preset number of unique identifiers, and sends the data acquisition request to the cluster service interface, so as to obtain, based on the target data corresponding to each unique identifier returned by the cluster service interface, the target IP address corresponding to each unique identifier from the IP address management module; associates each target IP address with the matching target container unit; and stores the association information in the cache space and in the target object that manages identity information.

2. The method according to claim 1, wherein the target data comprises label data and annotation data; after receiving the target data corresponding to each unique identifier returned by the cluster service interface, the method further comprises:
the address allocation component parses the label data and the annotation data, determines a plurality of network policies matching the target container unit according to the parsing result, calls a network policy parsing model, and parses each of the network policies through the network policy parsing model to obtain the access object corresponding to the network policy and the access action relative to the access object; determines, according to the access object and access action corresponding to each network policy, whether an access conflict exists among the network policies; and, when an access conflict exists, triggers an automatic repair mechanism and obtains the final network policy corresponding to the target container unit based on the automatic repair mechanism.

3. The method according to claim 1, wherein the containerized cluster further comprises a cluster monitoring component; the method further comprises:
the cluster monitoring component obtains load usage data of each node in the containerized cluster at a preset frequency, calculates the current total load of the containerized cluster based on the load usage data of each node, and adjusts, according to the current total load and the total resources of the containerized cluster, the sub-resource quota used for each container unit in the containerized cluster; if the adjusted sub-resource quota is less than a preset quota threshold, then, when reconstruction of a container unit is monitored, determines a first resource occupied by the reconstructed container unit and a second resource occupied by an already created container unit; when the second resource includes the first resource, releases the occupation of the first resource by the created container unit; and, after the reconstruction of the container unit is completed, restores the occupation of the first resource by the created container unit.

4. The method according to claim 1, further comprising:
the cluster scheduler obtains historical container unit reconstruction data at a first preset time interval, predicts a container unit to be created based on the historical container unit reconstruction data, determines a predicted deployment node corresponding to the container unit to be created, and sends a base image preloading instruction to the predicted deployment node;
after receiving the base image preloading instruction, the predicted deployment node performs a preloading operation on the stored base images;
accordingly, the step in which the target node creates a target container unit after receiving the container unit reconstruction instruction comprises:
after receiving the container unit reconstruction instruction, the target node determines a target container unit type according to the container unit reconstruction instruction, pulls the target base image corresponding to the target container unit type from the plurality of preloaded base images, and creates the target container unit on the target node based on the target base image.

5. The method according to claim 1, further comprising:
the address allocation component counts, at a second preset time interval, the number of monitored container unit startups, the IP address allocation success rate, and the current resource occupancy rate; if the number of monitored container unit startups is greater than a first preset number and/or the IP address allocation success rate is less than a preset success rate threshold, compares the current resource occupancy rate with a preset occupancy rate threshold; and, if the current resource occupancy rate is greater than the preset occupancy rate threshold, triggers an IP address pre-allocation mechanism, so as to pre-generate IP addresses through the IP address pre-allocation mechanism and store them in an IP address resource pool;
accordingly, after the IP addresses are pre-generated through the IP address pre-allocation mechanism and stored in the IP address resource pool, the method further comprises:
after monitoring the startup of a new target container unit, and when no target IP address for the new target container unit exists in the cache space, the address allocation component obtains a pre-generated IP address from the IP address resource pool as the target IP address of the new target container unit.

6. The method according to claim 1, wherein, after receiving the target data corresponding to each unique identifier returned by the cluster service interface, the method further comprises:
if a target IP address is not successfully obtained from the IP address management module for any unique identifier, triggering a retry mechanism and, based on the target data corresponding to that unique identifier, obtaining the target IP address from the IP address management module again through the retry mechanism, until the target IP address corresponding to that unique identifier is successfully obtained or the number of retries reaches a second preset number;
accordingly, the method further comprises:
if the number of retries reaches the second preset number and the target IP address corresponding to that unique identifier has not been successfully obtained, adding that unique identifier to a processing failure queue.

7. The method according to any one of claims 1 to 6, wherein determining a target node from the plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information comprises:
obtaining the CPU usage rate, available memory amount, node health state, and node schedulable state corresponding to each node in the containerized cluster, and determining, according to the container unit reconstruction requirement information, the storage space requirement and CPU usage requirement corresponding to the target container unit to be created;
screening out, based on the node health state and node schedulable state corresponding to each node, a plurality of first nodes that are in a healthy state and a schedulable state, and determining the target node from the plurality of first nodes according to the storage space requirement and CPU usage requirement corresponding to the target container unit and the CPU usage rate and available memory amount corresponding to each first node.

8. A container unit reconstruction system for a containerized cluster, comprising:
a cluster scheduler, configured to, after receiving a container unit reconstruction request, obtain container unit reconstruction requirement information, determine a target node from a plurality of nodes included in the containerized cluster according to the container unit reconstruction requirement information, and send a container unit reconstruction instruction to the target node;
the target node, configured to create and start a target container unit after receiving the container unit reconstruction instruction;
an address allocation component, configured to: after monitoring the startup of the target container unit, determine whether a target IP address matching the namespace in which the target container unit is located exists in a cache space; when the target IP address exists in the cache space, associate the target IP address with the target container unit and store the association information in a target object that manages identity information; when the target IP address does not exist in the cache space, store the unique identifier corresponding to the target container unit in a pending queue and, when a preset sending condition is met, read a preset number of unique identifiers from the pending queue, generate a data acquisition request based on the preset number of unique identifiers, and send the data acquisition request to a cluster service interface, so as to obtain, based on the target data corresponding to each unique identifier returned by the cluster service interface, the target IP address corresponding to each unique identifier from an IP address management module, associate each target IP address with the matching target container unit, and store the association information in the cache space and in the target object that manages identity information.

9. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.

10. A computer device comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor implements the method according to any one of claims 1 to 7 when executing the computer program.
CN202510913115.8A 2025-07-02 2025-07-02 Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment Pending CN120812029A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510913115.8A CN120812029A (en) 2025-07-02 2025-07-02 Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510913115.8A CN120812029A (en) 2025-07-02 2025-07-02 Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN120812029A true CN120812029A (en) 2025-10-17

Family

ID=97323159

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510913115.8A Pending CN120812029A (en) 2025-07-02 2025-07-02 Container unit reconstruction method and system for containerized cluster, storage medium and computer equipment

Country Status (1)

Country Link
CN (1) CN120812029A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination