CN120386639B - External device memory management method and system - Google Patents
- Publication number
- CN120386639B (application CN202510884520.1A)
- Authority
- CN
- China
- Prior art keywords
- memory
- priority
- equipment
- external
- pool
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses an external device memory management method and system, relating to the technical field of computers. The method comprises: obtaining device information of external devices; constructing a device priority database based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupancy, and a priority; dividing the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level; and managing the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory.
Description
Technical Field
The application relates to the technical field of computers, and in particular to an external device memory management method and system.
Background
Related memory management schemes rely on manual intervention, such as adjusting basic input/output system (BIOS) settings or disabling non-critical devices, and usually allocate memory statically. This approach is adequate when there are few external devices and the task load is stable, but in high-density external device scenarios it cannot adjust dynamically when memory requirements fluctuate, so external device memory management is inefficient.
Disclosure of Invention
The application provides an external device memory management method and system, which at least solve the problem in the related art that, in high-density external device scenarios, memory cannot be adjusted dynamically when memory requirements fluctuate, resulting in low memory management efficiency.
The application provides an external device memory management method, comprising:
obtaining device information of external devices;
constructing a device priority database based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupancy, and a priority;
dividing the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level; and
managing the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory.
The application provides a memory management system, comprising an external device module, a firmware module, a memory management module, and a real-time monitoring module;
the external device module is used for obtaining the external devices on the bus;
the firmware module is used for obtaining device information corresponding to the external devices and constructing a device priority database based on the device information;
the memory management module is used for dividing the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, and managing the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory;
the real-time monitoring module is used for obtaining running state information of the system and updating the device priority database based on the priorities of the devices in the device priority database and the running state information of the system.
The application also provides a memory management device, comprising:
an acquisition unit, used for obtaining device information of external devices;
a construction unit, used for constructing a device priority database based on the device information, wherein the device priority database at least comprises a device type field, a device hot plug state field, an external device resource occupancy field, and a priority field;
a dividing unit, used for dividing the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level; and
a memory management unit, used for managing the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory.
The application also provides an electronic device, comprising a memory and a processor, wherein the memory is used for storing a computer program, and the processor is used for implementing the steps of any of the above external device memory management methods when executing the computer program.
The application also provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of any of the above external device memory management methods.
The application also provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of any of the above external device memory management methods.
In the application, device information of external devices is obtained; a device priority database is constructed based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupancy, and a priority; the system runtime memory is divided into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level; and the sub memory pools are managed by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory. This solves the technical problem that, in high-density device scenarios, memory management is inefficient because it cannot adjust dynamically when memory requirements fluctuate, and achieves the technical effects of dynamically adjusting memory resources and improving memory management efficiency.
Drawings
To describe the embodiments of the application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the application, and those skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a flow chart of an external device memory management method according to an embodiment of the application;
Fig. 2 is a flow chart of a memory allocation method according to an embodiment of the application;
Fig. 3 is a flow chart of an external device memory management method according to an embodiment of the application;
Fig. 4 is a schematic structural diagram of a memory management device according to an embodiment of the application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without making any inventive effort are within the scope of the present application.
It should be noted that in the description of the present application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and the like in this specification are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
In order to facilitate a better understanding of the technical solutions described in the embodiments of the present disclosure by those skilled in the art, technical terms in the embodiments of the present disclosure are explained below before introducing the embodiments of the present disclosure.
Peripheral Component Interconnect Express (PCIE), a high-speed serial computer expansion bus standard, is a high-speed serial bus protocol for connecting high-performance peripherals to a computer motherboard. It is an upgraded version of the traditional parallel PCI bus, adopts a point-to-point communication architecture, supports multi-lane aggregation, and features high bandwidth, low latency, and strong scalability.
The basic input/output system (BIOS) is firmware in a computer that performs hardware initialization and provides runtime services for the operating system. It plays a vital role during computer startup: it detects and initializes hardware components (e.g., hard disk, keyboard, mouse) and loads the operating system.
The Unified Extensible Firmware Interface (UEFI) is a modern firmware interface standard that replaces the traditional BIOS.
Hardware initialization information (Hardware Initialization Information, HII) is an information structure that describes how a hardware module (e.g., a driver or firmware image) initializes its associated hardware in the UEFI firmware environment. It is typically provided by firmware and invoked during platform startup to complete device initialization.
The runtime boot service memory (Runtime Boot Services Memory, RTB) is a memory area that is retained, rather than released, after the operating system starts. It supports firmware functions that still need to be accessed while the operating system runs, and is referred to herein simply as runtime memory.
The central processing unit (CPU) is the core component of a computer, often described as its brain. It executes the instructions of an instruction set to process data, performing arithmetic and logic operations, controlling data flow, executing operating system instructions, and the like.
The baseboard management controller (Baseboard Management Controller, BMC) is a dedicated microcontroller that monitors the physical state of the server, including temperature, voltage, fan speed, etc., and supports remote management and diagnostic functions.
The HII tables of PCIE external devices occupy RTB memory. As the core data structure for device enumeration and configuration, the HII table has a memory footprint that grows superlinearly with the number of devices. In a deployment on the order of ten thousand devices, HII table memory requirements may expand more than a thousandfold, far exceeding the capacity assumed by conventional fixed allocation policies. As a result, the non-protected area (Non-Region Protected, Non-RP) is rapidly exhausted and the system is forced to compress critical control channels, raising the device initialization failure rate by 30%-50%, while 5%-15% of available memory is wasted through fragmentation.
Related memory management schemes rely on manual intervention, such as adjusting BIOS settings or disabling non-critical devices, and lack dynamic adaptability. The conventional Peripheral Component Interconnect (PCI) bus resource management mechanism uses static address allocation and cannot cope with fluctuating memory requirements in high-density external device scenarios. For example, when a new device is attached, the fixed allocation policy may trigger a resource conflict because the reserved space is insufficient, and manual capacity expansion requires shutting the system down, which cannot meet the round-the-clock operation requirements of a data center.
Three related-art schemes for external device memory management are briefly introduced below:
Scheme A provides a method for implementing memory access that addresses insufficient memory space in a computer: the operating system of the computer calls a UEFI BIOS interface so that the computer can use memory space in an external memory pool. The scheme focuses on technical improvements to memory access and space expansion.
Scheme B provides a method for initializing the configuration space of PCIE devices. For PCIE device initialization, it constructs a root bridge structure array in the PEI phase to allocate bus resources dynamically, and encapsulates device information and bus resources in the root bridge structure array to achieve dynamic control in the PEI phase.
Scheme C, also directed at PCIE device initialization, proposes cooperative enumeration and device matching between the BIOS and the BMC during server startup, to solve the problem of PCIE device failures blocking startup, thereby improving the stability and diversity of the system.
The above schemes also have the following disadvantages:
Inability to respond to load changes: the memory policy cannot be adjusted dynamically according to the load while devices are running. For example, when the memory requirement of the HII table surges, memory cannot be expanded automatically, so the non-protected area (Non-RP) is rapidly exhausted.
Insufficient monitoring and flexibility: there is no real-time monitoring mechanism, so the memory occupancy trend of the HII table cannot be tracked and no early warning can be triggered, leaving the system dependent on manual intervention or on recovery after a crash. Flexibility is also poor: the memory size is fixed at compile time, so an underestimate can exhaust resources while over-allocation wastes them, a contradiction that is particularly prominent in high-density PCIE device scenarios.
To address these disadvantages, the embodiments of the application obtain device information of external devices; construct a device priority database based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupancy, and a priority; divide the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level; and manage the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory. This solves the technical problem that, in high-density device scenarios, memory management is inefficient because it cannot adjust dynamically when memory requirements fluctuate, and achieves the technical effects of dynamically adjusting memory resources and improving memory management efficiency.
The external device memory management method provided by the embodiments of the application can be applied to PCIE external devices in a server system. A server system is usually connected to a variety of PCIE external devices whose memory resource requirements differ widely. The server system in the application may be a storage server, a compute server, an artificial intelligence (Artificial Intelligence, AI) server, or the like. The method can also be applied to memory resource scheduling in UEFI firmware: during UEFI startup, the complete operating system drivers have not yet been loaded, and memory resources need to be reserved for the various external devices.
The present application will be further described in detail below with reference to the drawings and detailed description for the purpose of enabling those skilled in the art to better understand the aspects of the present application.
Fig. 1 is a flowchart of a method for managing a memory of an external device according to an embodiment of the disclosure.
As shown in fig. 1, the method comprises the steps of:
Step 101, obtaining device information of external devices;
In some embodiments, an external device refers to a hardware module connected to the motherboard through a bus interface such as PCIE, universal serial bus (Universal Serial Bus, USB), or the non-volatile memory host controller interface (Non-Volatile Memory Express, NVME); examples include a graphics processing unit (Graphics Processing Unit, GPU), an NVME hard disk, a host channel adapter (Host Channel Adapter, HCA) card, and a smart network card.
In some embodiments, the device information of an external device is basic attribute data used to identify and manage the device, including but not limited to the device type, vendor ID (Vendor ID), device ID (Device ID), configuration space fields, current hot plug state, and current resource occupancy, such as bandwidth and cache usage.
In some embodiments, in the UEFI firmware execution stage, the device information of the external devices may be obtained by enumerating all devices on the PCI/PCIE bus. Specifically, after an external PCIE device is found by scanning, the firmware reads the device configuration space through a device protocol layer module, for example the PCI/PCIE bus protocol, and obtains key information such as the Vendor ID and the Device ID.
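As an illustration of the last step, the following is a minimal Python sketch of decoding identity fields from the first bytes of a configuration space header. It relies on the standard PCI header layout (Vendor ID at offset 0x00, Device ID at 0x02, subclass at 0x0A, base class at 0x0B). The names `DeviceIdentity`, `parse_config_header`, and `SAMPLE_HEADER`, and the sample byte values, are assumptions made for this example; real firmware would read the bytes through its bus protocol layer rather than from a Python byte string.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceIdentity:
    vendor_id: int
    device_id: int
    base_class: int
    sub_class: int


def parse_config_header(config: bytes) -> Optional[DeviceIdentity]:
    """Decode identity fields from the start of a PCI configuration header."""
    if len(config) < 16:
        raise ValueError("need at least the first 16 bytes of configuration space")
    # Offset 0x00: Vendor ID, offset 0x02: Device ID (16-bit, little-endian).
    vendor_id, device_id = struct.unpack_from("<HH", config, 0x00)
    if vendor_id == 0xFFFF:
        return None  # an all-ones Vendor ID means no device responded
    # Offset 0x0A: subclass code, offset 0x0B: base class code.
    sub_class, base_class = struct.unpack_from("<BB", config, 0x0A)
    return DeviceIdentity(vendor_id, device_id, base_class, sub_class)


# Made-up header bytes for illustration only.
SAMPLE_HEADER = bytes([0x34, 0x12, 0x78, 0x56, 0, 0, 0, 0,
                       0x01, 0x02, 0x08, 0x01, 0, 0, 0, 0])
identity = parse_config_header(SAMPLE_HEADER)
```

The class code fields are included because the device type recorded in the priority database could plausibly be derived from them, although the application does not say how the type is determined.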
Step 102, constructing a device priority database based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupancy, and a priority;
In some embodiments, the device priority database is a data structure used for recording and managing the priorities of all external devices, and supports dynamic query and update. It may be implemented as a structure, an array, or a hash table, with one record per device. PCIE devices that carry critical data transmission tasks are given higher priority so that their memory supply is guaranteed first, while the priority of non-critical devices is lowered appropriately.
In some embodiments, the device types may include a graphics card (PCIE x16), an NVME solid state disk (PCIE x4), a USB controller, and the like. A base priority may be preset for each device type, so that the initial priority is determined by the device type; for example, the priority of an NVME solid state disk is higher than that of a USB device. The device hot plug state is the device connection state, updated in real time and used to raise device priority dynamically. The external device resource occupancy is obtained by monitoring memory/CPU occupancy; when resources are tight, the bandwidth of low-priority devices is automatically reduced. The priority is calculated from the base priority and dynamic adjustment factors and determines the order of memory resource allocation; at runtime, it is adjusted dynamically according to the external device resource occupancy and the hot plug state.
In some embodiments, the device type determines the hardware basis of task execution, since different types of devices differ in processing capacity and requirements; the hot plug state reflects the attachment and removal of devices in real time so that resource allocation can be adjusted promptly; and the resource occupancy directly shows how current resources are being used.
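The record layout and priority calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the application does not give a formula, so the base priority values, the `hotplug_bonus` and `occupancy_weight` factors, and the names `DeviceRecord`, `compute_priority`, and `PriorityDatabase` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical base priorities per device type (larger means more important).
BASE_PRIORITY = {"gpu": 90, "nvme": 80, "nic": 60, "usb": 20}


@dataclass
class DeviceRecord:
    device_type: str
    hot_plugged: bool = False   # True if the device was just attached
    occupancy: float = 0.0      # resource occupancy in the range [0, 1]
    priority: int = 0           # filled in by the database


def compute_priority(rec: DeviceRecord, hotplug_bonus: int = 5,
                     occupancy_weight: int = 10) -> int:
    """Base priority from the device type plus assumed dynamic adjustment factors."""
    base = BASE_PRIORITY.get(rec.device_type, 10)
    dynamic = (hotplug_bonus if rec.hot_plugged else 0)
    dynamic += int(occupancy_weight * rec.occupancy)
    return base + dynamic


class PriorityDatabase:
    """Hash-table implementation: one record per device identifier."""

    def __init__(self) -> None:
        self._records: Dict[str, DeviceRecord] = {}

    def upsert(self, device_id: str, rec: DeviceRecord) -> None:
        rec.priority = compute_priority(rec)
        self._records[device_id] = rec

    def priority_of(self, device_id: str) -> int:
        return self._records[device_id].priority

    def ranked(self) -> List[str]:
        """Device identifiers ordered from highest to lowest priority."""
        return sorted(self._records, key=self.priority_of, reverse=True)


db = PriorityDatabase()
db.upsert("nvme0", DeviceRecord("nvme"))
db.upsert("usb0", DeviceRecord("usb", hot_plugged=True, occupancy=0.5))
```

Calling `upsert` again with fresh hot plug and occupancy values recomputes the priority, which corresponds to the dynamic update the embodiments describe.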
Step 103, dividing the system runtime memory into a plurality of sub memory pools based on the priorities of the devices in the device priority database, wherein each sub memory pool is used for processing memory requests of devices of a preset importance level;
In some embodiments, the system runtime memory refers to the memory area reserved in the runtime boot service memory stage for access while the operating system runs, and is commonly used for variable services, device context maintenance, and the like. The size of the system runtime memory may be obtained from system management mode.
In some embodiments, a sub memory pool is one of several smaller pools, each with a different purpose, into which the RTB memory is divided; the sub memory pools serve devices of different priorities.
In some embodiments, dividing the system runtime memory into multiple sub memory pools enables finer-grained resource control and better memory utilization.
In some embodiments, processing memory requests of devices of a preset importance level means setting different memory quality-of-service levels according to device priority, for example an emergency pool, a dynamic pool, and a cache pool.
In some embodiments, memory may be requested and managed through UEFI interfaces, and a memory timeout release mechanism may also be used to improve memory utilization.
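A timeout release mechanism of this kind might look like the following Python sketch. It does not call any UEFI service; the `LeasedPool` class, its method names, and the rule that a block expires when it has not been touched for `timeout` time units are assumptions made for illustration. Time is passed in explicitly so the behaviour is deterministic.

```python
from typing import Dict, Optional, Tuple


class LeasedPool:
    """Fixed-capacity pool whose allocations expire if not touched in time."""

    def __init__(self, capacity: int, timeout: float) -> None:
        self.capacity = capacity
        self.timeout = timeout
        self._leases: Dict[int, Tuple[int, float]] = {}  # handle -> (size, last use)
        self._next_handle = 0

    def used(self) -> int:
        return sum(size for size, _ in self._leases.values())

    def release_expired(self, now: float) -> int:
        """Free every block idle for at least `timeout`; return how many were freed."""
        expired = [h for h, (_, last) in self._leases.items()
                   if now - last >= self.timeout]
        for handle in expired:
            del self._leases[handle]
        return len(expired)

    def allocate(self, size: int, now: float) -> Optional[int]:
        """Return a handle, or None if the pool is full even after expiry."""
        self.release_expired(now)
        if self.used() + size > self.capacity:
            return None
        handle = self._next_handle
        self._next_handle += 1
        self._leases[handle] = (size, now)
        return handle

    def touch(self, handle: int, now: float) -> None:
        """Record a use of the block so that its timeout starts over."""
        size, _ = self._leases[handle]
        self._leases[handle] = (size, now)


pool = LeasedPool(capacity=100, timeout=10.0)
first = pool.allocate(80, now=0.0)
too_early = pool.allocate(40, now=5.0)    # first block still holds 80 units
after_expiry = pool.allocate(40, now=10.0)  # first block has timed out
```

Expiring blocks lazily inside `allocate` keeps the sketch short; a firmware implementation could equally run the sweep from a timer event.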
Step 104, managing the sub memory pools by using preset management rules corresponding to the sub memory pools, so as to manage the external device memory.
In some embodiments, the preset management rules refer to the memory allocation, access, and reclamation policies established for each sub memory pool. The allocation policy may set a maximum allocation limit; access control may include allowing only high-priority devices to access the emergency pool; and the reclamation policy may use a least recently used (Least Recently Used, LRU) algorithm, a first in, first out (First In, First Out, FIFO) algorithm, or the like.
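As one possible reading of the LRU reclamation policy, the sketch below evicts the least recently used blocks of a pool until a new block fits. The `LruCachePool` class and its block identifiers are hypothetical; the application names the algorithm but does not specify its data structures.

```python
from collections import OrderedDict
from typing import List


class LruCachePool:
    """Pool that reclaims least recently used blocks when space runs out."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self._blocks: "OrderedDict[str, int]" = OrderedDict()  # oldest first

    def access(self, block_id: str) -> None:
        """Mark a block as most recently used."""
        self._blocks.move_to_end(block_id)

    def allocate(self, block_id: str, size: int) -> List[str]:
        """Allocate a block, returning the identifiers of any evicted blocks."""
        if size > self.capacity:
            raise ValueError("request exceeds the pool capacity")
        if block_id in self._blocks:
            raise KeyError(f"block {block_id!r} already allocated")
        evicted = []
        while self.used + size > self.capacity:
            old_id, old_size = self._blocks.popitem(last=False)  # oldest entry
            self.used -= old_size
            evicted.append(old_id)
        self._blocks[block_id] = size
        self.used += size
        return evicted


cache = LruCachePool(capacity=100)
cache.allocate("a", 40)
cache.allocate("b", 40)
cache.access("a")                   # "b" is now the least recently used block
evicted = cache.allocate("c", 40)   # does not fit until "b" is reclaimed
```

Switching to FIFO would only require dropping the `access` call, since insertion order is then the eviction order.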
In some embodiments, when a memory request arrives, a sub memory pool is matched according to the device priority. If the matched pool has no memory available, allocation is attempted from a lower-priority pool. If the request still cannot be satisfied, an exception handling flow is triggered: a log is recorded and the system falls back to a safe mode.
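The matching and fallback flow can be sketched as below. The pool names follow the emergency, dynamic, and cache pools mentioned in the embodiments; the `allocate` function, the dictionary of free sizes, and the use of a `None` return value to stand for the fall back to safe mode are assumptions of this example.

```python
from typing import Dict, List, Optional

# Pools ordered from highest to lowest priority.
POOL_ORDER = ["emergency", "dynamic", "cache"]


def allocate(free: Dict[str, int], matched_pool: str, size: int,
             log: List[str]) -> Optional[str]:
    """Try the matched pool first, then each lower-priority pool in turn.

    Returns the name of the pool that served the request, or None after
    logging the failure (standing in for the exception handling flow).
    """
    start = POOL_ORDER.index(matched_pool)
    for name in POOL_ORDER[start:]:
        if free[name] >= size:
            free[name] -= size
            return name
    log.append(f"allocation of {size} bytes failed from pool "
               f"{matched_pool!r}; entering safe mode")
    return None


free_memory = {"emergency": 64, "dynamic": 32, "cache": 128}
event_log: List[str] = []
served_by = allocate(free_memory, "dynamic", 48, event_log)
```

Note that the search never moves upward, so a request matched to the dynamic pool cannot consume the emergency pool.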
In some embodiments, after the device priority database is constructed based on the device information, the external device memory management method includes:
Acquiring running state information of a system;
In some embodiments, the running state information of the system refers to metrics of the current running state of the computer system, including but not limited to hardware health states such as CPU usage, memory occupancy, peripheral activity, temperature, power consumption, and fan speed, as well as the load of the operating system or firmware. Peripheral activity may be measured by the input/output request frequency of PCIE devices.
In some embodiments, hardware status may be read using an interface provided by the HII, and CPU/memory usage may be monitored by a performance counter.
Updating the device priority database based on the priority of the device in the device priority database and the running state information of the system.
In some embodiments, the device priority database may be updated periodically based on the priority of the devices in the device priority database and the running state information of the system, or the device priority database may be updated when a specific event occurs, where the specific event may be a new device insertion event or a system load abrupt event.
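The two update triggers, periodic and event-driven, could be combined as in the following sketch. The `PriorityUpdater` class, the event names in `TRIGGER_EVENTS`, and the `recompute` callback are hypothetical; the callback stands in for whatever logic re-evaluates priorities from the database and the running state information.

```python
from typing import Callable

# Hypothetical names for the specific events that force an update.
TRIGGER_EVENTS = {"device_inserted", "load_spike"}


class PriorityUpdater:
    """Runs `recompute` on a fixed interval or when a trigger event occurs."""

    def __init__(self, recompute: Callable[[], None], interval: float) -> None:
        self.recompute = recompute
        self.interval = interval
        self.last_update = 0.0
        self.updates = 0

    def _run(self, now: float) -> None:
        self.recompute()
        self.last_update = now
        self.updates += 1

    def tick(self, now: float) -> bool:
        """Periodic path: update only if the interval has elapsed."""
        if now - self.last_update >= self.interval:
            self._run(now)
            return True
        return False

    def notify(self, event: str, now: float) -> bool:
        """Event path: update immediately for trigger events, ignore others."""
        if event in TRIGGER_EVENTS:
            self._run(now)
            return True
        return False


calls = []
updater = PriorityUpdater(lambda: calls.append("recomputed"), interval=5.0)
updater.tick(2.0)                       # too early, nothing happens
updater.tick(5.0)                       # periodic update
updater.notify("device_inserted", 6.0)  # event-driven update
```

An event-driven update also resets the periodic timer in this sketch, which avoids recomputing twice in quick succession.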
In some embodiments, the database can update priorities dynamically: when the running state of the system changes, for example when a device is attached or task resource requirements change abruptly, priorities can be quickly re-evaluated and updated, ensuring that resource allocation remains reasonable and efficient.
In some embodiments, system running state information is collected and combined with the device priority information in the device priority database to adjust device priorities dynamically, thereby realizing intelligent scheduling and efficient utilization of system resources.
In some embodiments, the sub memory pools include an emergency pool for processing memory requests of critical devices, a dynamic pool for processing memory requests of high-priority devices, and a cache pool for processing memory requests of low-priority devices.
In some embodiments, the caching policy is shown in Table 1. The emergency pool is a memory area that ensures critical devices can still obtain a minimum amount of memory under extreme conditions (such as memory shortage or system abnormality); it is usually set to read-only or protected access to prevent encroachment by low-priority requests. The dynamic pool is a memory area whose size can be adjusted dynamically according to system load and device requirements; it mainly serves high-priority but non-critical devices and can be expanded when system resources are sufficient. The cache pool satisfies temporary memory requests from low-priority devices; it can support cache reclamation mechanisms (such as LRU and FIFO) and is released first when system resources are tight.
Table 1 Three-level caching mechanism
In some embodiments, the emergency pool may serve key devices such as GPUs and NVME devices, where a key device is a core peripheral on which normal system operation depends: if a key device cannot obtain enough memory resources, the system may crash. The dynamic pool may serve devices such as network devices, and the cache pool may serve lower-priority devices such as USB devices.
In some embodiments, a linked list or reference counting mechanism or the like may be used to track memory usage status.
In some embodiments, the system runtime memory is divided into an emergency pool, a dynamic pool and a cache pool, which are respectively used for processing memory requests of key devices, high-priority devices and low-priority devices, so that fine management and efficient utilization of memory resources are realized.
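The division into three pools and the mapping from device priority to pool can be illustrated as follows. The percentage shares, the priority thresholds, and the function names `partition` and `pool_for` are assumptions; the application does not state how large each pool is or where the priority boundaries lie.

```python
from typing import Dict

# Hypothetical share of the runtime memory given to each pool, in percent.
POOL_SHARES = {"emergency": 20, "dynamic": 50, "cache": 30}


def partition(total: int, shares: Dict[str, int] = POOL_SHARES) -> Dict[str, int]:
    """Split `total` bytes by percentage; any rounding remainder goes to 'dynamic'."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    sizes = {name: total * pct // 100 for name, pct in shares.items()}
    sizes["dynamic"] += total - sum(sizes.values())
    return sizes


def pool_for(priority: int, critical_at: int = 80, high_at: int = 50) -> str:
    """Map a device priority to the pool that serves it (assumed thresholds)."""
    if priority >= critical_at:
        return "emergency"
    if priority >= high_at:
        return "dynamic"
    return "cache"


pools = partition(1001)
```

Giving the rounding remainder to the dynamic pool is an arbitrary choice made here because that pool is described as resizable.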
In some embodiments, as shown in Fig. 2, which is a flow chart of a memory allocation method provided by an embodiment of the application, after the system runtime memory is divided into a plurality of sub memory pools based on the priorities of the devices in the device priority database, the external device memory management method includes:
Step 201, obtaining the pending memory requests of each of the plurality of sub memory pools;
In some embodiments, pending memory requests refer to memory allocation requests that have not yet been satisfied, and typically include the desired memory size, the identifier of the requesting device, and the like.
Step 202, sorting the pending memory requests according to the priorities of the devices in the device priority database to obtain a target request queue;
In some embodiments, the target request queue is a memory request queue ordered by device priority, with requests from high-priority devices at the front and requests from low-priority devices at the back, for use in subsequent memory scheduling decisions.
Step 203, in response to receiving a new memory request, updating the target request queue;
In some embodiments, each sub memory pool maintains a request queue, which may be a linked list structure, for recording all currently unprocessed requests. New memory requests may be obtained by periodically polling each memory pool or in real time through an event notification mechanism; in a UEFI environment, new memory requests may be captured by registering event listeners.
Step 204, allocating memory for the memory requests in the target request queue in order of priority from high to low.
In some embodiments, all pending requests are extracted from each sub memory pool, and for the device corresponding to each request, the current priority of the device is looked up in the device priority database. The requests are sorted from high priority to low priority using a sorting algorithm such as quick sort, and the target request queue is generated once sorting is complete. If a new memory request is received, it is sorted together with the memory requests already in the target request queue to obtain the latest target request queue; that is, the new memory request is inserted at the proper position in the existing queue. Finally, memory is allocated for the memory requests in order of priority from high to low.
In some embodiments, a memory allocation algorithm built on the priority database uses a priority queue to manage memory requests in order. When a memory request arrives, the system allocates memory for high-priority tasks first, following the order of the priority queue. High-priority tasks often concern critical functions or real-time response requirements of the system, so their memory supply is guaranteed first to avoid system stalls or delayed responses. When system resources allow, low-priority tasks obtain memory in queue order, so that resources are used reasonably and tasks execute in an orderly manner.
In some embodiments, the pending requests in the plurality of sub memory pools are obtained and sorted according to the device priority database to generate the target request queue, the queue is updated dynamically when a new request arrives, and resources are finally allocated to the memory requests in priority order, thereby realizing efficient memory management and task scheduling.
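Steps 201 to 204 can be sketched with a binary heap, which keeps the queue ordered as new requests arrive. This substitutes Python's `heapq` for the quick sort plus insertion described above; the resulting order is the same, but the data structure is a choice made for this example. The `RequestQueue` class, its `allocate_all` method, and the rule that requests too large for the remaining memory stay queued are assumptions.

```python
import heapq
import itertools
from typing import Callable, List, Tuple


class RequestQueue:
    """Pending memory requests ordered by the requesting device's priority."""

    def __init__(self, priority_of: Callable[[str], int]) -> None:
        self._priority_of = priority_of
        self._heap: List[Tuple[int, int, str, int]] = []
        self._seq = itertools.count()  # arrival order breaks priority ties

    def __len__(self) -> int:
        return len(self._heap)

    def push(self, device_id: str, size: int) -> None:
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        entry = (-self._priority_of(device_id), next(self._seq), device_id, size)
        heapq.heappush(self._heap, entry)

    def allocate_all(self, free: int) -> Tuple[List[str], int]:
        """Serve requests from high to low priority; keep those that do not fit."""
        granted, deferred = [], []
        while self._heap:
            entry = heapq.heappop(self._heap)
            _, _, device_id, size = entry
            if size <= free:
                free -= size
                granted.append(device_id)
            else:
                deferred.append(entry)
        for entry in deferred:
            heapq.heappush(self._heap, entry)
        return granted, free


priorities = {"gpu0": 90, "nvme0": 80, "usb0": 20}
queue = RequestQueue(priorities.__getitem__)
queue.push("usb0", 10)
queue.push("gpu0", 50)
queue.push("nvme0", 30)   # a new request is placed by priority, not arrival order
granted, remaining = queue.allocate_all(free=85)
```

The priority is read when a request is pushed; if the database changes a device's priority afterwards, the queue would need to be rebuilt to reflect it.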
In some embodiments, after allocating memory for the memory requests in the target task queue according to the order of priority from high to low, the external device memory management method includes:
acquiring the resource occupancy rate of the external equipment, wherein the resource occupancy rate of the external equipment comprises the utilization rate of a central processing unit of the equipment and the occupancy rate of a memory of the equipment;
In some embodiments, the external device resource occupancy rate refers to the proportion of computing and storage resources consumed by the external device while the system is running, and is used to measure the activity level of the device. The utilization rate of the central processing unit of the device represents the share of CPU execution time taken by the device driver or related tasks and reflects how strongly the device demands CPU resources. The memory occupancy rate of the device indicates the proportion of the allocable memory that is currently used by the device.
And updating a preset management rule based on the resource occupancy rate of the external equipment and the target request queue.
In some embodiments, updating the preset management rule refers to dynamically adjusting the memory allocation policy according to the device resource usage condition and the state of the request queue, so as to achieve more refined resource scheduling and exception prevention.
In some embodiments, by continuously monitoring key indexes such as CPU utilization, memory occupancy, task queue length, etc., the system can accurately sense load changes, and once the load condition changes, the dynamic adjustment mechanism is started immediately, and the memory allocation policy is flexibly adjusted according to preset rules.
In some embodiments, the preset memory management rule is dynamically adjusted by collecting the CPU utilization rate and the memory occupancy rate of the device and combining the state of the current task queue to be processed, so that intelligent scheduling and optimal utilization of system resources are realized.
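As a rough illustration of the monitoring described above, the two occupancy figures and the queue length can be gathered into one sample and compared with the previous sample. The class name, the function names, and the tolerance values below are assumptions made for this sketch; the text does not specify thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadSample:
    cpu_utilization: float   # share of CPU time used by the device driver/tasks
    memory_occupancy: float  # memory used by the device / allocable memory
    queue_length: int        # pending requests in the task queue


def make_sample(busy_seconds, window_seconds, used_bytes, allocable_bytes,
                queue_length):
    """Turn raw counters into the ratios described in the text."""
    return LoadSample(
        cpu_utilization=busy_seconds / window_seconds,
        memory_occupancy=used_bytes / allocable_bytes,
        queue_length=queue_length,
    )


def load_changed(previous, current, tolerance=0.10, queue_tolerance=4):
    """Report a load change when any indicator moves beyond its tolerance.

    The tolerance values are illustrative defaults, not taken from the source.
    """
    return (
        abs(current.cpu_utilization - previous.cpu_utilization) > tolerance
        or abs(current.memory_occupancy - previous.memory_occupancy) > tolerance
        or abs(current.queue_length - previous.queue_length) > queue_tolerance
    )
```

A caller would invoke `load_changed` on each new sample and start the rule adjustment only when it returns `True`.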
In some embodiments, updating the preset management rule based on the external device resource occupancy and the target request queue includes:
Acquiring a first load of external equipment, wherein the first load is an original load of the external equipment;
determining a second load of the equipment based on the resource occupancy rate of the external equipment and the target request queue, wherein the second load is the current load of the external equipment;
Based on the second load and the first load, the preset management rule is updated.
In some embodiments, the original load of the external device refers to the load recorded in the device priority database, and the current load of the external device is monitored and determined in real time.
In some embodiments, if the current load differs from the original load, the memory allocation, access and reclamation policies are determined according to the change in load so as to update the preset management rule. Taking the memory allocation policy as an example, the runtime memory is divided into a plurality of sub-memory pools (for example, an emergency pool accounting for 60%, a cache pool accounting for 30%, and a dynamic pool accounting for 10%), and the space share of each sub-memory pool supports on-demand elastic expansion as the load changes.
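One way to picture the elastic adjustment is to start from the 60/30/10 split quoted above and move a small share between pools when the current (second) load departs from the original (first) load. The text does not give the adjustment rule, so everything beyond the base split is an assumption of this sketch: the way the second load is derived, the step of 5 percentage points, the floor of 5, and the choice to trade space between the emergency and cache pools.

```python
BASE_SHARES = {"emergency": 60, "cache": 30, "dynamic": 10}  # percent, from the text


def current_load(cpu_utilization, memory_occupancy, queue_length, queue_capacity):
    """Hypothetical second-load estimate: the most stressed of three indicators."""
    queue_pressure = min(queue_length / queue_capacity, 1.0)
    return max(cpu_utilization, memory_occupancy, queue_pressure)


def rebalance(shares, first_load, second_load, step=5, floor=5):
    """Shift `step` percentage points between the cache and emergency pools.

    Load rose: grow the emergency pool at the cache pool's expense.
    Load fell: give the space back. No pool drops below `floor` percent.
    """
    new_shares = dict(shares)
    if second_load > first_load:
        source, target = "cache", "emergency"
    elif second_load < first_load:
        source, target = "emergency", "cache"
    else:
        return new_shares
    moved = min(step, max(new_shares[source] - floor, 0))
    new_shares[source] -= moved
    new_shares[target] += moved
    return new_shares


def pool_sizes(total_bytes, shares):
    """Convert percentage shares into byte budgets (integer division)."""
    return {name: total_bytes * pct // 100 for name, pct in shares.items()}
```

Integer percentages are used so that the shares always sum to 100 without floating-point drift.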
In some embodiments, after allocating memory for the memory requests in the target task queue according to the order of priority from high to low, the external device memory management method includes:
responding to failure of memory allocation for a memory request in a target task queue, acquiring abnormal information of external equipment, wherein the abnormal information of the external equipment comprises at least one of fatal errors, compatibility errors and performance fluctuation;
In some embodiments, memory allocation failures typically occur when memory resources are insufficient, when there is resource contention, or when a hardware fault makes part of the memory area unavailable.
In some embodiments, a real-time monitoring system collects data such as device status, network traffic and program logs to obtain device status information, network traffic information and program log information. The device status information indicates whether the device is busy, disconnected, and so on; the network traffic information reflects the data transmission rate and delay; and the program log information records the error logs, debugging information, and the like of a device driver or an application program. The abnormal information of the external device is determined based on the device status information, the network traffic information and the program log information.
Determining a priority adjustment instruction corresponding to the abnormal information of the external equipment;
In some embodiments, the priority adjustment instruction corresponding to the abnormal information of the external device includes a rollback priority. If the abnormal information is a fatal error, for example a system crash, the rollback priority is 100, which indicates that an emergency rollback must be executed immediately, and the rollback is based on the current priority of the external device. If the abnormal information is a compatibility problem, for example a driver abnormality, a version rollback is scheduled and the rollback priority is 70. If the abnormal information is a performance fluctuation, for example a memory leak, a progressive rollback is triggered and the rollback priority is 50. The rollback priority values of 100, 70 and 50 used in the present application can be adjusted according to the actual situation.
In some embodiments, the priority adjustment instruction is an operation command for modifying the priority of a device in the device priority database and may include operations such as raising, lowering, or maintaining the priority. For example, when memory allocation for a device fails frequently and the log indicates insufficient memory, the priority of the device may be lowered.
Based on the priority adjustment instruction, a priority field in the device priority database is updated.
In some embodiments, updating the priority field in the device priority database based on the priority adjustment instruction comprises:
determining an update rule corresponding to the priority adjustment instruction based on the priority adjustment instruction;
The priority field in the device priority database is updated with the update rule.
In some embodiments, when the rollback priority in the priority adjustment instruction is 100, the corresponding update rule is to execute an emergency rollback immediately; when the rollback priority is 70, the corresponding update rule is to schedule a version rollback; and when the rollback priority is 50, the corresponding update rule is to trigger a progressive rollback. After the update rule is determined, the priority field in the device priority database is updated according to that rule.
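The mapping from abnormal information to rollback priority and then to an update rule can be written as two lookup tables. The values 100, 70 and 50 and the three rule descriptions come from the text; the enum, the function name, and the choice to act on the most severe anomaly when several are reported are assumptions of this sketch.

```python
from enum import Enum


class Anomaly(Enum):
    FATAL_ERROR = "fatal error"                       # e.g. system crash
    COMPATIBILITY_ERROR = "compatibility error"       # e.g. driver abnormality
    PERFORMANCE_FLUCTUATION = "performance fluctuation"  # e.g. memory leak


# Rollback priorities quoted in the text; adjustable in practice.
ROLLBACK_PRIORITY = {
    Anomaly.FATAL_ERROR: 100,
    Anomaly.COMPATIBILITY_ERROR: 70,
    Anomaly.PERFORMANCE_FLUCTUATION: 50,
}

UPDATE_RULE = {
    100: "execute emergency rollback immediately",
    70: "schedule version rollback",
    50: "trigger progressive rollback",
}


def adjustment_instruction(anomalies):
    """Build an instruction for the most severe anomaly observed.

    Picking the maximum is an assumption; the source only says the abnormal
    information comprises at least one of the three kinds.
    Raises ValueError if `anomalies` is empty.
    """
    worst = max(anomalies, key=ROLLBACK_PRIORITY.__getitem__)
    level = ROLLBACK_PRIORITY[worst]
    return {
        "anomaly": worst.name,
        "rollback_priority": level,
        "update_rule": UPDATE_RULE[level],
    }
```

Keeping the numeric levels in a separate table from the rules makes it easy to retune the values without touching the rule text.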
In some embodiments, after updating the priority field in the device priority database with the update rule, the external device memory management method further includes:
And carrying out rollback processing on the allocated memory in a preset stage based on the updated priority field, wherein the preset stage comprises a hardware layer stage, a system layer stage and an application layer stage.
In some embodiments, the updated priority field refers to a priority field in the device priority database updated by the update rule, and the allocated memory refers to the memory allocated for the memory request according to the order of priority from high to low.
In some embodiments, the rollback processing on the allocated memory in the preset stages specifically includes rollback processing at the hardware layer, at the system layer, and at the application layer. The hardware-layer rollback is implemented by an FPGA dual-image architecture: when a configuration error or a peripheral abnormality is detected, the system automatically triggers an IPROG instruction to reset the configuration memory and loads the Golden Image from the Flash base address, ensuring that the hardware function recovers quickly. The system-layer rollback adopts a staged policy in which a device reset (e.g., a PCIE device reset) is triggered by the UEFI firmware, after which the system is rolled back to a stable version using a system backup mechanism (e.g., the Windows.old folder of Windows). The application-layer rollback achieves fine-grained control by intercepting framework-level events: the system uses OnBackPressedDispatcher to manage back-key events and supports interaction logic such as double-press to exit, while a multiple-back-stack framework (such as the Navigation component) gives different functional modules independent rollback paths by saving and restoring the back stack state. The application-layer rollback needs to be adjusted dynamically in combination with the priority database.
In some embodiments, the three-layer rollback mechanism forms a closed loop through hardware-level quick response, system-level version management and application-level interactive control, so as to ensure that a complete rollback link from a physical layer to an application layer is realized in an abnormal scene.
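The ordering of the three rollback stages can be illustrated with a small driver that calls one handler per layer, from the hardware layer up to the application layer. The handlers here are placeholders supplied by the caller; nothing in this sketch performs a real FPGA, firmware, or application operation, and stopping at the first failed stage is an assumed policy rather than something the text states.

```python
STAGES = ("hardware", "system", "application")  # order taken from the text


def run_rollback(handlers, allocation):
    """Run the per-stage rollback handlers in order.

    `handlers` maps each stage name to a callable that receives the allocation
    record and returns True on success. Returns a list of (stage, succeeded).
    """
    results = []
    for stage in STAGES:
        succeeded = bool(handlers[stage](allocation))
        results.append((stage, succeeded))
        if not succeeded:
            break  # assumed policy: stop and report instead of continuing upward
    return results


# Placeholder handlers for illustration; real ones would wrap platform calls.
example_handlers = {
    "hardware": lambda allocation: True,
    "system": lambda allocation: True,
    "application": lambda allocation: True,
}
example_result = run_rollback(example_handlers, {"device": "gpu0", "size": 512})
```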
In some embodiments, as shown in fig. 3, fig. 3 is a flow chart of a method for managing the memory of an external device according to an embodiment of the present application. For the triangle area on the left side of fig. 3, the external PCIE device layer refers to PCIE devices that a server extends through the PCIE protocol, such as GPU cards, graphics cards, network cards and solid state disks, and the UEFI firmware layer refers to the critical software layer between the hardware and the operating system at server start-up, which performs hardware initialization and detects the PCIE external devices configured on the server.
For a given server product, device type priorities are formulated for the external PCIE devices of the server according to the server type (for example, a storage server, a computing server, or an AI server) and the type attribute of each device. For example, on a computing server the GPU priority is 0.5, the network card priority is 1, and the NVME hard disk priority is 2; on a storage server the NVME hard disk priority is 0.5, the RAID card priority is 1, and the network card priority is 2. After loading and scanning the PCIE devices installed in the server, the UEFI firmware layer organizes this information into a database and establishes the priority database of the PCIE devices.
For the rectangular matrix area at the upper right of fig. 3, the running memory comprises a memory Priority area (PR) and a memory Non-Priority area (Non-PR). The PR area is mainly used to optimize the performance of real-time computing tasks, and its memory features low-latency access and preemptive scheduling. For example, when a computing server performs a graphics rendering operation, the task can preferentially preempt memory in the PR area because the GPU priority is higher. The memory in the Non-PR area is characterized by higher-latency access and a fair scheduling principle.
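The per-server-type priority table in this example lends itself to a simple lookup when the priority database is built from the scanned devices. The priority values below are the ones quoted in the text; the device identifiers, the function names, and the reading that a smaller value means a higher priority are assumptions of this sketch.

```python
# Device-type priorities quoted in the text (smaller value read as higher priority).
TYPE_PRIORITY = {
    "computing": {"GPU": 0.5, "NIC": 1, "NVME": 2},
    "storage": {"NVME": 0.5, "RAID": 1, "NIC": 2},
}


def build_priority_db(server_type, scanned_devices):
    """Map each scanned device to its type and type-based priority.

    `scanned_devices` is an iterable of (device_id, device_type) pairs, standing
    in for the result of the firmware's PCIE scan. Unknown types raise KeyError.
    """
    table = TYPE_PRIORITY[server_type]
    return {
        device_id: {"type": device_type, "priority": table[device_type]}
        for device_id, device_type in scanned_devices
    }


def by_priority(priority_db):
    """List device ids from highest to lowest priority."""
    return sorted(priority_db, key=lambda device_id: priority_db[device_id]["priority"])


# Hypothetical scan result, for illustration only.
scanned = [("nvme0", "NVME"), ("gpu0", "GPU"), ("nic0", "NIC")]
compute_db = build_priority_db("computing", scanned)
```

The same scan yields a different ordering on a storage server, which is the point of keying the table by server type.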
For the middle square matrix area in fig. 3, the three-level memory pool mechanism means that, for the different tasks of different device types, the PR and Non-PR areas can each be divided into three levels of memory pools, namely an emergency pool, a dynamic pool and a cache pool. For example, graphics rendering tasks and artificial intelligence training tasks of the GPU are assigned to the emergency pool, network interaction tasks of the network card are assigned to the dynamic pool, and read-write tasks of the NVME hard disk are assigned to the cache pool. Elastic memory pool management means that the task types and task quantities in the three pools can be defined elastically and can differ across product types. The dynamic priority scheduling algorithm refers to combining the priority of each type of device in the PCIE device database.
The safe rollback mechanism means that, for tasks generated by plugging and unplugging various PCIE devices and for service-layer task access, when memory space allocation or resource preemption fails, the task must be rolled back and a corresponding alarm log triggered.
With the real-time monitoring system, the remaining memory space of the current PR and Non-PR areas can be visualized, and the allocation status of the three-level memory pools and the task queue status can be viewed. Monitored abnormal conditions, such as insufficient memory space or an overlong task queue, can be output as log information, and the log information can also be fed back to the dynamic priority scheduling algorithm to drive further upgrading and optimization of the algorithm.
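The routing of tasks to the three pools, together with the alarm raised on a failed allocation, might look like the following. The device-type-to-pool mapping follows the examples in the text; the class name, the capacities, and the alarm message format are hypothetical, and the sketch models one region only (the text applies the same three pools to both the PR and Non-PR areas).

```python
# Pool assignment taken from the examples in the text.
POOL_BY_DEVICE_TYPE = {"GPU": "emergency", "NIC": "dynamic", "NVME": "cache"}


class ThreeLevelPools:
    """Tracks free space per pool and records an alarm when a request fails."""

    def __init__(self, capacities):
        self.free = dict(capacities)  # pool name -> free bytes
        self.alarms = []              # stand-in for the alarm log

    def submit(self, device_type, size):
        """Try to allocate `size` bytes from the pool that serves `device_type`.

        Returns the pool name on success, or None after logging an alarm.
        """
        pool = POOL_BY_DEVICE_TYPE[device_type]
        if size > self.free[pool]:
            self.alarms.append(
                f"allocation failed: {device_type} requested {size} bytes "
                f"from {pool} pool ({self.free[pool]} free)"
            )
            return None
        self.free[pool] -= size
        return pool


# Hypothetical capacities, for illustration only.
pools = ThreeLevelPools({"emergency": 600, "dynamic": 100, "cache": 300})
first = pools.submit("GPU", 512)    # fits in the emergency pool
second = pools.submit("NIC", 128)   # exceeds the dynamic pool, logs an alarm
```

The alarm list is what a monitoring component could read and feed back to the scheduler.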
In some embodiments, by collecting the abnormal information of the external device when the memory allocation fails, generating a priority adjustment instruction, and updating the device priority database in a staged manner, a more stable and intelligent resource scheduling and abnormal processing mechanism is realized.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment.
The embodiment of the application also provides a memory management system, which comprises an external device module, a firmware module, a memory management module and a real-time monitoring module;
the external equipment module is used for acquiring external equipment on the bus;
The firmware module is used for acquiring equipment information corresponding to the external equipment and constructing an equipment priority database based on the equipment information;
The memory management module is used for dividing the system running memory into a plurality of sub memory pools based on the priority of the equipment in the equipment priority database, and managing the sub memory pools by utilizing a preset management rule corresponding to the sub memory pools so as to manage the external equipment memory;
The real-time monitoring module is used for acquiring the running state information of the system and updating the equipment priority database based on the priority of the equipment in the equipment priority database and the running state information of the system.
The embodiment of the present application further provides a memory management device 400, and fig. 4 is a schematic structural diagram of the memory management device provided in the embodiment of the present disclosure, as shown in fig. 4, including:
an obtaining unit 401, configured to obtain device information of an external device;
A building unit 402, configured to build a device priority database based on the device information, where the device priority database includes at least a device type field, a device hot plug status field, an external device resource occupancy field, and a priority field;
A dividing unit 403, configured to divide the system runtime memory into a plurality of sub-memory pools based on the priorities of the devices in the device priority database, where the sub-memory pools are used to process memory requests of devices with preset importance degrees;
the memory management unit 404 is configured to manage the sub-memory pool by using a preset management rule corresponding to the sub-memory pool, so as to manage the external device memory.
Further, in one possible implementation manner of the embodiment of the present disclosure, the obtaining unit 401 is configured to:
Acquiring running state information of a system;
updating the device priority database based on the priority of the device in the device priority database and the running state information of the system.
In some embodiments, the sub-memory pools include an emergency pool for processing memory requests of critical devices, a dynamic pool for processing memory requests of high-priority devices, and a cache pool for processing memory requests of low-priority devices.
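A record of the device priority database and the choice of pool for a device could be sketched as below. The four fields mirror those listed for the building unit; the field types, the sample hot-plug values, and the two thresholds that separate critical, high-priority and low-priority devices are assumptions (the thresholds were chosen so that the Fig. 3 computing-server example values fall into the pools named there).

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    device_type: str           # e.g. "GPU", "NIC", "NVME"
    hot_plug_state: str        # illustrative values: "inserted" / "removed"
    resource_occupancy: float  # external device resource occupancy, 0.0 to 1.0
    priority: float            # smaller value read as higher priority


def pool_for(record, critical_max=0.5, high_max=1.0):
    """Pick the sub-memory pool that serves a device.

    Thresholds are hypothetical: priority <= critical_max is treated as a
    critical device, <= high_max as high priority, anything else as low.
    """
    if record.priority <= critical_max:
        return "emergency"
    if record.priority <= high_max:
        return "dynamic"
    return "cache"
```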
Further, in a possible implementation manner of the embodiment of the present disclosure, the memory management device 400 further includes a memory allocation unit, where the memory allocation unit is configured to:
Acquiring a memory request to be processed of each sub memory pool in a plurality of sub memory pools;
According to the priority of the equipment in the equipment priority database, sequencing the memory requests to be processed to obtain a target request queue;
Updating the target request queue in response to receiving the new memory request;
And allocating memory for the memory requests in the target task queue according to the order of the priority from high to low.
Further, in one possible implementation manner of the embodiment of the present disclosure, the memory management device 400 further includes an updating unit, where the updating unit is configured to:
acquiring the resource occupancy rate of the external equipment, wherein the resource occupancy rate of the external equipment comprises the utilization rate of a central processing unit of the equipment and the occupancy rate of a memory of the equipment;
And updating a preset management rule based on the resource occupancy rate of the external equipment and the target request queue.
Further, in a possible implementation manner of the embodiment of the present disclosure, the updating unit is further configured to:
Acquiring a first load of the external equipment, wherein the first load is an original load of the external equipment;
determining a second load of the equipment based on the resource occupancy rate of the external equipment and the target request queue, wherein the second load is the current load of the external equipment;
And updating the preset management rule based on the second load and the first load.
Further, in a possible implementation manner of the embodiment of the present disclosure, the updating unit is further configured to:
responding to failure of memory allocation for a memory request in a target task queue, acquiring abnormal information of external equipment, wherein the abnormal information of the external equipment comprises at least one of fatal errors, compatibility errors and performance fluctuation;
determining a priority adjustment instruction corresponding to the abnormal information of the external equipment;
based on the priority adjustment instruction, a priority field in the device priority database is updated.
Further, in a possible implementation manner of the embodiment of the present disclosure, the updating unit is further configured to:
Determining an update rule corresponding to the priority adjustment instruction based on the priority adjustment instruction;
and updating the priority field in the equipment priority database by using the updating rule.
Further, in a possible implementation manner of the embodiment of the present disclosure, the memory management device 400 further includes a memory rollback unit, where the memory rollback unit is configured to:
and carrying out rollback processing on the allocated memory in a preset stage based on the updated priority field, wherein the preset stage comprises a hardware layer stage, a system layer stage and an application layer stage.
In this embodiment, device information of the external devices is obtained, and a device priority database is constructed based on the device information, the device priority database comprising at least a device type, a device hot plug state, an external device resource occupancy and a priority. The system runtime memory is divided into a plurality of sub-memory pools based on the priorities of the devices in the device priority database, the sub-memory pools being used to process memory requests of devices with preset degrees of importance, and the sub-memory pools are managed using the preset management rules corresponding to them so as to manage the external device memory. This solves the technical problem that, in a high-density device scenario, memory management efficiency is low because no dynamic adjustment can be made when memory requirements fluctuate, and achieves the technical effects of dynamically adjusting memory resources and improving memory management efficiency.
The description of the features in the embodiment corresponding to the memory management device may refer to the related description of the embodiment corresponding to the memory management method of the external device, which is not described herein in detail.
The embodiment of the application also provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any of the external device memory management method embodiments.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program is configured to execute the steps in any external device memory management method embodiment.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to, various media in which a computer program may be stored, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The embodiment of the application also provides a computer program product, which comprises a computer program, and the computer program realizes the steps in any external device memory management method embodiment when being executed by a processor.
Embodiments of the present application also provide another computer program product, including a non-volatile computer readable storage medium, where the non-volatile computer readable storage medium stores a computer program, where the computer program when executed by a processor implements the steps in any of the external device memory management method embodiments described above.
Those of skill in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps have been described above generally in terms of their functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method and the system for managing the external device memory provided by the application are described in detail. The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.
Claims (8)
1. The method for managing the internal memory of the external device is characterized by comprising the following steps:
acquiring equipment information of external equipment;
constructing a device priority database based on the device information, wherein the device priority database at least comprises a device type, a device hot plug state, an external device resource occupation degree and a priority;
Dividing a system operation time memory into a plurality of sub memory pools based on the priority of the equipment in the equipment priority database, wherein the sub memory pools are used for processing memory requests of equipment with preset importance, the sub memory pools comprise an emergency pool, a dynamic pool and a buffer pool, the emergency pool is used for processing memory requests of key equipment, the dynamic pool is used for processing memory requests of high-priority equipment, and the buffer pool is used for processing memory requests of low-priority equipment;
Managing the sub memory pool by using a preset management rule corresponding to the sub memory pool so as to manage the external device memory;
wherein after dividing the system runtime memory into a plurality of sub-memory pools based on the priority of the device in the device priority database, the method comprises:
Acquiring a memory request to be processed of each sub memory pool in the plurality of sub memory pools;
sorting the memory requests to be processed according to the priority of the equipment in the equipment priority database to obtain a target request queue;
updating the target request queue in response to receiving a new memory request;
And distributing memory for the memory requests in the target task queue according to the order of the priority from high to low.
2. The external device memory management method according to claim 1, wherein after the device priority database is constructed based on the device information, the method comprises:
Acquiring running state information of a system;
And updating the equipment priority database based on the priority of the equipment in the equipment priority database and the running state information of the system.
3. The method for memory management of external device according to claim 1, wherein after allocating memory for memory requests in a target task queue in the order of the priority from high to low, the method comprises:
acquiring the resource occupancy rate of external equipment, wherein the resource occupancy rate of the external equipment comprises the utilization rate of a central processing unit of the equipment and the occupancy rate of a memory of the equipment;
And updating the preset management rule based on the resource occupancy rate of the external equipment and the target request queue.
4. The external device memory management method according to claim 3, wherein updating the preset management rule based on the external device resource occupancy and the target request queue comprises:
Acquiring a first load of the external equipment, wherein the first load is an original load of the external equipment;
determining a second load of the equipment based on the resource occupancy rate of the external equipment and the target request queue, wherein the second load is the current load of the external equipment;
And updating the preset management rule based on the second load and the first load.
5. The method for memory management of external device according to claim 1, wherein after allocating memory for memory requests in a target task queue in the order of the priority from high to low, the method comprises:
responding to failure of memory allocation for the memory request in the target task queue, and acquiring abnormal information of the external equipment, wherein the abnormal information of the external equipment comprises at least one of fatal error, compatibility error and performance fluctuation;
determining a priority adjustment instruction corresponding to the abnormal information of the external equipment;
and updating a priority field in the equipment priority database based on the priority adjustment instruction.
6. The method of claim 5, wherein updating the priority field in the device priority database based on the priority adjustment instruction comprises:
Determining an update rule corresponding to the priority adjustment instruction based on the priority adjustment instruction;
and updating the priority field in the equipment priority database by using the updating rule.
7. The method of claim 6, wherein after updating the priority field in the device priority database with the update rule, the method further comprises:
and carrying out rollback processing on the allocated memory in a preset stage based on the updated priority field, wherein the preset stage comprises a hardware layer stage, a system layer stage and an application layer stage.
8. The memory management system is characterized by comprising an external device module, a firmware module, a memory management module and a real-time monitoring module;
The external equipment module is used for acquiring external equipment on the bus;
The firmware module is used for acquiring equipment information corresponding to the external equipment and constructing an equipment priority database based on the equipment information;
The memory management module is used for dividing the system operation memory into a plurality of sub memory pools based on the priority of the equipment in the equipment priority database, and managing the sub memory pools by utilizing a preset management rule corresponding to the sub memory pools so as to manage the external equipment memory, wherein the sub memory pools comprise an emergency pool, a dynamic pool and a cache pool, the emergency pool is used for processing memory requests of key equipment, the dynamic pool is used for processing memory requests of high-priority equipment, and the cache pool is used for processing memory requests of low-priority equipment;
the real-time monitoring module is used for acquiring the running state information of the system and updating the equipment priority database based on the priority of the equipment in the equipment priority database and the running state information of the system;
Wherein after dividing the system run-time memory into a plurality of sub-memory pools based on the priority of the device in the device priority database, the method further comprises:
Acquiring a memory request to be processed of each sub memory pool in the plurality of sub memory pools;
sorting the memory requests to be processed according to the priority of the equipment in the equipment priority database to obtain a target request queue;
updating the target request queue in response to receiving a new memory request;
And distributing memory for the memory requests in the target task queue according to the order of the priority from high to low.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202510884520.1A CN120386639B (en) | 2025-06-27 | 2025-06-27 | External device memory management method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202510884520.1A CN120386639B (en) | 2025-06-27 | 2025-06-27 | External device memory management method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN120386639A (en) | 2025-07-29 |
CN120386639B (en) | 2025-09-12 |
Family
ID=96492940
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202510884520.1A CN120386639B (en) | Active | 2025-06-27 | 2025-06-27 | External device memory management method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN120386639B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119336513A (en) * | 2024-12-18 | 2025-01-21 | 苏州元脑智能科技有限公司 | Memory pool management method and device, storage medium, and electronic device |
CN120066831A (en) * | 2025-01-02 | 2025-05-30 | 北京智芯微电子科技有限公司 | Method, system, electronic device and storage medium for detecting memory management function |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100359489C (en) * | 2004-07-13 | 2008-01-02 | 中兴通讯股份有限公司 | Method for internal memory allocation in the embedded real-time operation system |
CN111211919B (en) * | 2019-12-23 | 2023-07-28 | 南京壹格软件技术有限公司 | Internet of things intelligent gateway configuration method special for data center machine room |
CN118260211A (en) * | 2022-12-27 | 2024-06-28 | 华为终端有限公司 | Memory management method and related device |
CN117234691A (en) * | 2023-09-27 | 2023-12-15 | 北京奥星贝斯科技有限公司 | Task scheduling method and device |
CN117421116A (en) * | 2023-10-26 | 2024-01-19 | 腾讯科技(深圳)有限公司 | Service processing method, device, computer equipment, storage medium and program product |
CN118869810A (en) * | 2024-07-03 | 2024-10-29 | 天津大学 | A network aggregation task scheduling system and method based on time-space compatibility |
CN119088530A (en) * | 2024-09-12 | 2024-12-06 | 深信服科技股份有限公司 | I/O request scheduling method, device, electronic device and storage medium |
CN119583579A (en) * | 2024-11-27 | 2025-03-07 | 山东省计算中心(国家超级计算济南中心) | A cloud memory pool priority flow control method and system based on RDMA |
Also Published As
Publication number | Publication date |
---|---|
CN120386639A (en) | 2025-07-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10333859B2 (en) | Multi-tenant resource coordination method | |
WO2019218708A1 (en) | Task processing method and device, and computer system | |
US20170017511A1 (en) | Method for memory management in virtual machines, and corresponding system and computer program product | |
US20130305243A1 (en) | Server system and resource management method and program | |
US20140137121A1 (en) | Job management system and job control method | |
JP2013509658A (en) | Allocation of storage memory based on future usage estimates | |
WO2011076608A2 (en) | Goal oriented performance management of workload utilizing accelerators | |
CN118034917A (en) | PCIe resource allocation method and device, electronic equipment and storage medium | |
US20140156853A1 (en) | Computer and resource retrieval method | |
CN116450328A (en) | Memory allocation method, memory allocation device, computer equipment and storage medium | |
JP6617461B2 (en) | Control device, control program, and control method | |
CN120238511A (en) | System and method for memory resource allocation of multi-level switch firmware combination | |
US10783096B2 (en) | Storage system and method of controlling I/O processing | |
US20070174836A1 (en) | System for controlling computer and method therefor | |
CN104794000A (en) | Work scheduling method and system | |
CN120386639B (en) | External device memory management method and system | |
JP2006323872A (en) | Spare resource providing method and computer system for logical partition | |
CN115756727B (en) | Kubernetes optimization scheduling method and system based on running virtual machines in containers | |
CN115658295A (en) | Resource scheduling method and device, electronic equipment and storage medium | |
JP2002278778A (en) | Scheduling device in symmetric multiprocessor system | |
CN120256134B (en) | Storage resource allocation method, electronic device, storage medium and program product | |
CN120085939B (en) | Resource management method, device, storage medium, and program product | |
JP4997063B2 (en) | Computer startup method and computer system | |
CN1773458A (en) | Method and controller for managing resource element queues | |
US20230393882A1 (en) | Management of virtual machine shutdowns in a computing environment based on resource locks |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |