
CN121166439A - Starting method, device, equipment, medium and program product of BMC firmware - Google Patents

Starting method, device, equipment, medium and program product of BMC firmware

Info

Publication number
CN121166439A
Authority
CN
China
Prior art keywords
image file
bmc firmware
partition
restarting
starting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202511232162.2A
Other languages
Chinese (zh)
Inventor
杨清娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Maiwei Intelligent Technology Co ltd
Original Assignee
Jinan Maiwei Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Maiwei Intelligent Technology Co ltd filed Critical Jinan Maiwei Intelligent Technology Co ltd
Priority to CN202511232162.2A priority Critical patent/CN121166439A/en
Publication of CN121166439A publication Critical patent/CN121166439A/en
Pending legal-status Critical Current

Landscapes

  • Stored Programmes (AREA)

Abstract

本申请公开了一种BMC固件的启动方法、装置、设备、介质及程序产品,涉及服务器管理技术领域,包括:在主用镜像文件损坏时,基于备用镜像文件重新启动BMC固件,若启动未成功,则基于远程服务器的最新镜像文件重新启动BMC固件,若启动未成功,则基于本地最小化镜像文件重新启动BMC固件,因此,可以解决双镜像备份机制仍面临系统瘫痪的问题,通过预先部署的多个镜像文件构建多级镜像文件恢复机制,在BMC固件启动异常时,根据多级镜像文件恢复机制逐步恢复系统,达到确保系统的最小可用性,提高系统可靠性和安全性的技术效果。

This application discloses a boot method, apparatus, device, medium, and program product for BMC firmware, relating to the field of server management technology. The method includes: restarting the BMC firmware based on a backup image file when the primary image file is corrupted; if booting fails, restarting the BMC firmware based on the latest image file from a remote server; if booting fails again, restarting the BMC firmware based on a locally minimized image file. Therefore, it can solve the problem of system paralysis still faced by dual-image backup mechanisms. By constructing a multi-level image file recovery mechanism through multiple pre-deployed image files, the system is gradually restored according to the multi-level image file recovery mechanism when the BMC firmware boots abnormally, achieving the technical effect of ensuring minimum system availability and improving system reliability and security.

Description

Starting method, device, equipment, medium and program product of BMC firmware
Technical Field
The present application relates to the field of server management technologies, and in particular, to a method, an apparatus, a device, a medium, and a program product for starting BMC firmware.
Background
The BMC (Baseboard Management Controller) is a dedicated management and control unit on a server, and its firmware is typically burned into Flash memory. BMC firmware typically consists of four parts: a uboot partition, a kernel partition, a rofs partition, and a rwfs partition. The rwfs partition is a readable and writable partition that stores user configuration and runtime data. Because this partition is read and written frequently while the system runs, bad blocks can develop in it and prevent the system from starting.
In the prior art, a dual-image backup mechanism is adopted: the BMC firmware stores two independent image files on two independent Flash devices. The BMC firmware starts from the first Flash by default and, when starting from the first Flash fails, automatically switches to the second Flash, which improves the robustness of the system. However, if the standby image also has a problem and cannot be started normally, the system is paralyzed and cannot repair itself, which affects normal operation of the service.
Disclosure of Invention
The application provides a method, a device, equipment, a medium and a program product for starting BMC firmware, which at least solve the problem that the BMC firmware cannot be ensured to be started successfully in the related technology.
The application provides a starting method of BMC firmware, which comprises: when the BMC firmware is started based on a primary image file, if the primary image file is damaged, restarting the BMC firmware based on a standby image file and judging whether that restart is successful; if restarting the BMC firmware based on the standby image file is unsuccessful, restarting the BMC firmware based on a latest image file of a remote server and judging whether that restart is successful; and if restarting the BMC firmware based on the latest image file is unsuccessful, restarting the BMC firmware based on a local minimized image file.
The application further provides a starting device of BMC firmware, which comprises a first-stage start recovery module, a second-stage start recovery module, and a third-stage start recovery module. The first-stage start recovery module is used for, when the BMC firmware is started based on the primary image file and the primary image file is damaged, restarting the BMC firmware based on the standby image file and judging whether that restart is successful. The second-stage start recovery module is used for, if restarting the BMC firmware based on the standby image file is unsuccessful, restarting the BMC firmware based on the latest image file of the remote server and judging whether that restart is successful. The third-stage start recovery module is used for, if restarting the BMC firmware based on the latest image file is unsuccessful, restarting the BMC firmware based on the local minimized image file.
The application also provides a computer device, which comprises a memory and a processor in communication connection with each other, wherein the memory stores computer instructions and the processor executes the computer instructions so as to perform the starting method of the BMC firmware of the first aspect or any corresponding implementation.
The present application also provides a computer-readable storage medium storing computer instructions, where the computer instructions are configured to cause a computer to execute the starting method of the BMC firmware according to the first aspect or any corresponding implementation.
The present application also provides a computer program product, including computer instructions for causing a computer to execute the starting method of the BMC firmware of the first aspect or any corresponding implementation.
According to the application, when the primary image file is damaged, the BMC firmware is restarted based on the standby image file; if that start is unsuccessful, the BMC firmware is restarted based on the latest image file of the remote server; and if that start is also unsuccessful, the BMC firmware is restarted based on the local minimized image file. This solves the problem that a dual-image backup mechanism can still leave the system paralyzed. A multi-level image file recovery mechanism is constructed from a plurality of image files deployed in advance, and when the BMC firmware starts abnormally, the system is recovered step by step according to that mechanism, achieving the technical effects of ensuring minimum availability of the system and improving its reliability and security.
Drawings
For a clearer description of embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
Fig. 1 is a flowchart of a method for starting BMC firmware according to an embodiment of the present application;
Fig. 2 is a flowchart of another method for starting BMC firmware according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the firmware partitions in another method for starting BMC firmware according to an embodiment of the present application;
Fig. 4 is a schematic diagram of the primary image file damage detection procedure in another method for starting BMC firmware according to an embodiment of the present application;
Fig. 5 is a flowchart of a method for starting BMC firmware according to another embodiment of the present application;
Fig. 6 is a schematic flowchart of image switching in a method for starting BMC firmware according to an embodiment of the present application;
Fig. 7 is a block diagram of a starting device of BMC firmware according to an embodiment of the present application;
Fig. 8 is a schematic diagram of the hardware structure of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without making any inventive effort are within the scope of the present application.
It should be noted that in the description of the present application, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and the like in this specification are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The BMC is a dedicated microcontroller embedded in the motherboard of a server, data center device, or high-end computer. It is mainly used for hardware monitoring, remote management, and fault handling independently of the main system, and is a core component of remote server operation and maintenance. The BMC firmware is dedicated software embedded in the BMC and is effectively the BMC's operating system: it directly drives the BMC hardware, implements the remote management protocols, monitors hardware state, and provides the interface through which an administrator interacts with the BMC. It is the core carrier of the BMC's functions.
BMC firmware typically consists of four parts: a uboot (Universal Boot Loader) partition, a kernel partition, a rofs (Read-Only File System) partition, and a rwfs (Read-Write File System) partition. The uboot is the first program to run after the BMC is powered on; it is responsible for hardware initialization, kernel booting, and boot mode selection, and is the entry point of BMC startup. The kernel is the BMC's resource manager; it sits between the uboot and the root file system and is responsible for hardware resource scheduling and program execution management. The rofs is the BMC's system disk; it stores the unmodified core programs, configuration, and drivers, and ensures the stability of the system's basic functions. The rwfs is the BMC's data disk; it stores dynamically changing data and user configuration, and is the partition most prone to abnormality.
In the related art, two independent Flash devices store two independent image files. The BMC starts from the first Flash by default and automatically starts from the second Flash when the first fails. If the standby image cannot be started normally either, the system is paralyzed and cannot repair itself, which affects normal operation of the service. In addition, in the related art, write operations to the Flash are prohibited as soon as an AC power loss of the BMC management system is detected, in order to prevent data loss and file system damage. This does not effectively protect files that are being written at the moment of power loss, so automatic recovery is not possible when a partition is damaged abnormally. The starting method of BMC firmware provided by the embodiments of the present application therefore starts the BMC firmware through a multi-level image file recovery mechanism, so as to achieve minimum availability of the system and ensure normal operation of the service.
The present application will be further described in detail below with reference to the drawings and detailed description for the purpose of enabling those skilled in the art to better understand the aspects of the present application.
The specific application environment architecture or specific hardware architecture upon which execution of the boot method of the BMC firmware depends is described herein.
An embodiment of the present application provides a method for starting BMC firmware, which can be used on the server where the BMC firmware is located. Fig. 1 is a flowchart of a method for starting BMC firmware according to an embodiment of the present application. As shown in Fig. 1, the flow includes the following steps:
Step S101: when the BMC firmware is started based on the primary image file, if the primary image file is damaged, restart the BMC firmware based on the standby image file, and determine whether restarting the BMC firmware based on the standby image file is successful.
Specifically, in the embodiment of the present application, the primary image file is the image from which the BMC firmware normally starts. It is stored in the primary partition of the local Flash and contains all functions of the BMC firmware, such as remote control, hardware monitoring, log management, and the user configuration interface. Each time the BMC firmware system starts, the primary image file is therefore loaded first. However, if the primary image file is detected to be damaged, for example because a Flash bad block has caused the primary partition to lose the primary image file, the BMC firmware cannot be started successfully from it.
In some alternative embodiments, the standby image file serves as the local fallback when the primary image fails. It is stored in a standby partition of the local Flash that is independent of the primary partition, so that damage to one area does not affect the other. The standby image file is consistent or compatible with the primary image in content and is generally updated in step with it, so that the two match in function and differ only in storage partition. When the primary image file is damaged, the standby image file is loaded and the BMC firmware is started from it. Because the standby image is consistent or compatible with the primary image, it can directly take over all operation and maintenance work of the BMC firmware and preserve the full functionality of the BMC.
In some alternative embodiments, to guard against both image files being damaged at the same time, the present application provides a multi-level image file recovery function: when the BMC firmware is started based on the standby image file, it is determined whether the start succeeds, and the image file is switched if it does not.
Step S102: if restarting the BMC firmware based on the standby image file is unsuccessful, restart the BMC firmware based on the latest image file of the remote server, and determine whether restarting the BMC firmware based on the latest image file is successful.
Specifically, in the embodiment of the application, if the standby image file fails to start the BMC firmware, both the primary and standby image files are considered damaged. To preserve as much BMC firmware functionality as possible, the latest image file on the remote server (or cloud) is then used as a remote rescue image. Because the latest image file is stored on the remote server, it must first be downloaded to a local temporary partition. It is newer than the primary and standby image files, containing the latest feature iterations, vulnerability fixes, and hardware compatibility optimizations, while being as complete in content as the primary image file.
In some optional embodiments, obtaining the latest image file from the remote server overcomes the limitations of the local images (the primary or standby image), repairs known security vulnerabilities and defects of historical versions, and improves the security and stability of the system.
In some alternative embodiments, the latest image file may fail to download, or the BMC firmware may fail to start after the file has been downloaded to the local temporary partition. In either case the BMC firmware must be restarted once more under the multi-level image file recovery function. Therefore, when the BMC firmware is started based on the latest image file, it is determined whether the start succeeds, and the image file is switched again if it does not.
Step S103: if starting the BMC firmware based on the latest image file is unsuccessful, restart the BMC firmware based on the local minimized image file.
Specifically, in the embodiment of the application, the primary image file, the standby image file, and the latest image file can each provide the full functionality of the BMC firmware. When none of the three can start the BMC firmware successfully, the local minimized image file is used as the last-resort boot image. The minimized image file is a minimal image installed locally at the factory. It retains only the basic boot core of the BMC firmware, such as a minimal kernel, basic hardware drivers, and the administrator login interface, and does not support advanced functions such as the graphical interface or log analysis. It remains in its original factory state after shipment, which ensures reliability in extreme scenarios. The minimized image file is stored in a reserved partition of the local Flash under multiple layers of read-only protection, and is physically isolated from the primary and standby partitions to prevent accidental erasure.
In some optional embodiments, when the primary image file, the standby image file, and the latest image file all fail to start the BMC firmware, the factory-default minimized image file is loaded from the locally reserved partition, the abnormal firmware information is erased, and the minimized image is written. The BMC firmware is then started based on the minimized image file, which ensures minimum availability of the system. An alarm log is generated to prompt an administrator to intervene manually.
According to the application, when the primary image file is damaged, the BMC firmware is restarted based on the standby image file; if that start is unsuccessful, the BMC firmware is restarted based on the latest image file of the remote server; and if that start is also unsuccessful, the BMC firmware is restarted based on the local minimized image file. This solves the problem that a dual-image backup mechanism can still leave the system paralyzed. A multi-level image file recovery mechanism is constructed from a plurality of image files deployed in advance, and when the BMC firmware starts abnormally, the system is recovered step by step according to that mechanism, achieving the technical effects of ensuring minimum availability of the system and improving its reliability and security.
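The fallback order of steps S101 to S103 can be illustrated with a minimal sketch. It is not the claimed implementation: the function name boot_with_fallback, the stage names, and the idea of modelling each boot attempt as a callable that returns True on success are assumptions made only for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical model: each boot attempt is a callable returning True when
# the BMC firmware started successfully from that image.
BootAttempt = Callable[[], bool]


def boot_with_fallback(stages: Sequence[Tuple[str, BootAttempt]]) -> Optional[str]:
    """Try each image in order; return the name of the first one that boots."""
    for name, attempt in stages:
        try:
            if attempt():
                return name
        except Exception:
            # Assumption: a failed download or load counts as a failed boot,
            # so the next recovery level is tried.
            continue
    return None  # every level failed; manual intervention is needed


def demo() -> Optional[str]:
    stages = [
        ("primary", lambda: False),          # primary image damaged
        ("standby", lambda: False),          # step S101 fails
        ("remote_latest", lambda: False),    # step S102 fails
        ("local_minimized", lambda: True),   # step S103 succeeds
    ]
    return boot_with_fallback(stages)
```

Keeping the stages in an ordered list makes the recovery order explicit and lets a deployment add or drop a level without changing the control flow.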
The embodiment of the application also provides a method for starting BMC firmware, which can be used on the server where the BMC firmware is located. Fig. 2 is a flowchart of the method for starting BMC firmware according to the embodiment of the application. As shown in Fig. 2, the flow includes the following steps:
Step S201: when the BMC firmware is started based on the primary image file, if the primary image file is damaged, restart the BMC firmware based on the standby image file, and determine whether restarting the BMC firmware based on the standby image file is successful.
Specifically, step S201 includes:
Step S2011: load the boot loader partition based on the primary image file, and perform anomaly detection on the boot loader partition, the kernel partition, the read-only file system partition, and the read-write file system partition, so as to determine the health status of each partition, where the health status is a normal state, a warning state, or a severely abnormal state.
Specifically, in the embodiment of the present invention, as shown in Fig. 3, the BMC firmware generally consists of a uboot (boot loader) partition, a kernel partition, a rofs (read-only file system) partition, and a rwfs (read-write file system) partition. When the BMC firmware is started based on the primary image file, the boot loader in the uboot partition is loaded first. The uboot then performs a full scan of its own partition, the kernel partition, the rofs partition, and the rwfs partition to obtain detection data for each partition, and performs anomaly detection on each partition according to that data, so as to determine the health status of each partition.
In some optional embodiments, step S2011 includes:
Step a1: after the boot loader partition has scanned all preset partitions, perform a data integrity check on the scan result, and judge whether an abnormal region exists in the boot loader partition, the kernel partition, the read-only file system partition, and/or the read-write file system partition.
Step a2: if an abnormal region exists, generate abnormal region metadata information, and compare it with a preset bad block table.
Step a3: if the preset bad block table does not contain the abnormal region metadata information, write the abnormal region metadata information to a metadata area; otherwise, do not write it.
Step a4: determine the health status of the boot loader partition, the kernel partition, the read-only file system partition, and the read-write file system partition according to the metadata information in the metadata area.
Specifically, in the embodiment of the present application, as shown in Fig. 4, the uboot scans all the partitions to obtain detection data, and ECC (Error Checking and Correction), a means of checking and correcting memory errors, is called to check the integrity of that data. When the data check fails, metadata information for the abnormal region is recorded, comprising a partition identifier, a starting address, an ending address, an error type, and a timestamp. By checking the health of each BMC firmware partition, abnormal conditions of the BMC firmware can be determined accurately, and hence whether the BMC firmware needs to be restarted from a different image file.
In some optional embodiments, a pre-stored Bad Block Table (BBT) is read, which stores information on regions that can be recovered automatically. The abnormal region metadata information is compared with the list of metadata information in the Bad Block Table to determine whether it describes a region already listed there. If the Bad Block Table does not contain the abnormal region metadata information, that information is written to the metadata area. By detecting abnormalities in each partition and comparing them against the Bad Block Table, abnormalities that can be recovered automatically are screened out, and partition health is judged only on abnormalities that cannot be recovered automatically, so that the multi-level recovery mechanism is triggered accurately.
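Steps a2 and a3 can be sketched as follows. The record fields follow the metadata listed above (partition identifier, starting address, ending address, error type, timestamp). The class and function names, and the choice to identify a region by its partition and address range, are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

RegionKey = Tuple[str, int, int]


@dataclass(frozen=True)
class AbnormalRegion:
    partition: str     # partition identifier, e.g. "rwfs"
    start: int         # starting address
    end: int           # ending address
    error_type: str    # e.g. "ecc_uncorrectable" (hypothetical label)
    timestamp: float   # time the error was recorded

    def key(self) -> RegionKey:
        # Assumption: a region is identified by partition and address range.
        return (self.partition, self.start, self.end)


def record_new_regions(
    found: Iterable[AbnormalRegion],
    bad_block_table: Set[RegionKey],
    metadata_area: List[AbnormalRegion],
) -> List[AbnormalRegion]:
    """Write to the metadata area only regions absent from the bad block table."""
    added = []
    for region in found:
        if region.key() in bad_block_table:
            continue  # already known as automatically recoverable: no update
        metadata_area.append(region)
        added.append(region)
    return added
```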
In some alternative embodiments, as shown in Fig. 4, the health status of each firmware partition is classified according to the metadata information in the metadata area. In the normal state, the partition has no bad blocks and passes the ECC check, and the system can be used normally. In the warning state, a small number of known, repairable bad blocks are detected in the partition (for example, the data is usable after ECC error correction); alarm information is recorded and a background repair mechanism is triggered. In the severely abnormal state, the number of physical bad blocks in the partition exceeds a threshold or key data is known to be damaged; severe abnormality information is recorded and the multi-level recovery mechanism is triggered.
Step S2012: determine whether the severely abnormal state is present among the health statuses of the boot loader partition, the kernel partition, the read-only file system partition, and the read-write file system partition.
Step S2013: if the health status of at least one partition is the severely abnormal state, judge that the primary image file is damaged.
Specifically, in the embodiment of the present invention, if the health status of any one of the uboot partition, the kernel partition, the rofs partition, and the rwfs partition is the severely abnormal state, the primary image file is determined to be damaged, and the three-level image file recovery mechanism is triggered to repair the system step by step.
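The three-state classification and the decision in steps S2012 and S2013 can be sketched as below. The text does not give a threshold value or say how bad blocks are counted, so the threshold parameter and the function signatures here are assumptions.

```python
from enum import Enum
from typing import Dict


class Health(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    SEVERE = "severely abnormal"


def classify(bad_blocks: int, key_data_damaged: bool, threshold: int) -> Health:
    """Classify one partition; `threshold` is a hypothetical tunable."""
    if key_data_damaged or bad_blocks > threshold:
        return Health.SEVERE   # triggers the multi-level recovery mechanism
    if bad_blocks > 0:
        return Health.WARNING  # repairable: record an alarm, repair in background
    return Health.NORMAL


def primary_image_damaged(health: Dict[str, Health]) -> bool:
    """Step S2013: one severely abnormal partition marks the image as damaged."""
    return any(state is Health.SEVERE for state in health.values())
```

For example, a partition map in which only rwfs is in the warning state does not trigger recovery, while a single severely abnormal kernel partition does.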
Step S202: if restarting the BMC firmware based on the standby image file is unsuccessful, restart the BMC firmware based on the latest image file of the remote server, and determine whether restarting the BMC firmware based on the latest image file is successful. For details, see step S102 in the embodiment shown in Fig. 1; they are not repeated here.
Step S203: if starting the BMC firmware based on the latest image file is unsuccessful, restart the BMC firmware based on the local minimized image file. For details, see step S103 in the embodiment shown in Fig. 1; they are not repeated here.
Step S204: create a background repair thread, and repair the primary image file based on the background repair thread.
Specifically, in the embodiment of the application, when damage to the primary image file is detected, the system switches to another image file under the multi-level image file recovery mechanism, and a background automatic repair function is triggered at the same time by an event trigger mechanism. That is, after the BMC firmware has been started successfully from another image file, a low-priority background repair thread is created automatically, so that service functions of the running system take precedence while the primary image file is being repaired.
In some alternative embodiments, if both the primary image file and the standby image file are damaged, a background automatic repair thread is also created for the standby image file in order to repair it.
Step S205: when data is updated, synchronously update the data to the corresponding area of the primary image file.
Specifically, in the embodiment of the invention, a data synchronization service is created in the background repair process. For example, when the primary image file is abnormal and the BMC firmware has been started successfully based on the standby image file, any data updates made to the configuration or during operation under the standby image file are synchronized incrementally to the primary image file. This repairs bad blocks automatically, keeps data consistent while the system runs, reduces manual intervention, and provides redundant backup.
Step S206: if the corresponding area of the data in the primary image file is an unrecoverable area, mark the abnormal area as a permanently disabled area, and update the data to a spare block pool.
Specifically, in the embodiment of the present invention, if the corresponding area of the data in the primary image file is an unrecoverable area, a spare block pool is allocated for that area. Under a periodic trigger mechanism, the bad block records in the metadata area are read periodically, the valid data of the corresponding area is read from the bad block and written into the reserved spare block pool, and the metadata is updated to mark the original bad block as "permanently disabled" and to map its address to the new location in the spare block pool.
In some optional embodiments, by adding a data synchronization function to the background repair process, the application can synchronize data from the running image file to the corresponding data block of the damaged image file while the bad blocks of the damaged image file are being repaired. In addition, if the corresponding data block is found to be a bad block that cannot be repaired, a spare block pool is allocated for it, and when the primary image file runs later, data that would have been written to the bad block is written directly to the spare block pool, which ensures normal use of the primary image file.
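The spare block pool handling of step S206 can be sketched with an in-memory model. This is a simplified illustration under stated assumptions: the BlockRemapper class, its method names, and the dictionary used in place of real Flash storage are all hypothetical.

```python
from typing import Dict, List, Set


class BlockRemapper:
    """In-memory model of bad-block retirement with a reserved spare pool."""

    def __init__(self, spare_blocks: List[int]) -> None:
        self.spare_pool: List[int] = list(spare_blocks)  # free spare blocks
        self.remap: Dict[int, int] = {}                  # bad block -> spare block
        self.disabled: Set[int] = set()                  # permanently disabled
        self.storage: Dict[int, bytes] = {}              # stand-in for Flash

    def resolve(self, block: int) -> int:
        """Return the block that actually holds data for `block`."""
        return self.remap.get(block, block)

    def retire(self, bad_block: int) -> int:
        """Mark a block permanently disabled and move its data to a spare."""
        if bad_block in self.remap:
            return self.remap[bad_block]
        if not self.spare_pool:
            raise RuntimeError("spare block pool exhausted")
        spare = self.spare_pool.pop(0)
        self.disabled.add(bad_block)
        self.remap[bad_block] = spare
        if bad_block in self.storage:
            # Copy whatever valid data is still readable to the spare block.
            self.storage[spare] = self.storage.pop(bad_block)
        return spare

    def write(self, block: int, data: bytes) -> None:
        # Later updates aimed at the bad block land in the spare block.
        self.storage[self.resolve(block)] = data

    def read(self, block: int) -> bytes:
        return self.storage[self.resolve(block)]
```

Callers keep using the original block address; the mapping table redirects reads and writes, which matches the described behaviour of updating data "directly to the spare block pool".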
According to the application, when the primary image file is damaged, the BMC firmware is restarted based on the standby image file; if that start is unsuccessful, the BMC firmware is restarted based on the latest image file of the remote server; and if that start is also unsuccessful, the BMC firmware is restarted based on the local minimized image file. This solves the problem that a dual-image backup mechanism can still leave the system paralyzed. A multi-level image file recovery mechanism is constructed from a plurality of image files deployed in advance, and when the BMC firmware starts abnormally, the system is recovered step by step according to that mechanism, achieving the technical effects of ensuring minimum availability of the system and improving its reliability and security.
The embodiment of the application also provides a method for starting BMC firmware, which can be used on the server where the BMC firmware is located. Fig. 5 is a flowchart of the method for starting BMC firmware according to the embodiment of the application. As shown in Fig. 5, the flow includes the following steps:
Step S501: when the BMC firmware is started based on the primary image file, if the primary image file is damaged, restart the BMC firmware based on the standby image file, and determine whether restarting the BMC firmware based on the standby image file is successful.
Specifically, step S501 includes:
Step S5011, obtaining the standby image file.
Specifically, in the embodiment of the invention, the standby image file is loaded in the standby partition of the Flash.
Step S5012, verifying the integrity and the validity of the standby image file, and if the verification is passed, starting the BMC firmware based on the standby image file.
Specifically, in the embodiment of the invention, as shown in fig. 6, the integrity of the standby image file is verified by means of a hash checksum, to confirm that the standby image file is fully consistent with the originally released version and to rule out content abnormalities caused by storage medium faults (such as Flash bad blocks) or malicious attacks. The validity check confirms that the standby image file is compatible with the current BMC hardware and that the core boot components (such as uboot and the kernel) can operate normally, thereby avoiding a startup failure caused by an image that is complete but unsuitable. If the verification passes, the BMC firmware is started based on the standby image file; if the verification fails, the BMC firmware is started based on the latest image file of the remote server.
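As a hedged illustration of the integrity and validity check, the sketch below compares a SHA-256 digest of the image against a reference digest and checks a board identifier. The choice of SHA-256, the function name `verify_image`, and the board identifier strings are assumptions; the application only states that a hash checksum and a hardware compatibility check are used.

```python
import hashlib
import hmac


def verify_image(image: bytes, expected_sha256: str,
                 image_board: str, current_board: str) -> bool:
    """Return True only if the image is intact and targets the current hardware."""
    digest = hashlib.sha256(image).hexdigest()
    # Integrity: the digest must match the reference of the released version.
    intact = hmac.compare_digest(digest, expected_sha256.lower())
    # Validity: the image must be built for this board (assumed identifier check).
    compatible = image_board == current_board
    return intact and compatible


image = b"example standby image contents"
reference = hashlib.sha256(image).hexdigest()

ok = verify_image(image, reference, "board-a", "board-a")
tampered = verify_image(image + b"\x00", reference, "board-a", "board-a")
wrong_board = verify_image(image, reference, "board-b", "board-a")
```

The third call shows the "complete but unsuitable" case: the digest matches, yet the image is rejected because it targets different hardware.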
Step S5013, judging whether the BMC firmware is started successfully.
Specifically, in the embodiment of the application, whether the BMC firmware has started successfully is judged by checking, in sequence, whether uboot booting and hardware initialization succeeded, whether kernel loading and root file system mounting succeeded, whether the BMC core services started successfully, whether the management interface can be accessed normally, and so on. If the start fails, the root cause of the fault in the standby image file is located, and after the fault is repaired, starting the BMC firmware based on the standby image file is attempted again. In the embodiment of the application, the metadata area information is read through the uboot partition, the metadata records are parsed, and the abnormal area in the standby image file is determined. If the abnormal partitions of the standby image file are concentrated in the rwfs partition, a factory reset is performed and the BMC is then reset; that is, the built-in factory reset script of the BMC is called, and because this script acts only on the rwfs partition, the abnormality of the rwfs partition is repaired.
In some optional embodiments, when there is a data exception in the uboot partition, the kernel partition, and the rofs partition in the standby image file, the corresponding partition data in the standby image file may be overlaid on the corresponding partition data in the main image file. After the fault repair is completed, starting the BMC firmware based on the standby image file is attempted again; if the start fails, a background recovery thread is started and the BMC firmware is started based on the latest image file of the remote server.
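The sequential start-up check of step S5013 can be sketched as an ordered list of stage checks that stops at the first failure. The stage names and the callable-based checks are hypothetical; only the order of the checks is taken from the description.

```python
from typing import Callable, Dict, Optional

# Check order taken from the description; the stage names are illustrative.
BOOT_STAGES = ("uboot_hw_init", "kernel_rootfs", "core_services", "mgmt_interface")


def first_failed_stage(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the checks in boot order and return the first failing stage, or None."""
    for stage in BOOT_STAGES:
        if not checks[stage]():
            return stage
    return None


checks = {
    "uboot_hw_init": lambda: True,
    "kernel_rootfs": lambda: True,
    "core_services": lambda: False,   # simulated failure of the BMC core services
    "mgmt_interface": lambda: True,
}
failed = first_failed_stage(checks)
boot_ok = failed is None
```

Returning the failing stage, rather than only a boolean, gives the fault-location step a starting point for deciding which partition to repair.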
Step S502, if restarting the BMC firmware based on the standby image file is unsuccessful, restarting the BMC firmware based on the latest image file of the remote server, and determining whether restarting the BMC firmware based on the latest image file is successful.
Specifically, in the embodiment of the present invention, the step S502 includes:
Step S5021, obtaining configuration information of the remote server, and performing bidirectional authentication with the remote server based on the configuration information.
Specifically, in the embodiment of the present invention, as shown in fig. 6, when the standby image file fails to start the BMC firmware, the configuration information of the remote server is read and bidirectional authentication with the remote server is attempted; if the authentication fails, the BMC firmware is started directly based on the minimized image file.
Step S5022, if the authentication is passed, a system recovery request is generated, and the system recovery request is sent to the remote server to obtain the latest image file of the remote server.
Specifically, in the embodiment of the application, if the bidirectional authentication between the BMC firmware system and the remote server passes, a system recovery request is generated and sent to the remote server, the latest image file of the remote server is downloaded, signature verification and integrity checking are completed, and the latest image file is written into the main partition. By obtaining the latest image file of the remote server, the application overcomes the limitations of the local images (the primary image and the standby image), fixes known security vulnerabilities or defects of historical versions, and improves the security and stability of the system.
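The control flow of the remote recovery stage can be outlined as follows. This is a sketch under stated assumptions: the function `remote_recovery` and its four callbacks are hypothetical placeholders for the authentication, download, verification, and flash-write steps. Falling back to the minimized image when the download or verification fails is also an assumption; the description states this fallback explicitly only for an authentication failure.

```python
from typing import Callable, List, Optional


def remote_recovery(authenticate: Callable[[], bool],
                    download: Callable[[], Optional[bytes]],
                    verify: Callable[[bytes], bool],
                    write_primary: Callable[[bytes], None]) -> str:
    """Decide the next boot target for the remote recovery stage."""
    if not authenticate():
        # Stated in the description: skip straight to the minimized image.
        return "boot_minimal_image"
    image = download()
    if image is None or not verify(image):
        # Assumption: treat a bad download like a failed stage.
        return "boot_minimal_image"
    write_primary(image)  # write the verified image into the main partition
    return "boot_latest_image"


written: List[bytes] = []
result_ok = remote_recovery(lambda: True, lambda: b"img", lambda img: True, written.append)
result_auth_fail = remote_recovery(lambda: False, lambda: b"img", lambda img: True, written.append)
```

Passing the steps in as callbacks keeps the decision logic separate from the transport and flash details, which a real implementation would supply.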
Step S5023, starting the BMC firmware based on the latest image file, and judging whether the BMC firmware is started successfully. For details, refer to step S5013, which is not repeated here.
Step S503, if starting the BMC firmware based on the latest image file is unsuccessful, restarting the BMC firmware based on the local minimized image file. For details, refer to step S103 in the embodiment shown in fig. 1, which is not repeated here.
Step S504, obtaining a power supply signal and judging from the power supply signal whether power has been lost; if power has been lost, freezing write operations to the read-write file system partition, triggering the added energy-storage capacitor to supply energy, and completing the write operation in progress.
Specifically, in the embodiment of the present application, a power signal detected by a CPLD (Complex Programmable Logic Device) is obtained, where a CPLD is a digital integrated circuit whose logic functions are constructed by the user as needed. Whether the BMC firmware system loses power during operation is detected from the power signal. If power is lost, write operations to the rwfs partition are frozen immediately and no new write operation is executed. However, if a write operation is in progress at that moment, the added energy-storage capacitor supplies power so that the current write operation can be completed. By adding the energy-storage capacitor, the application provides a cooperative power-down protection function: the system responds promptly when power is lost, the write operation in progress at power-down is completed using the energy stored in the capacitor, the file being written is effectively protected, and the consistency and integrity of the written data are ensured.
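The power-down behavior (reject new writes, finish the write already in progress) can be modeled with a small state machine. The class `RwfsWriter` and its methods are hypothetical, and the capacitor is represented only by the fact that the in-flight write is allowed to complete; no real hardware interface is implied.

```python
from typing import List, Optional


class RwfsWriter:
    """Toy model of write handling for the rwfs partition under power loss."""

    def __init__(self) -> None:
        self.frozen = False
        self.in_flight: Optional[bytes] = None
        self.committed: List[bytes] = []

    def begin_write(self, data: bytes) -> bool:
        """Start a write; refused once the partition is frozen."""
        if self.frozen:
            return False
        self.in_flight = data
        return True

    def finish_write(self) -> None:
        """Commit the write currently in progress, if any."""
        if self.in_flight is not None:
            self.committed.append(self.in_flight)
            self.in_flight = None

    def on_power_signal(self, power_good: bool) -> None:
        """React to the power signal (reported by the CPLD in the description)."""
        if power_good:
            return
        self.frozen = True    # no new writes are accepted
        self.finish_write()   # in hardware, this runs on the capacitor's stored energy


writer = RwfsWriter()
writer.begin_write(b"config-update")
writer.on_power_signal(power_good=False)
accepted_after_loss = writer.begin_write(b"late-write")
```

In this run the in-progress write is committed, and the write attempted after the power loss is refused.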
According to the application, when the primary image file is damaged, the BMC firmware is restarted based on the standby image file; if that start is unsuccessful, the BMC firmware is restarted based on the latest image file of the remote server; and if that start is also unsuccessful, the BMC firmware is restarted based on the local minimized image file. This solves the problem that a dual-image backup mechanism can still leave the system paralyzed. A multi-level image file recovery mechanism is built from multiple pre-deployed image files, and when the BMC firmware starts abnormally, the system is recovered step by step according to this mechanism, achieving the technical effects of ensuring minimum availability of the system and improving its reliability and security.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment.
The embodiment of the application also provides a starting device of the BMC firmware, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
The embodiment provides a starting device of a BMC firmware, as shown in fig. 7, including:
And the starting module 701 of the BMC firmware is configured to obtain a data type and an operation frequency of the target data after determining the target data, where the data type includes write data and read data.
The first-stage boot recovery module 701 is configured to, when the BMC firmware is booted based on the primary image file, restart the BMC firmware based on the standby image file if the primary image file is damaged, and determine whether restarting the BMC firmware based on the standby image file is successful.
The second-stage boot recovery module 702 is configured to restart the BMC firmware based on the latest image file of a remote server if restarting the BMC firmware based on the standby image file is unsuccessful, and determine whether restarting the BMC firmware based on the latest image file is successful.
The third-stage boot recovery module 703 is configured to restart the BMC firmware based on the local minimized image file if starting the BMC firmware based on the latest image file is unsuccessful.
In some alternative embodiments, the first stage boot recovery module 701 includes:
A condition detection unit, configured to load the boot loader partition based on the primary image file, perform anomaly detection on the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition, and determine the health condition of each partition, where the health condition is a normal state, a warning state, or a serious abnormal state.
A condition judging unit, configured to judge whether the serious abnormal state exists among the health conditions of the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition.
A damage detection unit, configured to determine that the primary image file is damaged if the health condition of at least one partition is the serious abnormal state.
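The damage decision made by these three units reduces to a simple rule: the primary image file is treated as damaged as soon as any one partition is in the serious abnormal state. A minimal sketch follows, with the enum `Health` and the short partition names chosen for illustration.

```python
from enum import Enum
from typing import Dict


class Health(Enum):
    NORMAL = "normal"
    WARNING = "warning"
    SERIOUS = "serious_abnormal"


# Illustrative short names for the four partitions named in the description.
PARTITIONS = ("uboot", "kernel", "rofs", "rwfs")


def primary_image_damaged(health: Dict[str, Health]) -> bool:
    """True if at least one partition is in the serious abnormal state."""
    return any(health[p] is Health.SERIOUS for p in PARTITIONS)


healthy = {p: Health.NORMAL for p in PARTITIONS}
with_warning = dict(healthy, rwfs=Health.WARNING)
with_serious = dict(healthy, kernel=Health.SERIOUS)
```

Note that a warning state alone does not mark the image as damaged under this rule; only the serious abnormal state does.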
In some alternative embodiments, the condition detection unit includes:
An abnormal region detection subunit, configured to, after the boot loader partition scans all preset partitions, perform data integrity verification on the scan result to judge whether an abnormal region exists in the boot loader partition, the kernel partition, the read-only file system partition and/or the read-write file system partition.
An abnormal region comparison subunit, configured to generate abnormal region metadata information if an abnormal region exists, and compare the abnormal region metadata information with a preset bad block table.
An abnormal data updating subunit, configured to update the abnormal region metadata information to a metadata area if the preset bad block table does not contain the abnormal region metadata information, and otherwise not to update.
A health condition determining subunit, configured to determine the health conditions of the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition according to the metadata information in the metadata area.
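The comparison against the preset bad block table can be sketched as a filter: only abnormal regions that the table does not already contain are written to the metadata area. Representing a region as a `(partition, block)` tuple and the metadata area as a list are assumptions made for this example.

```python
from typing import List, Set, Tuple

Region = Tuple[str, int]  # (partition name, block index); an assumed record format


def record_anomalies(detected: List[Region],
                     bad_block_table: Set[Region],
                     metadata_area: List[Region]) -> List[Region]:
    """Append to the metadata area only regions absent from the bad block table."""
    new_regions = [r for r in detected if r not in bad_block_table]
    metadata_area.extend(new_regions)
    return new_regions


bad_block_table = {("rwfs", 12)}           # already known bad block
metadata_area: List[Region] = []
detected = [("rwfs", 12), ("kernel", 3)]   # result of the integrity scan

added = record_anomalies(detected, bad_block_table, metadata_area)
```

Here the known bad block in rwfs is skipped and only the newly found kernel region is recorded.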
In some alternative embodiments, the first stage boot recovery module 701 further comprises:
A standby image file acquisition unit, configured to acquire the standby image file.
A standby image file verification unit, configured to verify the integrity and the validity of the standby image file, and to start the BMC firmware based on the standby image file if the verification is passed.
A firmware start checking unit, configured to judge whether the BMC firmware is started successfully.
In some alternative embodiments, the second stage boot recovery module 702 includes:
A bidirectional authentication unit, configured to acquire the configuration information of the remote server and perform bidirectional authentication with the remote server based on the configuration information.
A latest image file acquisition unit, configured to generate a system recovery request if the authentication is passed, and send the system recovery request to the remote server to acquire the latest image file of the remote server.
A latest image file verification unit, configured to verify the integrity and the validity of the latest image file, and to start the BMC firmware based on the latest image file if the verification is passed.
A firmware start checking unit, configured to judge whether the BMC firmware is started successfully.
In some alternative embodiments, the apparatus further comprises:
A background repair module, configured to create a background repair thread and repair the primary image file based on the background repair thread.
A data synchronization module, configured to, when a data update exists, synchronously update the data to the area of the primary image file that corresponds to the data.
A data updating module, configured to mark the abnormal area as a permanently disabled area and update the data to a spare block pool if the area of the primary image file that corresponds to the data is an unrecoverable area.
In some optional embodiments, the device further comprises a power-down protection module, configured to obtain a power signal, judge from the power signal whether power has been lost, and if power has been lost, freeze write operations to the read-write file system partition, trigger the added energy-storage capacitor to supply energy, and complete the write operation in progress.
The description of the features in the embodiment corresponding to the starting device of the BMC firmware may refer to the related description of the embodiment corresponding to the starting method of the BMC firmware, which is not described herein in detail.
An embodiment of the present application further provides an electronic device, as shown in fig. 8, including a processor 10 and a memory 20, where the memory 20 stores a computer program, and the processor 10 is configured to run the computer program to execute the steps in any of the foregoing embodiments of the method for starting BMC firmware.
Embodiments of the present application also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform, when executed, the steps of any of the above-described embodiments of the method for starting BMC firmware.
In an exemplary embodiment, the computer readable storage medium may include, but is not limited to, various media capable of storing a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present application also provide a computer program product, where the computer program product includes a computer program, and when the computer program is executed by a processor, implements the steps in any of the foregoing embodiments of the method for starting BMC firmware.
Embodiments of the present application also provide another computer program product, including a non-volatile computer readable storage medium, where the non-volatile computer readable storage medium stores a computer program, where the computer program when executed by a processor implements the steps in any of the foregoing embodiments of a method for starting BMC firmware.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The method, the device, the equipment, the medium and the program product for starting the BMC firmware provided by the application are described in detail. The principles and embodiments of the present application have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present application and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the application can be made without departing from the principles of the application and these modifications and adaptations are intended to be within the scope of the application as defined in the following claims.

Claims (10)

1. A method for starting up BMC firmware, the method comprising:
when starting BMC firmware based on an active image file, if the active image file is damaged, restarting the BMC firmware based on a standby image file, and judging whether restarting the BMC firmware based on the standby image file is successful or not;
if restarting the BMC firmware based on the standby image file is unsuccessful, restarting the BMC firmware based on the latest image file of the remote server, and judging whether restarting the BMC firmware based on the latest image file is successful or not;
and restarting the BMC firmware based on the local minimized image file if starting the BMC firmware based on the latest image file is unsuccessful.
2. The method of claim 1, wherein the BMC firmware comprises a boot loader partition, a kernel partition, a read-only file system partition, and a read-write file system partition;
The method for starting the BMC firmware based on the master image file comprises the following steps:
Loading the boot loader partition based on the main image file, and performing anomaly detection on the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition to determine the health condition corresponding to each partition, wherein the health condition is a normal state, a warning state or a serious anomaly state;
judging whether the serious abnormal state exists in the health conditions corresponding to the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition;
And if the health condition of at least one partition is the serious abnormal state, judging that the main image file is damaged.
3. The method of claim 2, wherein the performing anomaly detection on the boot loader partition, the kernel partition, the read-only file system partition, and the read-write file system partition to determine health conditions corresponding to the partitions comprises:
After the boot loader partition scans all preset partitions, data integrity verification is carried out on the scanning result, and whether the boot loader partition, the kernel partition, the read-only file system partition and/or the read-write file system partition have abnormal areas or not is judged;
If an abnormal area exists, generating abnormal area metadata information, and comparing the abnormal area metadata information with a preset bad block table;
if the preset bad block table does not contain the abnormal region metadata information, updating the abnormal region metadata information to a metadata area, otherwise, not updating;
And determining the health conditions of the boot loader partition, the kernel partition, the read-only file system partition and the read-write file system partition according to the metadata information in the metadata area.
4. The method of claim 3, wherein restarting the BMC firmware based on the standby image file and determining whether restarting the BMC firmware based on the standby image file was successful comprises:
acquiring the standby mirror image file;
verifying the integrity and the validity of the standby image file, and if the verification is passed, starting the BMC firmware based on the standby image file;
and judging whether the BMC firmware is started successfully or not.
5. The method of claim 1, wherein restarting the BMC firmware based on the latest image file of the remote server and determining whether restarting the BMC firmware based on the latest image file was successful comprises:
Acquiring configuration information of the remote server, and performing bidirectional authentication with the remote server based on the configuration information;
if the authentication is passed, a system recovery request is generated, and the system recovery request is sent to the remote server to acquire the latest image file of the remote server;
And starting the BMC firmware based on the latest image file, and judging whether the BMC firmware is started successfully or not.
6. The method of any of claims 1 to 5, further comprising, after the BMC firmware is booted:
creating a background repair thread, and repairing the main image file based on the background repair thread;
When the data is updated, synchronously updating the data to a corresponding area of the data in the main mirror image file;
and if the corresponding area of the data in the main mirror image file is an unrecoverable area, marking the abnormal area as a permanently disabled area, and updating the data to a spare block pool.
7. The method of claim 2, further comprising, after the BMC firmware is booted up:
Acquiring a power supply signal, and judging whether power is lost or not according to the power supply signal;
and if the power is lost, freezing the writing operation of the readable and writable file system partition, triggering the capacity-increasing capacitor to supply energy, and executing the writing operation.
8. A protection device for BMC firmware, the device comprising:
The first-stage starting recovery module is used for restarting the BMC firmware based on the standby image file if the main image file is damaged when the BMC firmware is started based on the main image file, and judging whether restarting the BMC firmware based on the standby image file is successful or not;
the second-stage starting recovery module is used for restarting the BMC firmware based on the latest image file of the remote server if the new starting of the BMC firmware based on the standby image file is unsuccessful, and judging whether the restarting of the BMC firmware based on the latest image file is successful or not;
And the third-stage starting recovery module is used for restarting the BMC firmware based on the local minimum image file if the starting of the BMC firmware based on the latest image file is unsuccessful.
9. A computer device, comprising:
a memory and a processor, the memory and the processor are in communication connection, the memory stores computer instructions, and the processor executes the computer instructions, thereby executing the method for starting the BMC firmware according to any of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of boot-up of BMC firmware of any of claims 1 to 7.
CN202511232162.2A 2025-08-29 2025-08-29 Starting method, device, equipment, medium and program product of BMC firmware Pending CN121166439A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511232162.2A CN121166439A (en) 2025-08-29 2025-08-29 Starting method, device, equipment, medium and program product of BMC firmware

Publications (1)

Publication Number Publication Date
CN121166439A (en) 2025-12-19

Family

ID=98042750

Country Status (1)

Country Link
CN (1) CN121166439A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination