
CN112100023B - Board card information acquisition method and device for heterogeneous acceleration platform - Google Patents

Board card information acquisition method and device for heterogeneous acceleration platform

Info

Publication number
CN112100023B
Authority
CN
China
Prior art keywords
information data
information
fpga
ram
board
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010818566.0A
Other languages
Chinese (zh)
Other versions
CN112100023A (en)
Inventor
牟奇
王洪良
刘伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010818566.0A
Publication of CN112100023A
Application granted
Publication of CN112100023B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3031 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a motherboard or an expansion card
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and a device for acquiring board card information of a heterogeneous acceleration platform. The method comprises: acquiring information data of the board card and sending the information data to the FPGA once every threshold time interval; in response to the FPGA receiving the information data, storing the information data in a corresponding RAM; periodically acquiring the information data in the RAM through a PCIe BAR space, and verifying and parsing the information data; and displaying the parsed information data through a display device. With this scheme, board card status information can be monitored in real time without depending on the server BMC, so that it can be obtained in real time even in a system without a matching BMC, which simplifies operation and maintenance work.

Description

Method and device for obtaining board information of a heterogeneous acceleration platform

Technical Field

The present invention relates to the field of computers, and more specifically to a method and a device for obtaining board information of a heterogeneous acceleration platform.

Background

With the rise of big data, cloud computing, and artificial intelligence, the demand for faster data computation keeps growing. With this in mind, we propose a CPU+FPGA+MCU (microcontroller unit) heterogeneous acceleration platform suitable for servers. The basic principle is to offload part of the data to the FPGA, process it quickly with a dedicated algorithm, and return the result to the CPU (central processing unit), which relieves the load on the CPU, improves server efficiency, and shortens running time. As shown in Figure 1, the left part of the figure is a traditional server without FPGA acceleration, which takes a long time to process data B. The right part shows the server after an FPGA (field-programmable gate array) acceleration platform is added: data B is processed in the FPGA with a dedicated algorithm, noticeably faster than on the CPU, so the processing time of B is greatly reduced. FPGA-led heterogeneous acceleration platforms therefore have broad application prospects in big data, cloud computing, and artificial intelligence.

Figure 2 shows the overall framework of the heterogeneous acceleration platform, which mainly consists of the server and the FPGA heterogeneous accelerator card connected to it through PCIe. The accelerator card includes the main chip (FPGA), the monitoring and management chip (MCU), DDR, flash, EEPROM, monitoring sensors, and other peripheral circuits. The EEPROM stores the basic information of the board, and the peripheral sensors monitor the board status, including temperature, power consumption, and similar information.

To monitor board status in real time, the common practice today is to treat the FPGA accelerator card as part of the server BMC's monitoring and management, so that the server BMC can obtain board status information through the ipmitool utility. Although this achieves real-time monitoring, its fatal flaw is that it requires a matching BMC system. If the BMC in the user's server is not the one supplied by the board manufacturer, board information cannot be obtained this way.

Summary of the Invention

In view of this, the purpose of the embodiments of the present invention is to propose a method and a device for obtaining board information of a heterogeneous acceleration platform. With the method, board status information can be monitored in real time without depending on the server BMC, so that it can be obtained in real time even in a system without a matching BMC, which simplifies operation and maintenance work.

Based on the above purpose, one aspect of the embodiments of the present invention provides a method for obtaining board information of a heterogeneous acceleration platform, including the following steps:

obtaining the information data of the board, and sending the information data to the FPGA once every threshold time interval;

in response to the FPGA receiving the information data, storing the information data in the corresponding RAM;

periodically obtaining the information data in the RAM via the PCIe BAR space, and verifying and parsing the information data;

displaying the parsed information data through a display device.

According to an embodiment of the present invention, obtaining the information data of the board and sending it to the FPGA once every threshold time interval further includes:

using the MCU to read the information data of the board from the storage device, and storing the information data in a cache;

having the MCU send the cached information data to the FPGA via the SMBus on the PCIe interface once every threshold time interval.

According to an embodiment of the present invention, the method further includes:

setting the update status flag before each periodic sending of the information data to the FPGA;

resetting the update status flag after the information data has been sent to the FPGA.

According to an embodiment of the present invention, the method further includes:

checking the state of the update status flag before obtaining the information data in the RAM via the PCIe BAR space;

in response to the update status flag being set, stopping the acquisition and prompting the user that the board information is being updated;

in response to the update status flag being reset, continuing to obtain the information data in the RAM.

According to an embodiment of the present invention, the information data includes the FRU (field-replaceable unit) information of the board and the status information of the board.
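The two kinds of information data can be pictured as a small record type. The following is only an illustrative sketch: the detailed description names the serial number and product number as FRU fields and temperature, power consumption, and current as status fields, but the type names, field names, and units below are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FruInfo:
    """Static identity of the board (read once from the EEPROM)."""
    serial_number: str
    product_number: str


@dataclass(frozen=True)
class BoardStatus:
    """Periodically polled sensor readings (units are assumed)."""
    temperature_c: float
    power_w: float
    current_a: float


@dataclass(frozen=True)
class BoardInfo:
    """The complete information data sent from the MCU to the FPGA."""
    fru: FruInfo
    status: BoardStatus


def describe(info: BoardInfo) -> str:
    """Return a one-line human-readable summary of the board."""
    return (f"{info.fru.product_number} (S/N {info.fru.serial_number}): "
            f"{info.status.temperature_c:.1f} C, "
            f"{info.status.power_w:.1f} W, {info.status.current_a:.2f} A")
```

Keeping the FRU part separate from the status part mirrors the split between data that is read once and data that is refreshed on every polling cycle.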

According to an embodiment of the present invention, storing the information data in the corresponding RAM in response to the FPGA receiving the information data includes:

in response to the FPGA receiving the information data, storing the information data in the corresponding on-chip RAM;

mounting the RAM onto the PCIe BAR space through the FPGA, so that the server side can access the RAM by accessing the PCIe BAR space.
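Once the RAM is mapped into a BAR, a host program can read it like ordinary memory. The sketch below is an assumption about how this might look on a Linux host, where a BAR is typically exposed as a file such as `/sys/bus/pci/devices/<BDF>/resource2`; the patent does not name an operating system or access mechanism, and the function name, offset, and length are hypothetical.

```python
import mmap


def read_bar_window(resource_path: str, offset: int, length: int) -> bytes:
    """Map a BAR resource file read-only and copy out `length` bytes.

    `resource_path` is assumed to be a file whose size equals the BAR size
    (for example a sysfs PCI resource file); any regular file works too,
    which makes the function easy to try without hardware.
    """
    with open(resource_path, "rb") as f:
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as bar:
            if offset < 0 or length < 0 or offset + length > len(bar):
                raise ValueError("requested window lies outside the BAR")
            return bytes(bar[offset:offset + length])
```

On real hardware the call would usually need root privileges, and the offset of the information data inside the BAR depends on the FPGA design.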

Another aspect of the embodiments of the present invention provides a device for obtaining board information of a heterogeneous acceleration platform, the device including:

an acquisition module configured to obtain the information data of the board and send the information data to the FPGA once every threshold time interval;

a storage module configured to store the information data in the corresponding RAM in response to the FPGA receiving the information data;

a parsing module configured to periodically obtain the information data in the RAM via the PCIe BAR space, and to verify and parse the information data;

a display module configured to display the parsed information data through a display device.

According to an embodiment of the present invention, the acquisition module is further configured to:

use the MCU to read the information data of the board from the storage device, and store the information data in a cache;

have the MCU send the cached information data to the FPGA via the SMBus on the PCIe interface once every threshold time interval.

Another aspect of the embodiments of the present invention provides a computer device, including:

at least one processor; and

a memory storing a computer program executable on the processor, wherein the processor performs any one of the methods described above when executing the program.

Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs any one of the methods described above.

The present invention has the following beneficial technical effects. The method for obtaining board information provided by the embodiments obtains the information data of the board and sends it to the FPGA once every threshold time interval; stores the information data in the corresponding RAM in response to the FPGA receiving it; periodically obtains the information data in the RAM via the PCIe BAR space and verifies and parses it; and displays the parsed information data through a display device. With this technical solution, board status information can be monitored in real time without depending on the server BMC, so that it can be obtained in real time even in a system without a matching BMC, which simplifies operation and maintenance work.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can derive other embodiments from them without creative effort.

Fig. 1 is a prior-art heterogeneous platform acceleration model;

Fig. 2 is a prior-art heterogeneous acceleration platform;

Fig. 3 is a schematic flowchart of a method for obtaining board information according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of a device for obtaining board information according to an embodiment of the present invention.

Detailed Description

To make the purpose, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to specific embodiments and the accompanying drawings.

Based on the above purpose, a first aspect of the embodiments of the present invention proposes an embodiment of a method for obtaining board information of a heterogeneous acceleration platform. Fig. 3 shows a schematic flowchart of the method.

As shown in Fig. 3, the method may include the following steps:

S1: Obtain the information data of the board, and send the information data to the FPGA once every threshold time interval. After power-on the hardware is initialized, and the hardware information of the board is stored in the EEPROM. The MCU in the CPU+FPGA+MCU heterogeneous acceleration platform reads the board information stored in the EEPROM; this includes the FRU information, which in turn includes the serial number and the product number. The MCU also periodically polls the board status information, which includes temperature, power consumption, current, and similar readings. After obtaining this information, the MCU first stores it in a cache, and then periodically sends the board information data (FRU information together with board status information) to the FPGA over the SMBus on the PCIe interface;

S2: In response to the FPGA receiving the information data, store the information data in the corresponding RAM. After receiving the information data, the FPGA stores it in the corresponding on-chip RAM according to its address, and mounts that RAM onto the PCIe BAR space so that the server Host side can access it by accessing the PCIe BAR space;

S3: Periodically obtain the information data in the RAM via the PCIe BAR space, and verify and parse the information data. The Host side interacts with the FPGA by reading and writing the PCIe BAR space and obtains the data at the corresponding addresses in the RAM through it. After obtaining the data, the Host side first verifies that the received information is correct, and only after the verification passes does it parse the data according to the protocol;

S4: Display the parsed information data through a display device. The parsed board information data is shown on a human-machine interface (such as a monitor).
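The Host-side part of steps S3 and S4 (verify, parse, display) can be sketched as follows. The patent does not specify the check algorithm or the data protocol, so everything about the frame is a hypothetical assumption made for illustration: a one-byte length, a fixed 22-byte payload, and a one-byte additive checksum.

```python
import struct

# Hypothetical payload layout (not from the patent): 8-byte serial number,
# 8-byte product number, temperature in 0.1 C (signed), power in 0.1 W,
# current in mA, all little-endian.
PAYLOAD = struct.Struct("<8s8shHH")


def checksum(data: bytes) -> int:
    """Assumed check algorithm: sum of all payload bytes modulo 256."""
    return sum(data) & 0xFF


def build_frame(serial: str, product: str, temp_c: float,
                power_w: float, current_a: float) -> bytes:
    """Build a frame as the sending side might: length, payload, checksum."""
    payload = PAYLOAD.pack(serial.encode("ascii"), product.encode("ascii"),
                           round(temp_c * 10), round(power_w * 10),
                           round(current_a * 1000))
    return bytes([len(payload)]) + payload + bytes([checksum(payload)])


def parse_frame(frame: bytes) -> dict:
    """Verify the frame first, then parse it (step S3)."""
    if len(frame) < 2 or len(frame) != frame[0] + 2:
        raise ValueError("frame length does not match its length byte")
    payload, received = frame[1:-1], frame[-1]
    if checksum(payload) != received:
        raise ValueError("checksum mismatch")
    if len(payload) != PAYLOAD.size:
        raise ValueError("unexpected payload size")
    serial, product, temp, power, current = PAYLOAD.unpack(payload)
    return {
        "serial_number": serial.rstrip(b"\0").decode("ascii"),
        "product_number": product.rstrip(b"\0").decode("ascii"),
        "temperature_c": temp / 10,
        "power_w": power / 10,
        "current_a": current / 1000,
    }


def format_report(info: dict) -> str:
    """Render the parsed data for display (step S4)."""
    return "\n".join(f"{key:>15}: {value}" for key, value in info.items())
```

A host program would call `parse_frame` on the bytes read from the BAR and print `format_report(...)` only when no `ValueError` is raised, which matches the order in the description: verification first, parsing second.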

The technical solution of the present invention is mainly applied to the CPU+FPGA+MCU heterogeneous acceleration platform and removes the dependence on the BMC: a Host-side program on the server is all that is needed to obtain board information. The Host-side program is responsible only for obtaining board information and does not affect other server functions, so the user can run it at any time to obtain board status information in real time.

With the technical solution of the present invention, board status information can be monitored in real time without depending on the server BMC, so that it can be obtained in real time even in a system without a matching BMC, which simplifies operation and maintenance work.

In a preferred embodiment of the present invention, obtaining the information data of the board and sending it to the FPGA once every threshold time interval further includes:

using the MCU to read the information data of the board from the storage device, and storing the information data in a cache;

having the MCU send the cached information data to the FPGA via the SMBus on the PCIe interface once every threshold time interval. The MCU obtains the FRU information of the board at system power-on and obtains the board status information periodically. The two kinds of information are kept in two separate caches, which simplifies the MCU-side program and data copying and improves the response speed when the data is written and read. The board status information in the cache is updated periodically, and the MCU periodically sends the board information data to the FPGA over the SMBus on the PCIe interface.
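The two-cache arrangement can be modelled in a few lines. Real MCU firmware would normally be written in C and driven by a hardware timer; the Python simulation below only illustrates the data flow, and the class name, the callback parameters, and the `run_cycles` helper are all hypothetical.

```python
from typing import Callable, Dict, List


class McuBuffers:
    """Simulated MCU-side caches: FRU data read once, status data refreshed."""

    def __init__(self, read_fru: Callable[[], Dict[str, str]],
                 read_status: Callable[[], Dict[str, float]]) -> None:
        self._read_status = read_status
        # The FRU cache is filled a single time, at power-on.
        self.fru_buffer: Dict[str, str] = dict(read_fru())
        self.status_buffer: Dict[str, float] = {}

    def poll_status(self) -> None:
        """Refresh only the status cache; the FRU cache is left untouched."""
        self.status_buffer = dict(self._read_status())

    def snapshot(self) -> dict:
        """Combine both caches into one message for the FPGA."""
        return {"fru": dict(self.fru_buffer),
                "status": dict(self.status_buffer)}


def run_cycles(buffers: McuBuffers, send: Callable[[dict], None],
               cycles: int) -> None:
    """Stand-in for the periodic timer: poll, then send, once per cycle."""
    for _ in range(cycles):
        buffers.poll_status()
        send(buffers.snapshot())


if __name__ == "__main__":
    sent: List[dict] = []
    bufs = McuBuffers(lambda: {"serial_number": "SN001"},
                      lambda: {"temperature_c": 45.5})
    run_cycles(bufs, sent.append, cycles=3)
    print(len(sent), "messages sent")
```

Because the FRU cache never changes after power-on, each cycle touches only the status cache, which is one plausible reading of why the description keeps the two apart.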

In a preferred embodiment of the present invention, the method further includes:

setting the update status flag before each periodic sending of the information data to the FPGA;

resetting the update status flag after the information data has been sent to the FPGA. In principle the flag can be placed at any unused address in the BAR space; to ease later functional expansion, this scheme places it in the last byte of the BAR2 space. The update status flag is introduced to avoid data conflicts. When the flag is set, the information data in the RAM is being updated: the server side must not read the information data stored in the RAM, and the user is prompted accordingly. When the flag is reset, the server side may read the information data stored in the RAM.
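The writer side of this handshake can be sketched with a `bytearray` standing in for the BAR2 space. The flag position (last byte of BAR2) follows the description; the BAR size, the flag values 1 and 0, and the function name are assumptions.

```python
BAR2_SIZE = 256               # hypothetical size of the BAR2 space
FLAG_OFFSET = BAR2_SIZE - 1   # the flag occupies the last byte of BAR2
UPDATING, IDLE = 1, 0         # assumed encodings of "set" and "reset"


def update_ram(bar2: bytearray, data: bytes, base: int = 0) -> None:
    """Set the flag, overwrite the data region, then reset the flag."""
    if base < 0 or base + len(data) > FLAG_OFFSET:
        raise ValueError("data would overlap the flag byte or leave BAR2")
    bar2[FLAG_OFFSET] = UPDATING          # set before the transfer starts
    try:
        bar2[base:base + len(data)] = data
    finally:
        bar2[FLAG_OFFSET] = IDLE          # reset once the transfer is over
```

Resetting the flag in a `finally` clause is a choice of this sketch, so that a failed transfer cannot leave readers locked out; the patent only states that the flag is reset after sending.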

In a preferred embodiment of the present invention, the method further includes:

checking the state of the update status flag before obtaining the information data in the RAM via the PCIe BAR space;

in response to the update status flag being set, stopping the acquisition and prompting the user that the board information is being updated;

in response to the update status flag being reset, continuing to obtain the information data in the RAM.
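On the reading side, the check described above amounts to looking at the flag byte before touching the data. In the sketch below the BAR contents are passed in as a bytes-like object; the function name, the return convention, and the assumption that a non-zero flag byte means "set" are all illustrative.

```python
from typing import Optional, Tuple


def try_read_info(bar2, length: int) -> Tuple[Optional[bytes], str]:
    """Return (data, message); data is None while an update is in progress.

    The flag is assumed to live in the last byte of `bar2`, with any
    non-zero value meaning "set".
    """
    if length < 0 or length >= len(bar2):
        raise ValueError("length must leave room for the flag byte")
    if bar2[len(bar2) - 1] != 0:
        # Flag set: stop and tell the user instead of reading stale data.
        return None, "Board information is being updated, please retry."
    # Flag reset: safe to fetch the information data.
    return bytes(bar2[:length]), "ok"
```

An update could still begin between the flag check and the read; checking the flag a second time after reading is one possible hardening, although the patent does not describe it.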

In a preferred embodiment of the present invention, storing the information data in the corresponding RAM in response to the FPGA receiving the information data includes:

in response to the FPGA receiving the information data, storing the information data in the corresponding on-chip RAM. After receiving the information data, the FPGA first verifies it; if the verification passes, the data overwrites the previous contents of the RAM, and if the verification fails, the data that failed is discarded;

mounting the RAM onto the PCIe BAR space through the FPGA, so that the server side can access the RAM by accessing the PCIe BAR space.
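The verify-then-overwrite rule on the FPGA side can be expressed as a small behavioural model. Actual FPGA logic would be written in a hardware description language; this Python model only captures the decision, and the additive checksum is an assumed stand-in because the patent does not name the check algorithm.

```python
def store_if_valid(ram: bytearray, base: int, payload: bytes,
                   received_checksum: int) -> bool:
    """Overwrite RAM with `payload` only if its checksum matches.

    Returns True when the data was stored and False when it was discarded.
    The check (sum of payload bytes modulo 256) is an assumption.
    """
    if (sum(payload) & 0xFF) != received_checksum:
        return False  # verification failed: drop the data, keep old contents
    if base < 0 or base + len(payload) > len(ram):
        raise ValueError("payload does not fit in RAM at this address")
    ram[base:base + len(payload)] = payload
    return True
```

Because a failed frame is dropped without touching the RAM, a reader always sees the most recent data that passed verification.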

With the technical solution of the present invention, board status information can be monitored in real time without depending on the server BMC, so that it can be obtained in real time even in a system without a matching BMC, which simplifies operation and maintenance work.

It should be noted that those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The computer program embodiments can achieve the same or similar effects as any corresponding method embodiment described above.

In addition, the method disclosed in the embodiments of the present invention can also be implemented as a computer program executed by a CPU, and the computer program can be stored in a computer-readable storage medium. When the computer program is executed by the CPU, it performs the functions defined in the method disclosed in the embodiments of the present invention.

Based on the above purpose, a second aspect of the embodiments of the present invention proposes a device for obtaining board information of a heterogeneous acceleration platform. As shown in Fig. 4, the device 200 includes:

an acquisition module 201 configured to obtain the information data of the board and send the information data to the FPGA once every threshold time interval;

a storage module 202 configured to store the information data in the corresponding RAM in response to the FPGA receiving the information data;

a parsing module 203 configured to periodically obtain the information data in the RAM via the PCIe BAR space, and to verify and parse the information data;

a display module 204 configured to display the parsed information data through a display device.

In a preferred embodiment of the present invention, the acquisition module is further configured to:

use the MCU to read the information data of the board from the storage device, and store the information data in a cache;

have the MCU send the cached information data to the FPGA via the SMBus on the PCIe interface once every threshold time interval.

In a preferred embodiment of the present invention, the device further includes a flag module configured to:

set the update status flag before each periodic sending of the information data to the FPGA;

reset the update status flag after the information data has been sent to the FPGA.

In a preferred embodiment of the present invention, the device further includes a judging module configured to:

check the state of the update status flag before obtaining the information data in the RAM via the PCIe BAR space;

in response to the update status flag being set, stop the acquisition and prompt the user that the board information is being updated;

in response to the update status flag being reset, continue to obtain the information data in the RAM.

Based on the above purpose, a third aspect of the embodiments of the present invention proposes a computer device, including:

at least one processor; and

a memory storing a computer program executable on the processor, wherein the processor performs any one of the methods described above when executing the program.

Based on the above purpose, a fourth aspect of the embodiments of the present invention proposes a computer-readable storage medium storing a computer program which, when executed by a processor, performs any one of the methods described above.

It should be pointed out in particular that the above system embodiments use the above method embodiments to describe the working process of each module, and those skilled in the art can readily apply these modules to other embodiments of the method.

Those skilled in the art will also appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with this disclosure may be implemented as electronic hardware, computer software, or a combination of both. To illustrate this interchangeability of hardware and software clearly, the various illustrative components, blocks, modules, circuits, and steps have been described generally in terms of their functionality. Whether such functionality is implemented as software or as hardware depends on the particular application and on the design constraints imposed on the overall system. Those skilled in the art may implement the described functionality in various ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope disclosed in the embodiments of the present invention.

The above embodiments, and in particular any "preferred" embodiments, are possible examples of implementations and are presented only for a clear understanding of the principles of the invention. Many variations and modifications can be made to the above embodiments without departing from the spirit and principles of the technology described herein. All such modifications are intended to be included within the scope of this disclosure and protected by the appended claims.

Claims (7)

1. A method for acquiring board card information of a heterogeneous acceleration platform, characterized by comprising the following steps:
reading out information data of a board card stored in a storage device by using an MCU (microcontroller unit), and storing the information data into a cache;
sending, by the MCU, the information data in the cache to an FPGA (field-programmable gate array) through an SMBus on a PCIe (Peripheral Component Interconnect Express) interface at each threshold time interval;
in response to the FPGA receiving the information data, storing the information data into a corresponding on-chip RAM, and mounting the RAM onto a PCIe BAR space through the FPGA, so that a server side accesses the RAM by accessing the PCIe BAR space;
periodically acquiring the information data in the RAM through the PCIe BAR space, and checking and analyzing the information data; and
displaying the analyzed information data through a display device.
2. The method of claim 1, further comprising:
setting an update status flag before sending the information data to the FPGA at each threshold time interval; and
resetting the update status flag after sending the information data to the FPGA.
3. The method of claim 2, further comprising:
determining the state of the update status flag before acquiring the information data in the RAM through the PCIe BAR space;
in response to the update status flag being in a set state, stopping the acquisition and prompting the user that the board card information is being updated; and
in response to the update status flag being in a reset state, continuing to acquire the information data in the RAM.
4. The method of claim 1, wherein the information data includes FRU information of the board card and status information of the board card.
5. A device for acquiring board card information of a heterogeneous acceleration platform, characterized in that the device comprises:
an acquisition module configured to read out information data of a board card stored in a storage device by using an MCU and store the information data into a cache;
the acquisition module being further configured to send, by the MCU, the information data in the cache to an FPGA via an SMBus on a PCIe interface at each threshold time interval;
a storage module configured to, in response to the FPGA receiving the information data, store the information data into a corresponding on-chip RAM, and mount the RAM onto a PCIe BAR space through the FPGA, so that a server side can access the RAM by accessing the PCIe BAR space;
an analysis module configured to periodically acquire the information data in the RAM via the PCIe BAR space and check and analyze the information data; and
a display module configured to display the analyzed information data through a display device.
6. A computer device, comprising:
at least one processor; and
a memory storing a computer program executable on the processor, wherein the processor, when executing the program, performs the method of any one of claims 1 to 4.
7. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the method of any one of claims 1 to 4.
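
The handshake described in claims 1 to 3 can be illustrated with a minimal Python sketch. It models the FPGA on-chip RAM as a plain `bytearray` rather than a real PCIe BAR mapping. The byte layout, the 8-bit checksum, the `key=value` payload encoding, and all function names below are assumptions made for illustration only; the claims do not specify any of them.

```python
import struct

# Hypothetical RAM layout (an assumption, not defined by the patent):
#   byte 0      update status flag (1 = update in progress, 0 = data stable)
#   bytes 1-2   payload length, little-endian
#   bytes 3..   payload, followed by a 1-byte checksum
FLAG_OFFSET = 0
HEADER = struct.Struct("<BH")
RAM_SIZE = 256


def checksum(payload: bytes) -> int:
    """8-bit checksum chosen so that payload bytes plus checksum sum to 0 mod 256."""
    return (-sum(payload)) & 0xFF


def mcu_update(ram: bytearray, payload: bytes) -> None:
    """Writer side (claim 2): set the flag, write the data, then reset the flag."""
    if HEADER.size + len(payload) + 1 > len(ram):
        raise ValueError("payload does not fit in RAM")
    ram[FLAG_OFFSET] = 1                          # set: update in progress
    struct.pack_into("<H", ram, 1, len(payload))  # length field
    start = HEADER.size
    ram[start:start + len(payload)] = payload
    ram[start + len(payload)] = checksum(payload)
    ram[FLAG_OFFSET] = 0                          # reset: data is stable


def host_read(ram: bytearray):
    """Reader side (claim 3): return None while the flag is set, else checked data."""
    flag, length = HEADER.unpack_from(ram, 0)
    if flag:
        return None                               # caller can report "updating"
    start = HEADER.size
    if start + length >= len(ram):
        raise ValueError("length field out of range")
    payload = bytes(ram[start:start + length])
    if (sum(payload) + ram[start + length]) & 0xFF != 0:
        raise ValueError("checksum mismatch")
    return payload


def parse_info(payload: bytes) -> dict:
    """Parse a hypothetical 'key=value;key=value' ASCII encoding into a dict."""
    text = payload.decode("ascii")
    return dict(field.split("=", 1) for field in text.split(";"))


if __name__ == "__main__":
    ram = bytearray(RAM_SIZE)
    mcu_update(ram, b"SN=ABC123;TEMP=45")
    data = host_read(ram)
    print("updating" if data is None else parse_info(data))
```

In a real deployment the `bytearray` would be replaced by a memory mapping of the PCIe BAR region exposed by the FPGA, and `host_read` would be called from a periodic polling loop on the server side.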
CN202010818566.0A 2020-08-14 2020-08-14 Board card information acquisition method and device for heterogeneous acceleration platform Active CN112100023B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010818566.0A CN112100023B (en) 2020-08-14 2020-08-14 Board card information acquisition method and device for heterogeneous acceleration platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010818566.0A CN112100023B (en) 2020-08-14 2020-08-14 Board card information acquisition method and device for heterogeneous acceleration platform

Publications (2)

Publication Number Publication Date
CN112100023A CN112100023A (en) 2020-12-18
CN112100023B true CN112100023B (en) 2023-01-06

Family

ID=73752976

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010818566.0A Active CN112100023B (en) 2020-08-14 2020-08-14 Board card information acquisition method and device for heterogeneous acceleration platform

Country Status (1)

Country Link
CN (1) CN112100023B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115061852B (en) * 2022-08-15 2022-11-18 广东科伺智能科技有限公司 Functional board card, production system of functional board card and use method of servo system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108920334A (en) * 2018-07-25 2018-11-30 郑州云海信息技术有限公司 Monitoring device for an FPGA heterogeneous accelerator card
CN109614293A (en) * 2018-12-13 2019-04-12 广东浪潮大数据研究有限公司 Management system and method for an FPGA heterogeneous accelerator card
CN111124828A (en) * 2019-12-22 2020-05-08 苏州浪潮智能科技有限公司 A data processing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112100023A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
TWI229796B (en) Method and system to implement a system event log for system manageability
CN108376107A (en) A kind of method, apparatus, equipment and the storage medium of server failure detection
CN102722431B (en) process monitoring method and device
CN107656856B (en) A system state display method and device based on CPLD
CN108287775A (en) A kind of method, apparatus, equipment and the storage medium of server failure detection
CN103176876B (en) A kind of computer On-line self-diagnosis method of highly effective and safe and self-checking unit
CN112241160B (en) Vehicle testing method and device, vehicle detection system and test board card
CN110362435B (en) PCIE fault positioning method, device, equipment and medium for Purley platform server
CN103631685A (en) Fault self-inspection system and method
WO2023222109A1 (en) Network wakeup management method and apparatus, electronic device, and storage medium
CN108268361A (en) A kind of method, system, device and the storage medium of BMC monitoring GPU
CN113392090B (en) Data verification method, device, equipment and medium based on database migration
CN103810099B (en) Code tracing method and code tracing system
CN116662123B (en) Method and device for monitoring server component, electronic equipment and storage medium
CN115543746A (en) Graphics processor monitoring method, system and device and electronic equipment
CN112100023B (en) Board card information acquisition method and device for heterogeneous acceleration platform
CN107783844A (en) A kind of computer program operation exception detection method, device and medium
CN114328103A (en) Method, system and related equipment for OpenBMC monitoring and management of discrete sensor
CN115934446A (en) A self-test method, server, device and storage medium
US8738939B2 (en) System and method for testing WOL function of computers
CN116743619B (en) Network service testing method, device, equipment and storage medium
CN107612755A (en) The management method and its device of a kind of cloud resource
CN115599617B (en) Bus detection method, device, server and electronic equipment
CN116431731A (en) Data asynchronous export method, device, equipment and storage medium thereof
CN116684302A (en) Test method and device for vehicle Ethernet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Building 9, No.1, guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Wuzhong District, Suzhou City, Jiangsu Province

Patentee after: Suzhou Yuannao Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Building 9, No.1, guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Wuzhong District, Suzhou City, Jiangsu Province

Patentee before: SUZHOU LANGCHAO INTELLIGENT TECHNOLOGY Co.,Ltd.

Country or region before: China