CN107832255A

CN107832255A - An optimization method for dynamically requesting reconfigurable cores at runtime

Info

Publication number: CN107832255A
Application number: CN201710827202.7A
Authority: CN
Inventors: 胡威; 沈欢; 吕向宇; 郭宏; 蒋旻; 张凯; 刘小明; 刘俊; 王磊; 贺娟娟
Original assignee: Wuhan University of Science and Technology WHUST
Current assignee: Wuhan University of Science and Technology WHUST
Priority date: 2017-09-14
Filing date: 2017-09-14
Publication date: 2018-03-23
Anticipated expiration: 2037-09-14
Also published as: CN107832255B

Abstract

The present invention discloses an optimization method for dynamically requesting reconfigurable cores at runtime, comprising the following steps. S1, configuring a general-purpose reconfigurable core: the general-purpose reconfigurable core is configured on an FPGA. S2, partitioning the reconfigurable task: the reconfigurable task is divided into a reconfigurable subsystem and a software subsystem; the reconfigurable subsystem, denoted R, is composed of m hardware modules, m≥1; the software subsystem, denoted S, is the part that must be executed by software and is composed of n software modules, n≥2. S3, executing the reconfigurable task on a general-purpose processor: the reconfigurable task is scheduled onto a general-purpose processor, which executes the first software module of the task. S4, dynamically requesting reconfigurable cores at runtime: while the software modules of the reconfigurable task execute, reconfigurable cores are requested dynamically. The invention enables optimized use of reconfigurable cores and improves the efficiency of reconfigurable task execution.

Description

An Optimization Method for Dynamically Requesting Reconfigurable Cores at Runtime

Technical Field

The invention relates to an optimization method for dynamically requesting reconfigurable cores at runtime, and belongs to the technical field of reconfigurable computing.

Background

Reconfigurable computing is regarded as an effective way to combine the high flexibility of conventional processors with the high processing efficiency of ASICs (Application-Specific Integrated Circuits). Because reconfigurable architectures adapt well, they can accelerate different applications by exploiting parallelism at different granularities. Among reconfigurable devices, the FPGA (Field-Programmable Gate Array) is the most widely used. Dynamically reconfigurable FPGAs are an important foundation for hardware-level multitasking.

Scheduling and managing reconfigurable hardware tasks requires optimized decisions about which tasks use the hardware, when they use it, and how many reconfigurable resources each task receives. These decisions can be made at compile time from static program information, at runtime from the dynamic state of the system, or from a combination of the two. The optimization goal may be to maximize the performance of a single application or of the whole system, to minimize energy consumption, or to meet the deadlines of real-time tasks; in addition, scheduling and reconfiguration overhead must be kept low.

Static scheduling relies on analysis, profiling, and annotation of program behavior to decide when a hardware task needs to be configured. Through this analysis, profiling, and annotation, a static scheduler obtains a more global view of task requirements. Moreover, static schedulers generally work offline, and some scheduling is even done at compile time, so the overhead of the scheduler itself hardly matters. As a result, relatively complex scheduling algorithms, such as genetic algorithms, can be used for static scheduling.

Because static scheduling has global information about the program, it can use the application flow graph obtained by static analysis to determine when a specific hardware task will be needed, and then insert a prefetch instruction for that hardware task. Overlapping hardware configuration with the execution of other tasks hides the reconfiguration overhead.

Generally speaking, static scheduling can achieve better-optimized results, but its drawback is equally clear: to remain effective, it requires that the task set (including arrival times, execution times, and task resource requirements) and resource availability be predictable. Because the runtime state of the program and the system cannot be known in advance, a wrong static prediction not only fails to deliver good performance but can have a serious negative impact, and prefetching may also incur additional reconfiguration cost.

Dynamic scheduling, also called runtime scheduling or online scheduling, makes scheduling decisions online based on the runtime information of the system. Runtime information includes the current system load, resource utilization, and the task frequency and task performance observed in the previous phase. Dynamic scheduling usually has to make decisions quickly to keep scheduling overhead low, so the resulting schedule may not be optimal.

Some dynamic scheduling methods use a dataflow graph to decide which hardware tasks should be configured and executed. In this approach, a node of the dataflow graph can be scheduled for execution only after all of its predecessor nodes have been executed. Other dynamic scheduling algorithms ignore the flow graph of the program and instead divide time into "windows": at each window boundary the scheduling algorithm runs periodically on the runtime state of the system and selects the tasks to be configured and executed in the next window [127, 150-151]. The two approaches differ mainly as follows. The former makes a scheduling decision whenever a task arrives, so its scheduling overhead grows with the number of task arrivals, whereas the latter makes decisions once per window, so its scheduling overhead is fixed. The former schedules promptly, which favors performance, while the latter makes scheduling overhead easier to control. However, the former applies only to hardware tasks that depend on one another, while the latter applies to all scenarios.
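The readiness condition used by dataflow-graph scheduling can be illustrated with a minimal sketch. The graph, the task names, and the function `ready_nodes` are hypothetical and chosen only for illustration; they are not taken from the cited work.

```python
from typing import Dict, List, Set


def ready_nodes(preds: Dict[str, Set[str]], done: Set[str]) -> List[str]:
    """Return unexecuted nodes whose predecessors have all been executed."""
    return sorted(
        node for node, parents in preds.items()
        if node not in done and parents <= done
    )


# Hypothetical four-task graph: A feeds B and C, which both feed D.
PREDS: Dict[str, Set[str]] = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

if __name__ == "__main__":
    print(ready_nodes(PREDS, set()))       # ['A']
    print(ready_nodes(PREDS, {"A"}))       # ['B', 'C']
    print(ready_nodes(PREDS, {"A", "B"}))  # ['C']
```

A window-based scheduler would call no such check; it would simply re-run its selection once per window, which is why its overhead does not depend on how many tasks arrive.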

The overhead of a dynamic scheduling algorithm directly affects system performance and is one of the important factors to consider. Some studies use conventional dynamic scheduling methods such as First Come First Served (FCFS), Shortest Job First (SJF), Shortest Remaining Processing Time first (SRPT), most recently and most frequently used first (MRU), Highest Speedup first (HS), and Earliest Deadline First (EDF). These conventional methods are algorithmically simple, fast, and cheap to run. However, hardware tasks differ from conventional software tasks on a general-purpose processor: the performance benefit of a hardware task and the reconfigurable resources it needs vary from task to task, and loading or switching a task (reconfiguration) is expensive. These simple scheduling algorithms may therefore fail to produce good schedules.

For this reason, some dynamic schedulers adopt more complex priority-based methods, in which the priority jointly accounts for factors such as the reconfigurable resources a task needs, the performance gain it brings, its execution time, the current configuration state of the resources, and even its execution frequency.

To give the scheduling algorithm more program information, reduce runtime scheduling overhead, and make more efficient scheduling decisions, some studies use hybrid scheduling that combines design time and runtime. Resano et al. adopt such a hybrid approach: the computation-intensive part of the scheduling algorithm is completed at design time, while non-deterministic dynamic behavior is handled at runtime with little overhead. Based on TCM, they proposed a hybrid scheduling method for reconfigurable multitasking that generates, at design time, a set of feasible near-optimal schedules for each task, and at runtime selects the most suitable schedule for each task from that set according to dynamic information. However, when there are many tasks and many different situations, generating all feasible near-optimal schedules is very costly. They therefore proposed an improved heuristic hybrid scheduling method. The subtasks on the critical path are identified at design time, and a heuristic prefetch schedule for all subtasks is computed under the assumption that all critical subtasks remain configured; this schedule serves as the input to the runtime phase. When runtime resources are insufficient to configure all critical subtasks, the runtime scheduler only needs to reconfigure subtasks in the loading order computed at design time, ensuring that each critical subtask is already configured when its execution starts.

In summary, when all task information is known in advance, static scheduling can provide the most efficient results. When tasks and resources are uncertain, however, the real task sequence and the dynamic resource state can be determined only at runtime, so static scheduling is not applicable. Dynamic scheduling can make decisions online based on the runtime information of the system, but its overhead is too high. Hybrid scheduling usually achieves a better trade-off between performance and overhead.

Summary of the Invention

To overcome the shortcomings of the above techniques, the present invention provides an optimization method for dynamically requesting reconfigurable cores at runtime.

The technical solution adopted by the present invention to solve this technical problem is as follows:

An optimization method for dynamically requesting reconfigurable cores at runtime, comprising the following steps:

S1. Configure a general-purpose reconfigurable core

Configure the general-purpose reconfigurable core on the FPGA;

S2. Partition the reconfigurable task

Divide the reconfigurable task into two parts, a reconfigurable subsystem and a software subsystem. The reconfigurable subsystem is the part that can be built as reconfigurable cores and executed by hardware; it is denoted R and consists of m hardware modules, where m≥1. The software subsystem is the part that must be executed by software; it is denoted S and consists of n software modules, where n≥1;

S3. Execute the reconfigurable task on a general-purpose processor

Schedule the reconfigurable task onto a general-purpose processor and execute the first software module of the reconfigurable task;

S4. Dynamically request reconfigurable cores at runtime

While the software modules of the reconfigurable task execute, reconfigurable cores are requested dynamically.

Preferably, in step S1, a general-purpose reconfigurable core is a reconfigurable core that performs a general-purpose computing task and can be used by different reconfigurable tasks. A general-purpose reconfigurable core is implemented by its corresponding bitstream file; configuring the core on the FPGA means programming that bitstream file onto the FPGA.

Preferably, in step S2, the reconfigurable subsystem is R = {RH1, RH2, …, RHm}. Any module RHi in R can be built on its own as a reconfigurable core, and it can also be implemented in software; the software implementation of module RHi is denoted S(RHi), where 1≤i≤m. The software subsystem is S = {SS1, SS2, …, SSn}. Any module SSj in S can be executed by software and only by software, where 1≤j≤n. For any reconfigurable task, the first module in its execution sequence is a software module.

Preferably, in step S2,

each hardware module has the following attributes: 1) the configuration time WCRT of the reconfigurable core, which is the time needed to configure this hardware module as hardware on the FPGA; 2) the execution time WCET1 of the reconfigurable core, which is the maximum time the hardware module needs to complete its task; 3) the software execution time WCST, which is the maximum time the software implementation of this hardware module needs to complete the same task;

each software module has the following attribute: the execution time WCET2 of the software module, which is the maximum time the software module needs to complete its task.

Preferably, for any reconfigurable core RHi, if WCRT(RHi) + WCET1(RHi) > WCST(RHi), S(RHi) is preferred for completing the computing task at execution time; if WCRT(RHi) + WCET1(RHi) ≤ WCST(RHi), RHi is preferred for completing the computing task at execution time.
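The preference rule above can be written as a short sketch. The class `HardwareModule`, the function `prefers_hardware`, and the numeric timings are hypothetical and serve only to illustrate the comparison; the patent gives no concrete values or units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareModule:
    """Timing attributes of a hardware module RHi (units are arbitrary)."""
    name: str
    wcrt: float   # configuration time of the reconfigurable core
    wcet1: float  # maximum execution time on the reconfigurable core
    wcst: float   # maximum execution time of the software version S(RHi)


def prefers_hardware(module: HardwareModule) -> bool:
    """True if RHi is preferred, i.e. WCRT + WCET1 <= WCST."""
    return module.wcrt + module.wcet1 <= module.wcst


# Illustrative values only.
FAST_IN_HW = HardwareModule("RH1", wcrt=4.0, wcet1=2.0, wcst=10.0)
SLOW_TO_CONFIGURE = HardwareModule("RH2", wcrt=9.0, wcet1=2.0, wcst=10.0)

if __name__ == "__main__":
    for m in (FAST_IN_HW, SLOW_TO_CONFIGURE):
        choice = m.name if prefers_hardware(m) else f"S({m.name})"
        print(f"{m.name}: prefer {choice}")
```

With these assumed numbers, RH1 is preferred in hardware (4 + 2 ≤ 10), while RH2 falls back to S(RH2) because configuration plus execution (9 + 2) exceeds the software time of 10.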

Preferably, in step S4, dynamic requests for reconfigurable cores fall into two categories:

the first category, requests for a general-purpose reconfigurable core;

the second category, requests for the reconfigurable core corresponding to a hardware module.

Further, in the first category, for a request for a general-purpose reconfigurable core: if the general-purpose reconfigurable core is available, it is used to complete the computation; if it is not available, then:

1) If there is enough FPGA space, compute the waiting time WT = WCRT of the general-purpose reconfigurable core + WCET1 of the general-purpose reconfigurable core. If WT > WCST of the general-purpose reconfigurable core, the software module corresponding to the core is used to complete the computing task; if WT ≤ WCST of the general-purpose reconfigurable core, another general-purpose reconfigurable core is configured on the FPGA and used to complete the computing task;

2) If there is not enough FPGA space, the software module corresponding to the general-purpose reconfigurable core is used to complete the computing task.
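The first-category decision can be sketched as a pure function. The enum `Decision`, the function `request_general_core`, and its boolean inputs are hypothetical names; how availability and free FPGA space are actually detected is not specified in the text and is simply passed in here.

```python
from enum import Enum


class Decision(Enum):
    USE_EXISTING_CORE = "use the configured general-purpose core"
    CONFIGURE_NEW_CORE = "configure another general-purpose core"
    RUN_SOFTWARE = "run the corresponding software module"


def request_general_core(core_available: bool, enough_fpga_space: bool,
                         wcrt: float, wcet1: float, wcst: float) -> Decision:
    """Decide how to serve a request for a general-purpose reconfigurable core."""
    if core_available:
        return Decision.USE_EXISTING_CORE
    if not enough_fpga_space:
        # Case 2): no room for another core, fall back to software.
        return Decision.RUN_SOFTWARE
    # Case 1): waiting time is configuration plus hardware execution.
    wt = wcrt + wcet1
    return Decision.RUN_SOFTWARE if wt > wcst else Decision.CONFIGURE_NEW_CORE


if __name__ == "__main__":
    # Illustrative timings only.
    print(request_general_core(True, False, 5.0, 2.0, 10.0).name)
    print(request_general_core(False, True, 5.0, 2.0, 10.0).name)
    print(request_general_core(False, True, 9.0, 2.0, 10.0).name)
    print(request_general_core(False, False, 5.0, 2.0, 10.0).name)
```

Keeping the decision free of side effects makes the rule easy to test separately from whatever mechanism performs the actual FPGA configuration.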

Further, in the second category, for a request for the reconfigurable core corresponding to a hardware module:

1) If the reconfigurable core corresponding to the hardware module has already been configured on the FPGA and is available, that reconfigurable core is used to complete the computing task;

2) If the reconfigurable core corresponding to the hardware module has already been configured on the FPGA but is unavailable, compute the waiting time WT = 2 × WCET1 of the reconfigurable core, then:

a) If WT > WCST of the reconfigurable core, the software module corresponding to the reconfigurable core is used to complete the computing task;

b) If WT ≤ WCST of the reconfigurable core, wait for the reconfigurable core to finish its current computing task, then use it to complete the computing task;

3) If the reconfigurable core corresponding to the hardware module has not been configured on the FPGA, then:

a) If there is not enough FPGA space, the software module corresponding to the reconfigurable core is used to complete the computing task;

b) If there is enough FPGA space, compute the waiting time WT = WCRT of the reconfigurable core + WCET1 of the reconfigurable core, then:

i) If WT > WCST of the reconfigurable core, the software module corresponding to the reconfigurable core is used to complete the computing task;

ii) If WT ≤ WCST of the reconfigurable core, the reconfigurable core is configured on the FPGA to complete the computing task.
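The three cases of the second category can be sketched in the same style. The enum `ModuleDecision`, the function `request_module_core`, and the state flags are hypothetical names used for illustration; the text does not say how the configured, available, and free-space states are obtained.

```python
from enum import Enum


class ModuleDecision(Enum):
    USE_CORE = "use the configured core now"
    WAIT_THEN_USE_CORE = "wait for the busy core, then use it"
    CONFIGURE_CORE = "configure the core on the FPGA, then use it"
    RUN_SOFTWARE = "run the corresponding software module"


def request_module_core(configured: bool, available: bool,
                        enough_fpga_space: bool,
                        wcrt: float, wcet1: float, wcst: float) -> ModuleDecision:
    """Decide how to serve a request for the core of one hardware module."""
    if configured:
        if available:
            return ModuleDecision.USE_CORE                  # case 1)
        wt = 2 * wcet1                                      # case 2): busy core
        return (ModuleDecision.RUN_SOFTWARE if wt > wcst
                else ModuleDecision.WAIT_THEN_USE_CORE)
    if not enough_fpga_space:
        return ModuleDecision.RUN_SOFTWARE                  # case 3a)
    wt = wcrt + wcet1                                       # case 3b)
    return (ModuleDecision.RUN_SOFTWARE if wt > wcst
            else ModuleDecision.CONFIGURE_CORE)


if __name__ == "__main__":
    # Illustrative timings only: WCRT=4, WCET1=3, WCST=10.
    print(request_module_core(True, True, False, 4.0, 3.0, 10.0).name)
    print(request_module_core(True, False, False, 4.0, 3.0, 10.0).name)
    print(request_module_core(False, False, True, 4.0, 3.0, 10.0).name)
    print(request_module_core(False, False, False, 4.0, 3.0, 10.0).name)
```

One reading of the factor of two in case 2) is that the bound covers the core finishing its current job (at most WCET1) and then running the new one (another WCET1); the text states the formula without giving this rationale.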

The beneficial effects of the present invention are as follows:

The invention separates out the hardware modules of a reconfigurable task and establishes a dynamic request method for executing the task. For both general-purpose reconfigurable cores and module-specific reconfigurable cores, availability and configurability are judged dynamically at runtime, which enables optimized use of reconfigurable cores and in turn improves the efficiency of reconfigurable task execution. Its main features are:

(1) Efficiency: in conventional reconfigurable task execution, the parts that use reconfigurable cores are configured directly onto the FPGA at runtime and then executed, which incurs considerable waiting time. The present invention instead arranges execution dynamically according to the runtime situation, so reconfigurable tasks are completed more efficiently.

(2) Flexibility: conventional reconfigurable task execution usually requires that hardware modules be executed on reconfigurable cores, which makes task execution depend on the availability of reconfigurable cores, the availability of FPGA space, and so on. The present invention also provides a software execution path for these modules and chooses between the two according to the runtime state, which makes reconfigurable task execution flexible.

Brief Description of the Drawings

FIG. 1 is a schematic flow diagram of the implementation of this embodiment.

FIG. 2 is a schematic diagram of reconfigurable task partitioning in this embodiment.

FIG. 3 is a schematic diagram of the specific partitioning of the reconfigurable task P in this embodiment.

Detailed Description

To help those skilled in the art better understand the present invention, it is described in further detail below with reference to the accompanying drawings and specific embodiments. The following is only exemplary and does not limit the scope of protection of the present invention.

The optimization method for dynamically requesting reconfigurable cores at runtime according to the present invention, as shown in FIG. 1, comprises the following steps:

S1. Configure a general-purpose reconfigurable core

Configure the general-purpose reconfigurable core on the FPGA. A general-purpose reconfigurable core is a reconfigurable core that performs a general-purpose computing task and can be used by different reconfigurable tasks. It is implemented by its corresponding bitstream file; configuring the core on the FPGA means programming that bitstream file onto the FPGA. Each general-purpose reconfigurable core has a corresponding software module that implements the same function; if the general-purpose reconfigurable core is unavailable, the corresponding software module can be used to perform that function.

Consider general-purpose reconfigurable cores SP1 and SP2. SP1 implements the fast Fourier transform and SP2 implements array transformation; both are commonly used computing functions. SP1 is implemented by its bitstream file File(SP1), and SP2 by its bitstream file File(SP2). File(SP1) and File(SP2) are programmed onto the FPGA, and the software modules corresponding to SP1 and SP2 are S(SP1) and S(SP2), respectively.

S2. Partition the reconfigurable task

Divide the reconfigurable task into two parts, a reconfigurable subsystem and a software subsystem. The reconfigurable subsystem is the part that can be built as reconfigurable cores and executed by hardware; it is denoted R and consists of m hardware modules, where m≥1, so that R = {RH1, RH2, …, RHm}. Any module RHi in R can be built on its own as a reconfigurable core, and it can also be implemented in software; the software implementation of module RHi is denoted S(RHi), where 1≤i≤m. The software subsystem is the part that must be executed by software; it is denoted S and consists of n software modules, where n≥1, so that S = {SS1, SS2, …, SSn}. Any module SSj in S can be executed by software and only by software, where 1≤j≤n. For any reconfigurable task, the first module in its execution sequence is a software module.
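The partition in step S2 can be sketched as a small data model. The `Module` class, the `partition` function, and the interleaved ordering of the example sequence are assumptions made for illustration; the text fixes only the constraints m≥1, n≥1, and a software module in first position.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Module:
    name: str
    hardware_capable: bool  # True for RHi, False for software-only SSj


def partition(sequence: List[Module]) -> Tuple[List[Module], List[Module]]:
    """Split an execution sequence into (R, S) and check the S2 constraints."""
    r = [m for m in sequence if m.hardware_capable]
    s = [m for m in sequence if not m.hardware_capable]
    if not r or not s:
        raise ValueError("need at least one hardware and one software module")
    if sequence[0].hardware_capable:
        raise ValueError("the first executed module must be a software module")
    return r, s


# Assumed ordering: software and hardware-capable modules alternate.
TASK = [
    Module("SS1", False), Module("RH1", True),
    Module("SS2", False), Module("RH2", True),
    Module("SS3", False), Module("RH3", True),
    Module("SS4", False),
]

if __name__ == "__main__":
    r, s = partition(TASK)
    print("R =", [m.name for m in r])  # R = ['RH1', 'RH2', 'RH3']
    print("S =", [m.name for m in s])  # S = ['SS1', 'SS2', 'SS3', 'SS4']
```

Requiring a software module first means the task can always start on the general-purpose processor in step S3 before any reconfigurable core is requested.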

As shown in FIG. 2, the left side of the figure is the complete execution sequence of a reconfigurable task. After partitioning, it is divided into 4 software modules executed by software and 3 hardware-executable modules, as shown on the right side of FIG. 2. The 4 software modules form the software subsystem of the reconfigurable task, and the 3 hardware-executable modules form its reconfigurable subsystem. Here "hardware-executable" means that the module can either be built as a reconfigurable core on the FPGA and executed by hardware, or be executed directly by software.

In this embodiment, each hardware module has the following attributes:

1) the configuration time WCRT of the reconfigurable core, which is the time needed to configure this hardware module as hardware on the FPGA;

2) the execution time WCET1 of the reconfigurable core, which is the maximum time the hardware module needs to complete its task;

3) the software execution time WCST, which is the maximum time the software implementation of this hardware module needs to complete the same task.

For any reconfigurable core RHi, if WCRT(RHi) + WCET1(RHi) > WCST(RHi), S(RHi) is preferred for completing the computing task at execution time; if WCRT(RHi) + WCET1(RHi) ≤ WCST(RHi), RHi is preferred for completing the computing task at execution time.

Specifically, the reconfigurable task P is partitioned into two parts, yielding 4 software modules executed by software and 3 hardware-executable modules, as shown in FIG. 3. After partitioning, the reconfigurable subsystem R_p and the software subsystem S_p of P are:

R_p = {RH1, RH2, RH3}

S_p = {SS1, SS2, SS3, SS4}

For the hardware modules RH1, RH2 and RH3:

1) WCRT(RH1) = rt1, WCET1(RH1) = et1, WCST(RH1) = st1

2) WCRT(RH2) = rt2, WCET1(RH2) = et2, WCST(RH2) = st2

3) WCRT(RH3) = rt3, WCET1(RH3) = et3, WCST(RH3) = st3

For the software modules SS1, SS2, SS3 and SS4:

WCET2(SS1) = dt1

WCET2(SS2) = dt2

WCET2(SS3) = dt3

WCET2(SS4) = dt4

Specifically, when the reconfigurable task P is scheduled onto the general-purpose processor, the processor executes the first software module of P, SS1.

In this embodiment, dynamic requests for reconfigurable cores fall into two categories:

In the second category, for a request for the reconfigurable core corresponding to a hardware module,

The above describes only the basic principle and preferred embodiments of the present invention. Those skilled in the art can make many changes and improvements based on the above description, and these changes and improvements fall within the scope of protection of the present invention.

Claims

1. An optimization method for dynamically requesting reconfigurable cores at runtime, characterized by comprising the following steps:

S1. Configure a general-purpose reconfigurable core

Configure the general-purpose reconfigurable core on the FPGA;

S2. Partition the reconfigurable task

Divide the reconfigurable task into two parts, a reconfigurable subsystem and a software subsystem: the reconfigurable subsystem is the part that can be built as reconfigurable cores and executed by hardware, denoted R, and consists of m hardware modules, where m≥1; the software subsystem is the part that must be executed by software, denoted S, and consists of n software modules, where n≥1;

S3. Execute the reconfigurable task on a general-purpose processor

Schedule the reconfigurable task onto a general-purpose processor and execute the first software module of the reconfigurable task;

S4. Dynamically request reconfigurable cores at runtime

While the software modules of the reconfigurable task execute, reconfigurable cores are requested dynamically.

2. The optimization method according to claim 1, characterized in that, in step S1, a general-purpose reconfigurable core is a reconfigurable core that performs a general-purpose computing task and can be used by different reconfigurable tasks; the general-purpose reconfigurable core is implemented by its corresponding bitstream file, and configuring the general-purpose reconfigurable core on the FPGA means programming the corresponding bitstream file onto the FPGA.

3. The optimization method according to claim 1, characterized in that, in step S2, the reconfigurable subsystem is R = {RH1, RH2, …, RHm}; any module RHi in R can be built on its own as a reconfigurable core and can also be implemented in software, the software implementation of module RHi being denoted S(RHi), where 1≤i≤m; the software subsystem is S = {SS1, SS2, …, SSn}; any module SSj in S can be executed by software and only by software, where 1≤j≤n; for any reconfigurable task, the first module in its execution sequence is a software module.

4. The optimization method according to claim 3, characterized in that, in step S2,

each hardware module has the following attributes: 1) the configuration time WCRT of the reconfigurable core, which is the time needed to configure this hardware module as hardware on the FPGA; 2) the execution time WCET1 of the reconfigurable core, which is the maximum time the hardware module needs to complete its task; 3) the software execution time WCST, which is the maximum time the software implementation of this hardware module needs to complete the same task;

each software module has the following attribute: the execution time WCET2 of the software module, which is the maximum time the software module needs to complete its task.

5. The optimization method according to claim 4, characterized in that, for any reconfigurable core RHi, if WCRT(RHi) + WCET1(RHi) > WCST(RHi), S(RHi) is preferred for completing the computing task at execution time; if WCRT(RHi) + WCET1(RHi) ≤ WCST(RHi), RHi is preferred for completing the computing task at execution time.

6. The optimization method according to claim 1, characterized in that, in step S4, dynamic requests for reconfigurable cores fall into two categories:

the first category, requests for a general-purpose reconfigurable core;

the second category, requests for the reconfigurable core corresponding to a hardware module.

7. The optimization method according to claim 6, characterized in that, in the first category, for a request for a general-purpose reconfigurable core, if the general-purpose reconfigurable core is available, it is used to complete the computation; if the general-purpose reconfigurable core is unavailable, then:

1) If there is enough FPGA space, compute the waiting time WT = WCRT of the general-purpose reconfigurable core + WCET1 of the general-purpose reconfigurable core; if WT > WCST of the general-purpose reconfigurable core, the software module corresponding to the general-purpose reconfigurable core is used to complete the computing task; if WT ≤ WCST of the general-purpose reconfigurable core, another general-purpose reconfigurable core is configured on the FPGA and used to complete the computing task;

2) If there is not enough FPGA space, the software module corresponding to the general-purpose reconfigurable core is used to complete the computing task.

8. The optimization method according to claim 6, characterized in that, in the second category, for a request for the reconfigurable core corresponding to a hardware module:

1) If the reconfigurable core corresponding to the hardware module has already been configured on the FPGA and is available, that reconfigurable core is used to complete the computing task;

2) If the reconfigurable core corresponding to the hardware module has already been configured on the FPGA but is unavailable, compute the waiting time WT = 2 × WCET1 of the reconfigurable core, then:

a) If WT > WCST of the reconfigurable core, the software module corresponding to the reconfigurable core is used to complete the computing task;

b) If WT ≤ WCST of the reconfigurable core, wait for the reconfigurable core to finish its current computing task, then use the reconfigurable core to complete the computing task;

3) If the reconfigurable core corresponding to the hardware module has not been configured on the FPGA, then:

a) If there is not enough FPGA space, the software module corresponding to the reconfigurable core is used to complete the computing task;

b) If there is enough FPGA space, compute the waiting time WT = WCRT of the reconfigurable core + WCET1 of the reconfigurable core, then:

i) If WT > WCST of the reconfigurable core, the software module corresponding to the reconfigurable core is used to complete the computing task;

ii) If WT ≤ WCST of the reconfigurable core, the reconfigurable core is configured on the FPGA to complete the computing task.