WO2018000765A1 - Co-processor, data reading method, processor system and storage medium - Google Patents
- Publication number
- WO2018000765A1 (application PCT/CN2016/110754)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- read
- space
- write
- processor
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/167—Interprocessor communication using a common memory, e.g. mailbox
Landscapes
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Advance Control (AREA)
Abstract
Description
The present invention relates to wireless communication technologies, and in particular to a coprocessor, a data reading method, a processor system, and a storage medium.
In the system-on-chip (SOC) architecture of a multi-mode mobile phone chip, efficient processor access to external registers and peripheral space contributes substantially to overall architecture performance. Processor access to the physical layer involves frequently configuring a large number of physical layer registers. The physical layer hardware processes data according to that configuration; the processor then reads the data back from the physical layer, generates a new configuration from it, and configures the physical layer registers again. This cycle repeats and constitutes the interaction between the processor and the physical layer. The performance of processor access to the register space and data space of the physical layer is therefore key to the performance of the whole SOC architecture. If the processor accesses external registers and peripheral space slowly, timing becomes tight and the physical layer flow may behave abnormally; moreover, if the interaction between the processor and the physical layer takes too long, processor power consumption increases.
Summary of the Invention
To solve the above technical problem, embodiments of the present invention aim to provide a coprocessor, a data reading method, a processor system, and a storage medium that improve the performance of processor access to external registers or peripheral space, save time in the physical layer flow, and improve physical layer computing performance.
The technical solution of the present invention is implemented as follows:
In a first aspect, a coprocessor is provided, the coprocessor comprising:
a read circuit configured to: receive a transport instruction sent by a space to be read; determine, according to the transport instruction, the transport channel that is to perform the transport task; transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel; and, when the transport task is completed, send a read notification to the main processor so that the main processor reads the data of the space to be read.
In the above solution, the transport instruction includes information about the space to be read, and the coprocessor further comprises:
a storage area configured to store the correspondence between the information of each register and peripheral space and a channel;
the read circuit is configured to take, according to the correspondence, the channel corresponding to the information of the space to be read as the transport channel.
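The channel lookup described above can be illustrated with a minimal Python sketch. The `SpaceInfo` descriptor, the addresses, and the table contents are hypothetical; the patent only states that a stored correspondence maps the information of each space to a channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceInfo:
    """Hypothetical descriptor of a space to be read."""
    base_addr: int  # start of the address range
    length: int     # amount of data to move, in words


# Hypothetical correspondence table kept in the storage area:
# space information -> fixed channel identifier.
CHANNEL_TABLE = {
    SpaceInfo(0x4000_0000, 16): 0,
    SpaceInfo(0x4000_1000, 64): 1,
}


def select_channel(info: SpaceInfo) -> int:
    """Return the transport channel assigned to the given space."""
    try:
        return CHANNEL_TABLE[info]
    except KeyError:
        # Behaviour for an unknown space is an assumption of this sketch.
        raise LookupError(f"no channel configured for {info}") from None
```

A frozen dataclass is used here only so that the descriptor can serve as a dictionary key; a hardware implementation would more likely match on address ranges.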
In the above solution, the coprocessor further comprises:
a write circuit configured to: receive a write request and write data sent by the processor, the write request requesting that the write data be written to a space to be written; send a write response to the processor; and send the write request and the write data to the space to be written.
In the above solution, the buffer depth of the write circuit and the buffer depth of the read circuit are preset.
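One way to read the write circuit is as a buffered (posted) write path: the processor receives its write response as soon as the request is buffered, and the coprocessor forwards the write afterwards. The sketch below models that reading in Python. The class name, the dictionary standing in for the space to be written, and the refusal of writes when the buffer is full are all assumptions; the patent states only that the buffer depth is preset.

```python
from collections import deque


class WriteCircuit:
    """Toy model of a write circuit with a preset buffer depth."""

    def __init__(self, depth, target):
        self.depth = depth    # preset buffer depth
        self.fifo = deque()   # buffered (address, data) pairs
        self.target = target  # dict standing in for the space to be written

    def accept(self, addr, data):
        """Processor side: buffer the write and return the write response.

        Returns True when the write was accepted, False when the buffer
        is full (the full-buffer behaviour is an assumption).
        """
        if len(self.fifo) >= self.depth:
            return False
        self.fifo.append((addr, data))
        return True

    def drain(self):
        """Coprocessor side: forward buffered writes to the target space."""
        while self.fifo:
            addr, data = self.fifo.popleft()
            self.target[addr] = data
```

In this model the processor never waits for the target space itself, only for a free buffer slot, which is why the depth of the buffer matters.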
In a second aspect, a data reading method is provided, the method comprising:
receiving a transport instruction sent by a space to be read;
determining, according to the transport instruction, the transport channel that is to perform the transport task;
transporting the data of the space to be read into the main processor through the transport channel;
when the transport task is completed, sending a read notification to the main processor so that the main processor reads the data of the space to be read.
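The four steps of the method can be sketched as a single Python function. The instruction format, the argument names, and the use of a dictionary for the destination memory are hypothetical choices made for illustration, not details taken from the patent.

```python
def read_data(instruction, channel_of, read_space, tcm, notify):
    """Run the four method steps for one transport instruction.

    instruction: dict with a 'space' key (hypothetical format)
    channel_of:  mapping from space name to channel identifier
    read_space:  callable returning the data of a named space
    tcm:         dict standing in for the main processor's memory
    notify:      callable invoked when the transport task completes
    """
    space = instruction["space"]    # step 1: receive the instruction
    channel = channel_of[space]     # step 2: determine the transport channel
    tcm[space] = read_space(space)  # step 3: transport the data
    notify(space, channel)          # step 4: send the read notification
    return channel
```

The notification is issued only after the data has been stored, so a main processor that reacts to it will always find the data in place.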
In the above solution, the transport instruction includes information about the space to be read, and the method further comprises:
setting the correspondence between the information of each register and peripheral space and a channel;
the determining, according to the transport instruction, of the transport channel that is to perform the transport task comprises:
taking, according to the correspondence, the channel corresponding to the information of the space to be read as the transport channel.
In the above solution, the method further comprises:
receiving a write request and write data sent by the processor, the write request requesting that the write data be written to a space to be written;
sending a write response to the processor;
sending the write request and the write data to the space to be written.
In a third aspect, a coprocessor is provided, the coprocessor comprising:
a receiving module configured to receive a transport instruction sent by a space to be read;
a determining module configured to determine, according to the transport instruction, the transport channel that is to perform the transport task;
a transport module configured to transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel;
a sending module configured to send a read notification to the main processor when the transport task is completed, so that the main processor reads the data of the space to be read.
In a fourth aspect, a processor system is provided, comprising:
a main processor and a coprocessor;
the coprocessor is configured to: receive a transport instruction sent by a space to be read; determine, according to the transport instruction, the transport channel that is to perform the transport task; transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel; and, when the transport task is completed, send a read notification to the main processor so that the main processor reads the data of the space to be read.
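The interaction between the two processors can be modelled, very loosely, with two Python threads and two queues. This is a software analogy under stated assumptions: the queue-based instruction and notification paths, the `None` shutdown marker, and the function names are all hypothetical, and the origin of the transport instruction is reduced to a queue.

```python
import queue
import threading


def coprocessor(instructions, peripheral, tcm, notify):
    """Move each requested item into the TCM, then notify the main processor."""
    # iter(get, None) keeps reading instructions until the None marker arrives.
    for name in iter(instructions.get, None):
        tcm[name] = peripheral[name]  # transport into tightly coupled memory
        notify.put(name)              # read notification


def run_system(requests, peripheral):
    """Main-processor side: issue requests, then read the TCM on notification."""
    instructions = queue.Queue()
    notify = queue.Queue()
    tcm = {}
    worker = threading.Thread(
        target=coprocessor, args=(instructions, peripheral, tcm, notify)
    )
    worker.start()
    for name in requests:
        instructions.put(name)
    instructions.put(None)  # hypothetical shutdown marker
    results = {}
    for _ in requests:
        name = notify.get()        # wait for a read notification
        results[name] = tcm[name]  # read directly from the TCM
    worker.join()
    return results
```

The main processor touches only `tcm` and the notification queue; it never reads `peripheral` itself, which mirrors the division of labour described above.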
An embodiment of the present invention further provides a computer storage medium storing computer-executable instructions for performing the data reading method described above.
This embodiment provides a coprocessor, a data reading method, a processor system, and a storage medium. The coprocessor includes a read circuit configured to: receive a transport instruction sent by a space to be read; determine, according to the transport instruction, the transport channel that is to perform the transport task; transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel; and, when the transport task is completed, send a read notification to the main processor. The main processor is configured to receive the read notification and read the data of the space to be read. In this way, the tight coupling between the main processor and its tightly coupled memory is exploited: the main processor can directly read the data of the space to be read from its own tightly coupled memory, which greatly reduces the time the processor spends reading external registers or peripheral space, improves the performance of processor reads of external space, saves time for the physical layer, and allows the physical layer flow to be optimized.
FIG. 1 is schematic structural diagram 1 of a coprocessor according to an embodiment of the present invention;
FIG. 2 is schematic structural diagram 2 of a coprocessor according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a processor provided by the prior art;
FIG. 4 is a flowchart of a data reading method according to an embodiment of the present invention;
FIG. 5 is schematic structural diagram 3 of a coprocessor according to an embodiment of the present invention;
FIG. 6 is schematic structural diagram 4 of a coprocessor according to an embodiment of the present invention;
FIG. 7 is schematic structural diagram 1 of a processor according to an embodiment of the present invention;
FIG. 8 is schematic structural diagram 2 of a processor according to an embodiment of the present invention.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments.
Embodiment 1
This embodiment provides a coprocessor 10. As shown in FIG. 1, the coprocessor 10 includes:
a read circuit 101 configured to: receive a transport instruction sent by a space to be read; determine, according to the transport instruction, the transport channel that is to perform the transport task; transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel; and, when the transport task is completed, send a read notification to the main processor so that the main processor reads the data of the space to be read.
Here, the read circuit 101 provides multiple parallel channels, with no limit on their number, and each channel has a unique fixed identifier.
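As a rough software analogy for parallel channels with fixed identifiers, the Python sketch below runs one task per channel on a thread pool and collects the results by channel identifier. Using threads and callables in place of hardware channels is an assumption made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_channels(tasks):
    """Run one transport task per channel in parallel.

    tasks: non-empty dict mapping a fixed channel identifier to a
           zero-argument callable that performs the transport.
    Returns a dict mapping each channel identifier to its task's result.
    """
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {cid: pool.submit(fn) for cid, fn in tasks.items()}
        return {cid: fut.result() for cid, fut in futures.items()}
```

Because results are keyed by channel identifier, the caller can tell which transport finished on which channel regardless of completion order.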
The tightly coupled memory of the main processor is configured to store the data of the space to be read.
The tightly coupled memory of the main processor includes a primary storage space and a secondary storage space, where the transport priority of the primary storage space is higher than that of the secondary storage space.
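The text does not say how the transport priority is applied. One possible interpretation, sketched below in Python, is that pending transports destined for the primary storage space are served before those destined for the secondary storage space, with first-come order inside each level. The scheduler class and the numeric level constants are hypothetical.

```python
import heapq
from itertools import count

PRIMARY, SECONDARY = 0, 1  # a lower number means a higher transport priority


class TransferScheduler:
    """Serve transports to primary storage before those to secondary storage."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps submission order

    def submit(self, level, name):
        """Queue a transport of the named data into the given storage level."""
        heapq.heappush(self._heap, (level, next(self._seq), name))

    def next_transfer(self):
        """Pop the highest-priority pending transport as (level, name)."""
        level, _, name = heapq.heappop(self._heap)
        return level, name
```

The sequence counter ensures that two transports with the same destination level are never reordered relative to each other.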
In this way, the tight coupling between the main processor and its tightly coupled memory is exploited: the main processor can directly read the data of the space to be read from its own tightly coupled memory, which greatly reduces the time the processor spends reading external registers or peripheral space, improves the performance of processor reads of external space, saves time for the physical layer, and allows the physical layer flow to be optimized.
Further, as shown in FIG. 2, the transport instruction includes information about the space to be read, and the coprocessor 10 further includes:
a storage area 102 configured to store the correspondence between the information of each register and peripheral space and a channel;
the read circuit 101 is specifically configured to take, according to the correspondence, the channel corresponding to the information of the space to be read as the transport channel.
In this embodiment, information about the register or peripheral space to be read is given; it includes the address range of the register or peripheral space, the amount of resources, and so on. Based on this information, the coprocessor reads the register or peripheral space directly, without going through any other component, and writes the data it reads back into the primary or secondary storage space of the main processor that was allocated in advance. Both the primary and the secondary storage space are tightly coupled to the main processor. Within the storage system, access to the primary storage space has the highest performance, followed by access to the secondary storage space. When the processor actually needs the data of these registers or peripheral space, it can directly access the primary or secondary storage region that has already been written. This greatly reduces the time the processor spends reading external registers or peripheral space, improves the performance of processor reads of external space, saves time for the physical layer, and allows the physical layer flow to be optimized. At the same time, the processor does not have to wait when it needs to process the read data, which saves power. In addition, the read circuit may provide an unlimited number of parallel channels to handle parallel processes.
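The prefetch behaviour described in this paragraph can be sketched in Python as follows. The `Peripheral` class, its access counter, the word-addressed layout, and the list standing in for the pre-allocated storage region are hypothetical modelling choices.

```python
class Peripheral:
    """Toy peripheral space that counts how often it is read."""

    def __init__(self, words):
        self.words = words  # dict: address -> value
        self.reads = 0

    def read(self, addr):
        self.reads += 1
        return self.words[addr]


def prefetch(peripheral, base, amount, tcm_region):
    """Coprocessor side: copy `amount` words starting at `base`
    into the pre-allocated TCM region."""
    for offset in range(amount):
        tcm_region[offset] = peripheral.read(base + offset)


def processor_read(tcm_region, offset):
    """Main-processor side: read prefetched data from the TCM only."""
    return tcm_region[offset]
```

After `prefetch` has run, every `processor_read` is served from the TCM region, so the peripheral's access counter stays unchanged however often the processor reads the data.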
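Further, as shown in FIG. 2, the coprocessor 10 further includes:
a write circuit 103, configured to: receive a write request and write data sent by the processor, where the write request requests that the write data be written to the space to be written; send a write response to the processor; and send the write request and the write data to the space to be written.
Further, the buffer depth of the write circuit and the buffer depth of the read circuit are preset.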
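The write circuit and its preset buffer depth can be sketched as a small buffered-write model. Two points here are assumptions rather than statements from the patent: that the write response is returned before the data reaches the target space (the order in which the operations are listed suggests this, but the text does not say so outright), and that a full buffer yields a `"RETRY"` response, which is a hypothetical behavior added so the depth limit is visible.

```python
from collections import deque


class WriteCircuit:
    """Toy model of write circuit 103 with a preset buffer depth."""

    def __init__(self, target_space, depth):
        self.target_space = target_space  # dict standing in for the space to be written
        self.depth = depth                # preset buffer depth
        self.buffer = deque()

    def write(self, address, data):
        """Accept a write request and return the response for the processor."""
        if len(self.buffer) >= self.depth:
            return "RETRY"  # hypothetical response when the buffer is full
        self.buffer.append((address, data))
        return "OK"  # the processor can continue without waiting for the target

    def drain(self):
        """Forward buffered requests and data to the space to be written."""
        while self.buffer:
            address, data = self.buffer.popleft()
            self.target_space[address] = data


registers = {}
wc = WriteCircuit(registers, depth=2)
responses = [wc.write(0x10, 0xAA), wc.write(0x14, 0xBB), wc.write(0x18, 0xCC)]
wc.drain()
```

With a depth of 2, the first two writes are acknowledged and later forwarded, while the third is refused until the buffer has been drained.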
As shown in FIG. 3, in a typical SOC architecture, a tightly coupled memory serving as the level-2 cache is integrated outside the main processor, a matrix bridge is integrated outside the level-2 cache, and the external registers or peripheral space are attached to the slave ports of the matrix bridge. Generally, external registers and peripheral space are not cacheable for the processor, yet the processor can access them for reads and writes only through the level-2 cache. This architecture therefore places high demands on bus capability and on the processor's input/output (I/O) access performance.
With the coprocessor provided by this embodiment, external registers or peripheral space no longer need to be read or written through the level-2 cache. Instead, under the coprocessor's coordination alone, data is read from and written to external registers or external space directly, which improves the processor's performance in reading external space, saves time for the physical layer, and allows the physical layer flow to be optimized.
Embodiment 2
An embodiment of the present invention provides a data reading method. As shown in FIG. 4, the method is applied to a coprocessor and includes:
Step 201: Receive a transport instruction sent by the space to be read.
Step 202: Determine, according to the transport instruction, the transport channel that is to perform the transport task.
Step 203: Transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel.
Step 204: When the transport task is completed, send a read notification to the main processor, so that the main processor reads the data of the space to be read.
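Steps 201 to 204 can be traced end to end in a short software model. This is a minimal sketch, assuming a dictionary-based instruction format and a callback as the read notification; neither is specified by the patent, and the class, the space names, and the field names are hypothetical.

```python
class Coprocessor:
    """Toy model of the four-step read method (steps 201 to 204)."""

    def __init__(self, channel_table, spaces, tcm, notify):
        self.channel_table = channel_table  # space name -> channel id
        self.spaces = spaces                # space name -> list of words
        self.tcm = tcm                      # dict standing in for the main processor's TCM
        self.notify = notify                # callback standing in for the read notification

    def handle(self, instruction):
        # Step 201: receive the transport instruction.
        space = instruction["space"]
        tcm_base = instruction["tcm_base"]
        # Step 202: determine the transport channel for this space.
        # In this toy model the channel only labels the transfer.
        channel = self.channel_table[space]
        # Step 203: move the data into the tightly coupled memory.
        words = self.spaces[space]
        for i, word in enumerate(words):
            self.tcm[tcm_base + i] = word
        # Step 204: notify the main processor that the data is ready.
        self.notify(tcm_base, len(words))
        return channel


events = []
tcm = {}
cop = Coprocessor(
    channel_table={"phy_regs": 0, "phy_data": 1},
    spaces={"phy_regs": [1, 2], "phy_data": [5, 6, 7]},
    tcm=tcm,
    notify=lambda base, count: events.append((base, count)),
)
used_channel = cop.handle({"space": "phy_data", "tcm_base": 0x40})

# Main-processor side: act on the notification and read from the TCM.
base, count = events[0]
words = [tcm[base + i] for i in range(count)]
```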
In this way, by exploiting the tight coupling between the main processor and its tightly coupled memory, the main processor can directly read the data of the space to be read that is stored in its tightly coupled memory. This greatly reduces the time the processor spends reading external registers or peripheral space, improves the processor's performance in reading external space, saves time for the physical layer, and allows the physical layer flow to be optimized.
Further, the transport instruction includes information about the space to be read, and the method further includes: setting the correspondence between the information of each register and peripheral space and the channels;
Correspondingly, step 202 may specifically include:
determining, according to the correspondence, the channel that corresponds to the information of the space to be read, and using that channel as the transport channel.
Further, the method further includes:
receiving a write request and write data sent by the processor, where the write request requests that the write data be written to the space to be written; sending a write response to the processor; and sending the write request and the write data to the space to be written.
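Embodiment 3
An embodiment of the present invention provides a coprocessor 30. As shown in FIG. 5, the coprocessor 30 includes:
a receiving module 301, configured to receive a transport instruction sent by the space to be read;
a determining module 302, configured to determine, according to the transport instruction, the transport channel that is to perform the transport task;
a transport module 303, configured to transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel;
a sending module 304, configured to send, when the transport task is completed, a read notification to the main processor, so that the main processor reads the data of the space to be read.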
In this way, by exploiting the tight coupling between the main processor and its tightly coupled memory, the main processor can directly read the data of the space to be read that is stored in its tightly coupled memory. This greatly reduces the time the processor spends reading external registers or peripheral space, improves the processor's performance in reading external space, saves time for the physical layer, and allows the physical layer flow to be optimized.
Further, as shown in FIG. 6, the transport instruction includes information about the space to be read, and the coprocessor 30 further includes: a setting module 305, configured to set the correspondence between the information of each register and peripheral space and the channels;
Correspondingly, the determining module 302 is specifically configured to:
determine, according to the correspondence, the channel that corresponds to the information of the space to be read, and use that channel as the transport channel.
Further, the determining module 302 is specifically configured to:
receive a write request and write data sent by the processor, where the write request requests that the write data be written to the space to be written; send a write response to the processor; and send the write request and the write data to the space to be written.
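An embodiment of the present invention provides a processor system 40. As shown in FIG. 7, the system includes:
a coprocessor 10 and a main processor 20.
The coprocessor 10 is configured to: receive a transport instruction sent by the space to be read; determine, according to the transport instruction, the transport channel that is to perform the transport task; transport the data of the space to be read into the tightly coupled memory of the main processor through the transport channel; and, when the transport task is completed, send a read notification to the main processor.
The main processor 20 is configured to receive the read notification and read the data of the space to be read.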
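The division of labor between the coprocessor 10 and the main processor 20 can be mimicked with two threads and a queue. This is a loose software analogy under stated assumptions, not the hardware mechanism: the queue stands in for the read notification, the shared dictionary stands in for the tightly coupled memory, and the function names and data values are hypothetical.

```python
import queue
import threading

tcm = {}                        # dict standing in for the tightly coupled memory
notifications = queue.Queue()   # stands in for the read notification path
peripheral_words = [7, 8, 9]    # assumed contents of the space to be read


def coprocessor_10(tcm_base):
    """Transport the data into the TCM, then send the read notification."""
    for i, word in enumerate(peripheral_words):
        tcm[tcm_base + i] = word
    notifications.put((tcm_base, len(peripheral_words)))


def main_processor_20():
    """Wait for the read notification, then read the data from the TCM."""
    base, count = notifications.get(timeout=5)
    return [tcm[base + i] for i in range(count)]


worker = threading.Thread(target=coprocessor_10, args=(0x200,))
worker.start()
result = main_processor_20()
worker.join()
```

Because the notification is sent only after the last word has been stored, the main processor never observes a partially filled region in this model.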
The processor system in the embodiments of the present invention is formed by a main processor plus a coprocessor. When coprocessor-assisted read and write operations are all completed by the coprocessor alone and the operation data is kept inside the coprocessor, the main processor has to fetch the data from the coprocessor's memory whenever it needs it; access by the main processor to the coprocessor's memory incurs latency, which degrades performance, and the operations are performed only when the main processor needs to read or write the peripheral space. As shown in FIG. 8, the circuit provided by this embodiment makes effective use of the main processor's efficient access to the tightly coupled memory and breaks with the conventional approach in which the main processor or the coprocessor accesses the peripheral space directly. Software analyzes in advance which peripheral data the processor will need, and through a combination of software and hardware the peripheral data is obtained ahead of time, which improves the processor's performance in reading and writing peripheral space. In addition, the read circuit device and the write circuit device can be used flexibly. This architecture can flexibly improve the processor's performance in accessing external registers and peripheral space, and provides a higher-performance guarantee for the interaction between the processor and the physical layer.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, optical storage, and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
The above descriptions are merely preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.
Claims (10)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201610485982.7A, published as CN107544937A (en) | 2016-06-27 | 2016-06-27 | A kind of coprocessor, method for writing data and processor |
| CN201610485982.7 | 2016-06-27 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018000765A1 (en) | 2018-01-04 |
Family
ID=60786545
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2016/110754 Ceased WO2018000765A1 (en) | 2016-06-27 | 2016-12-19 | Co-processor, data reading method, processor system and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN107544937A (en) |
| WO (1) | WO2018000765A1 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113032013A (en) * | 2021-01-29 | 2021-06-25 | 成都商汤科技有限公司 | Data transmission method, chip, equipment and storage medium |
| CN113765935A (en) * | 2021-09-17 | 2021-12-07 | 展讯通信(深圳)有限公司 | Communication method and device, readable storage medium, application processor and terminal |
| CN114116593A (en) * | 2021-11-22 | 2022-03-01 | 珠海泰为电子有限公司 | Control method, device, storage medium and electronic device for low-frequency high-performance chip |
| CN114237718A (en) * | 2021-12-30 | 2022-03-25 | 海光信息技术股份有限公司 | Instruction processing method and configuration method, device and related equipment |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109408450B (en) * | 2018-09-27 | 2021-03-30 | 中兴飞流信息科技有限公司 | Data processing method, system, co-processing device and main processing device |
| CN110188067B (en) * | 2019-07-15 | 2023-04-25 | 北京一流科技有限公司 | Coprocessor and data processing acceleration method thereof |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100064122A1 (en) * | 2008-09-09 | 2010-03-11 | Via Technologies, Inc. | Fast string moves |
| CN202548823U (en) * | 2012-02-10 | 2012-11-21 | 上海算芯微电子有限公司 | Non-blocking coprocessor interface system |
| CN103473188A (en) * | 2013-09-12 | 2013-12-25 | 华为技术有限公司 | Method, device and system for data interaction between digital signal processor (DSP) and external memory |
| CN103793342A (en) * | 2012-11-02 | 2014-05-14 | 中兴通讯股份有限公司 | Multichannel direct memory access (DMA) controller |
| CN104298639A (en) * | 2014-09-23 | 2015-01-21 | 天津国芯科技有限公司 | Embedded connecting method for host processor and multiple coprocessors and connecting interface |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2103767A1 (en) * | 1992-11-09 | 1994-05-10 | Ravi K. Arimilli | Cache architecture for high speed memory-to-i/o data transfers |
| US6925534B2 (en) * | 2001-12-31 | 2005-08-02 | Intel Corporation | Distributed memory module cache prefetch |
| US7360027B2 (en) * | 2004-10-15 | 2008-04-15 | Intel Corporation | Method and apparatus for initiating CPU data prefetches by an external agent |
| CN102169428A (en) * | 2010-06-22 | 2011-08-31 | 上海盈方微电子有限公司 | Dynamic configurable instruction access accelerator |
| CN103902469B (en) * | 2012-12-25 | 2017-03-15 | 华为技术有限公司 | A kind of method and system of data pre-fetching |
| US9251083B2 (en) * | 2013-03-11 | 2016-02-02 | Via Technologies, Inc. | Communicating prefetchers in a microprocessor |
| CN104808967B (en) * | 2015-05-07 | 2017-07-04 | 盐城工学院 | A kind of dynamic data pre-fetching system of processor |
2016
- 2016-06-27: CN application CN201610485982.7A, published as CN107544937A (status: not active, Withdrawn)
- 2016-12-19: WO application PCT/CN2016/110754, published as WO2018000765A1 (status: not active, Ceased)
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100064122A1 (en) * | 2008-09-09 | 2010-03-11 | Via Technologies, Inc. | Fast string moves |
| CN202548823U (en) * | 2012-02-10 | 2012-11-21 | 上海算芯微电子有限公司 | Non-blocking coprocessor interface system |
| CN103793342A (en) * | 2012-11-02 | 2014-05-14 | 中兴通讯股份有限公司 | Multichannel direct memory access (DMA) controller |
| CN103473188A (en) * | 2013-09-12 | 2013-12-25 | 华为技术有限公司 | Method, device and system for data interaction between digital signal processor (DSP) and external memory |
| CN104298639A (en) * | 2014-09-23 | 2015-01-21 | 天津国芯科技有限公司 | Embedded connecting method for host processor and multiple coprocessors and connecting interface |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113032013A (en) * | 2021-01-29 | 2021-06-25 | 成都商汤科技有限公司 | Data transmission method, chip, equipment and storage medium |
| CN113765935A (en) * | 2021-09-17 | 2021-12-07 | 展讯通信(深圳)有限公司 | Communication method and device, readable storage medium, application processor and terminal |
| CN113765935B (en) * | 2021-09-17 | 2023-09-12 | 展讯通信(深圳)有限公司 | Communication method and device, readable storage medium, application processor and terminal |
| CN114116593A (en) * | 2021-11-22 | 2022-03-01 | 珠海泰为电子有限公司 | Control method, device, storage medium and electronic device for low-frequency high-performance chip |
| CN114237718A (en) * | 2021-12-30 | 2022-03-25 | 海光信息技术股份有限公司 | Instruction processing method and configuration method, device and related equipment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN107544937A (en) | 2018-01-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| TWI397009B (en) | Data processing apparatus of basic input/output system | |
| WO2018000765A1 (en) | Co-processor, data reading method, processor system and storage medium | |
| CN101952808B (en) | Extended Utilization Area of Storage Devices | |
| CN112214240B (en) | Host output input command execution device and method and computer readable storage medium | |
| US11163710B2 (en) | Information processor with tightly coupled smart memory unit | |
| US7689734B2 (en) | Method for toggling non-adjacent channel identifiers during DMA double buffering operations | |
| US9690720B2 (en) | Providing command trapping using a request filter circuit in an input/output virtualization (IOV) host controller (HC) (IOV-HC) of a flash-memory-based storage device | |
| US20130019050A1 (en) | Flexible flash commands | |
| CN102279712A (en) | Storage control method, system and device applied to network storage system | |
| CN119885247B (en) | Data query method, system, device, medium and program product | |
| CN108304334A (en) | Application processor and integrated circuit including interrupt control unit | |
| US9274860B2 (en) | Multi-processor device and inter-process communication method thereof | |
| KR20180010951A (en) | Method of achieving low write latency in a data starage system | |
| TW201835757A (en) | Methods for garbage collection and apparatuses using the same | |
| CN115481072A (en) | Inter-core data transmission method, multi-core chip and machine-readable storage medium | |
| CN104461730A (en) | Virtual resource allocation method and device | |
| US11099762B2 (en) | Multi host controller and semiconductor device including the same | |
| CN101720040B (en) | Video decoding optimizing method fusing high speed memory and DMA channel | |
| US20070198879A1 (en) | Method, system, and medium for providing interprocessor data communication | |
| US20180336147A1 (en) | Application processor including command controller and integrated circuit including the same | |
| CN118193144A (en) | Storage medium, method and device for executing host write command | |
| CN110383259B (en) | Computing processing device and information processing system | |
| CN107807888B (en) | Data prefetching system and method for SOC architecture | |
| JP2016026345A (en) | Temporary stop of memory operation for shortening reading standby time in memory array | |
| KR102882975B1 (en) | Command processor and method for performing auto fetch thereby |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16907152; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16907152; Country of ref document: EP; Kind code of ref document: A1 |