Disclosure of Invention
To overcome the above-mentioned drawbacks of the prior art, an embodiment of the present invention provides an FPGA-based out-of-order receiving management method for RapidIO Message transactions, which solves the problems set forth in the background art by simultaneously receiving and ordering Message data from a plurality of different source nodes.
In order to achieve the above purpose, the present invention provides the following technical solutions:
An FPGA-based out-of-order receiving management method for RapidIO Message transactions comprises the following steps:
Step S1, receiving a RapidIO transaction packet into a RapidIO driving module, performing cross-clock-domain processing on the RapidIO interface, and caching the Message transaction packet;
Step S2, performing protocol analysis on the Message transaction packet, processing the parsed Message transaction data to obtain Message framing data, triggering a Message response request, and assembling and outputting a response packet according to the Message response request;
Step S3, arbitrating the SRAM buffer channels, and writing the Message framing data into the corresponding channel;
Step S4, mapping the SRAM buffer address one-to-one to the fragment offset of the Message framing, and writing the framing data into the corresponding space of the SRAM module to complete automatic ordering of the Message framing;
wherein the SRAM cache is a high-performance memory for temporarily storing data, RapidIO is a high-performance serial communication protocol for embedded systems and high-performance computing environments, a Message transaction is a RapidIO message-passing transaction, and the IP interface is an interface for data communication between a device and a network and is used for information interaction operations.
In a preferred embodiment, the FPGA-based out-of-order receiving management method for RapidIO Message transactions mainly involves three modules, namely a RapidIO IP interface module, a RapidIO driving module and a multi-channel ordering module.
The RapidIO IP interface module is used for completing the bottom-layer sending and receiving of RapidIO transaction packets;
The RapidIO driving module is used for completing protocol analysis of RapidIO transaction packets and responding to request packets that require a response;
The multi-channel ordering module is used for ordering and combining Message transaction data from different source nodes and outputting the complete Message frame data once ordering and combining are complete.
In a preferred embodiment, the RapidIO driving module comprises three modules, namely an AXI FIFO buffer module, a Message request transaction analysis module and a Message response transaction packet assembly module;
the RapidIO driving module performs protocol analysis on received Message transactions and responds to Message request transactions;
the AXI FIFO buffer module is used for cross-clock-domain processing of the RapidIO interface and for buffering Message transaction packets;
the Message request transaction analysis module is used for performing protocol analysis on Message request transactions and outputting Message framing parameters;
the Message response transaction packet assembly module is used for responding to Message request transactions, assembling response packets according to the Message response frame format, and sending them.
In a preferred embodiment, in step S1, the AXI FIFO buffer module performs the clock domain crossing processing on the RapidIO interface as follows:
clock domain identification, namely acquiring the data and frequency of each clock domain, and determining the data and interfaces that must cross clock domains;
clock domain crossing, namely using an AXI FIFO with asynchronous, independent clocks to connect the data and clock interfaces of the different clock domains, while capturing data with a double-edge trigger at the receiving end to reduce the occurrence of timing problems in data sampling;
data packaging, namely determining the header and payload of the data packet, and controlling the data sending and receiving process with an existing state machine;
data synchronization, namely using a synchronizer to ensure that the received data is stable in the new clock domain;
data caching, namely caching the data packet in the new clock domain.
In a preferred embodiment, in step S2, the Message request transaction parsing module performs protocol parsing on the Message transaction packet and processes the parsed Message transaction data as follows:
obtaining the RapidIO transaction packet, namely reading the original RapidIO transaction packet through the Serial RapidIO Gen IP interface;
parsing the payload, namely extracting the payload data according to the header information;
checking the tail information, namely verifying that the RapidIO transaction packet was not corrupted during transmission;
framing processing, namely processing the parsed Message transaction data to obtain framing parameters, wherein the framing parameters comprise:
msg_id, msg_len, msg_seg, msg_data, msg_data_val and msg_data_last, which are respectively the Message source ID, the number of Message fragments, the Message fragment offset, the Message data, the Message data valid flag and the Message data end flag.
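As an illustration only, the framing parameters listed above can be modelled in software as a small record. The sketch below is in Python rather than HDL; the class name `MsgFraming`, the helper `is_valid_framing`, and the bounds of 16 fragments of at most 256 bytes (taken from the SRAM layout described for step S4) are assumptions of this sketch. The handshake signal msg_data_val is omitted because a software record is always valid once constructed, and msg_len is treated as a fragment count because the method compares it with the fragment counter msg_rcnt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MsgFraming:
    """One Message framing as produced by the request analysis step (hypothetical model)."""
    msg_id: int          # Message source ID
    msg_len: int         # number of fragments in the complete Message
    msg_seg: int         # fragment offset of this framing within the Message
    msg_data: bytes      # payload carried by this framing
    msg_data_last: bool  # end flag for the framing data


def is_valid_framing(f: MsgFraming, max_segs: int = 16, max_payload: int = 256) -> bool:
    """Check the framing against the assumed limits (16 fragments x 256 bytes)."""
    return (
        1 <= f.msg_len <= max_segs
        and 0 <= f.msg_seg < f.msg_len
        and 0 < len(f.msg_data) <= max_payload
    )
```

A framing with msg_len = 4 and msg_seg = 2 passes the check, while one with msg_seg = 4 is rejected because the offset lies outside the announced fragment count.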
In a preferred embodiment, in step S2, each framing parameter corresponds to framing data in the Message transaction packet, and the Message response request is triggered after framing processing;
after receiving the Message response request, the Message response transaction packet assembly module generates the corresponding Message response transaction packet and sends it to the AXI FIFO buffer module, which buffers the packet, carries it across clock domains, and outputs it to the Serial RapidIO Gen IP interface for sending.
In a preferred embodiment, in step S3, the multi-channel sorting module includes a FIFO buffer module, a channel arbitration module, an SRAM buffer control module, an SRAM buffer module, and a Message output control module;
The FIFO buffer module is used for buffering the Message frames;
The channel arbitration module is used for arbitrating the SRAM buffer channels and routing the Message framing data into the corresponding SRAM buffer channel, and the arbitration steps are as follows:
after new Message framing data is received, the occupy_st registers of the SRAM buffer module are polled; if the occupy_id of an occupied SRAM buffer channel equals the msg_id of the Message framing, the Message framing data is written into that SRAM buffer channel; otherwise a new SRAM buffer channel is applied for, the msg_id of the Message framing is assigned to the occupy_id of that channel, and the Message framing data is written into the new buffer channel.
In a preferred embodiment, in step S4, the SRAM buffer control module is configured to write Message frame data of the same Message source ID to a corresponding address of the SRAM module, and allocate a plurality of parameters for state management of the SRAM module, including:
occupy_id, occupy_st, msg_rlen, msg_rdata and msg_rcnt, which are respectively the SRAM occupancy ID, the SRAM occupancy state, the actual length of the complete Message frame, the complete Message frame data and the fragment count of the complete Message frame;
occupy_id is used for querying whether Message framing data from the same source node has been written into the corresponding channel;
occupy_st is used for querying the current occupancy state of the SRAM, the idle state being marked as occupy_st[1:0] = 2'b00;
the occupied state is marked as occupy_st[1:0] = 2'b01;
the packet completion state is marked as occupy_st[1:0] = 2'b10;
the Message output state is marked as occupy_st[1:0] = 2'b11;
msg_rcnt is used for counting the fragments of the complete Message frame, and its count value is increased by 1 each time one Message fragment is written.
In a preferred embodiment, in step S4, each time storage of a Message framing is completed, the Message framing count msg_rcnt is increased by 1; when msg_rcnt is detected to be equal to the number of fragments msg_len of the Message, the SRAM occupancy state is set to occupy_st[1:0] = 2'b10 and output of the complete Message frame is awaited; after the complete Message frame has been output, the framing count msg_rcnt is cleared and the SRAM occupancy state is set to occupy_st[1:0] = 2'b00;
if an occupied SRAM buffer channel has not been fully ordered into a complete packet within the preset time, a timeout interrupt is generated, the Message framing count msg_rcnt is cleared, and the SRAM occupancy state is set to occupy_st[1:0] = 2'b00.
The FPGA-based out-of-order receiving management method for RapidIO Message transactions of the present invention has the following technical effects and advantages:
In the present invention, a RapidIO transaction packet is received and passed into the RapidIO driving module, where cross-clock-domain processing is performed on the RapidIO interface so that data transmitted between different clock domains is synchronized. The Message transaction packet is then buffered and subjected to protocol analysis to obtain the Message framing data, which is output and buffered. The SRAM buffer channels are then arbitrated to obtain the matching channel, and the Message framing data is written into it. The SRAM buffer address is mapped to the fragment offset of the Message framing, and the framing data is written into the corresponding space of the SRAM module, so that automatic ordering of the Message framing is completed and communication faults caused by out-of-order retransmission of Message transaction packets due to competition among source nodes are reduced.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In one embodiment, as shown in fig. 1, an FPGA-based out-of-order receiving management method for RapidIO Message transactions includes the following steps:
Step S1, receiving a RapidIO transaction packet into a RapidIO driving module, performing cross-clock-domain processing on the RapidIO interface, and caching the Message transaction packet;
Step S2, performing protocol analysis on the Message transaction packet, processing the parsed Message transaction data to obtain Message framing data, triggering a Message response request, and assembling and outputting a response packet according to the Message response request;
Step S3, arbitrating the SRAM buffer channels, and writing the Message framing data into the corresponding channel;
Step S4, mapping the SRAM buffer address one-to-one to the fragment offset of the Message framing, and writing the framing data into the corresponding space of the SRAM module to complete automatic ordering of the Message framing.
The specific implementation is as follows:
The FPGA-based out-of-order receiving management method for RapidIO Message transactions mainly involves three modules, namely a RapidIO IP interface module, a RapidIO driving module and a multi-channel ordering module, as shown in fig. 2.
The RapidIO IP interface module is mainly used for completing the bottom-layer sending and receiving of RapidIO transaction packets;
The RapidIO driving module is used for completing protocol analysis of RapidIO transaction packets and responding to request packets that require a response;
The multi-channel ordering module is used for ordering and combining Message transaction data from different source nodes and outputting the complete Message frame data once ordering and combining are complete.
It should be noted that an FPGA is an integrated circuit that a user can configure after manufacturing to implement highly flexible digital circuit designs; RapidIO is a high-performance serial communication protocol used in embedded systems and high-performance computing environments; a Message transaction is a RapidIO message-passing transaction; and the IP interface is an interface for data communication between a device and a network, used for information interaction operations.
In step S1, when a RapidIO transaction packet is received, the exchange of RapidIO transaction packets, that is, receiving incoming RapidIO transaction packets and sending out processed data, is implemented by calling the Serial RapidIO Gen IP inside the FPGA.
The RapidIO driving module mainly comprises three modules, namely an AXI FIFO buffer module, a Message request transaction analysis module and a Message response transaction packet assembly module, as shown in fig. 3. The RapidIO driving module performs protocol analysis on received Message transactions and responds to Message request transactions.
The AXI FIFO buffer module is used for cross-clock-domain processing of the RapidIO interface and for buffering Message transaction packets; the Message request transaction analysis module is used for performing protocol analysis on Message request transactions and outputting Message framing parameters; and the Message response transaction packet assembly module is used for responding to Message request transactions, assembling response packets according to the Message response frame format, and sending them.
The specific steps of the AXI FIFO buffer module for carrying out cross-clock domain processing on the RapidIO interface are as follows:
clock domain identification, namely acquiring the data and frequency of each clock domain, and determining the data and interfaces that must cross clock domains, namely the Message transaction packet and the Serial RapidIO Gen IP interface in this example;
clock domain crossing, namely using an AXI FIFO with asynchronous, independent clocks to connect the data and clock interfaces of the different clock domains, while capturing data with a double-edge trigger at the receiving end to reduce the occurrence of timing problems in data sampling;
data packaging, namely determining the header and payload of the data packet, and controlling the data sending and receiving process with an existing state machine;
data synchronization, namely using a synchronizer to ensure that the received data is stable in the new clock domain;
data caching, namely caching the data packet in the new clock domain.
It should be noted that, in digital circuit design, a clock domain is the part of a circuit driven by a specific clock signal. The circuits within one clock domain operate synchronously on the same clock, but different clock domains may run at different clock frequencies and phases, which can cause data to become inconsistent or be lost, so clock domain crossings must be handled explicitly. The synchronizer used for data synchronization is not unique; for example, a dual D flip-flop synchronizer may be used.
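For illustration, the behaviour of a dual D flip-flop synchronizer can be modelled in a few lines of Python. The description does not state how the AXI FIFO is built internally; the Gray-code helpers below show a technique commonly used in asynchronous FIFOs (passing pointers across domains in Gray code so that only one bit changes per increment) and are an assumption of this sketch, not a statement about the embodiment. All names are hypothetical.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to Gray code (adjacent counts differ in one bit)."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-coded value back to its binary count."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


class TwoStageSynchronizer:
    """Behavioural model of two D flip-flops in series, clocked by the destination domain."""

    def __init__(self, reset_value: int = 0):
        self.stage1 = reset_value
        self.stage2 = reset_value

    def clock(self, async_in: int) -> int:
        # On each destination clock edge the value shifts one stage;
        # the output is the second stage after the edge.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2
```

In this model an input change appears at the output on the second destination clock edge, which is the settling time the second flip-flop provides.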
In step S2, after the AXI FIFO buffer module has performed cross-clock-domain processing on the RapidIO interface, the Message request transaction analysis module performs protocol analysis on the Message transaction packet and processes the parsed Message transaction data in the following specific steps:
obtaining the RapidIO transaction packet, namely reading the original RapidIO transaction packet through the Serial RapidIO Gen IP;
extracting the header information, namely parsing the header of the transaction packet, extracting the key fields and determining the type of the transaction packet; once the packet is identified as a Message transaction packet, the length of the payload is known;
parsing the payload, namely extracting the payload data according to the header information;
checking the tail information, namely verifying that the RapidIO transaction packet was not corrupted during transmission;
framing processing, namely processing the parsed Message transaction to obtain framing parameters, wherein the framing parameters comprise:
msg_id, msg_len, msg_seg, msg_data, msg_data_val and msg_data_last, which are respectively the Message source ID, the number of Message fragments, the Message fragment offset, the Message data, the Message data valid flag and the Message data end flag;
each framing parameter corresponds to framing data in the Message transaction packet, and a Message response request is triggered after framing processing.
After receiving the Message response request, the Message response transaction packet assembly module generates the corresponding Message response transaction packet and sends it to the AXI FIFO buffer module, which buffers the packet, carries it across clock domains, and outputs it to the Serial RapidIO Gen IP interface for sending.
It should be noted that Serial RapidIO Gen IP is the preset IP interface in this example and is used to exchange RapidIO transaction packets. The parsing algorithm and the checking algorithm are not unique; for example, the parsing algorithm may use recursive descent parsing and the checking algorithm may use a cyclic redundancy check.
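As a minimal sketch of the cyclic redundancy check mentioned as one possible checking algorithm, the following Python code computes a 16-bit CRC with the CCITT polynomial 0x1021 and initial value 0xFFFF and verifies a packet whose last two bytes carry the checksum. The polynomial, the initial value, the big-endian tail layout and the function names are assumptions made for illustration; the description does not specify them, and the exact bytes covered by the check in a real RapidIO packet are not modelled here.

```python
def crc16_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial x^16 + x^12 + x^5 + 1 (0x1021), MSB first."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def tail_check_ok(packet: bytes) -> bool:
    """Assumed layout: body followed by a 2-byte big-endian CRC of the body."""
    if len(packet) < 3:
        return False
    body, tail = packet[:-2], packet[-2:]
    return crc16_ccitt(body) == int.from_bytes(tail, "big")
```

A packet built as `body + crc16_ccitt(body).to_bytes(2, "big")` passes `tail_check_ok`, while flipping any single bit of the body makes the check fail.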
In step S3, the multi-channel ordering module includes a FIFO buffer module, a channel arbitration module, an SRAM buffer control module, an SRAM buffer module and a Message output control module, as shown in fig. 4. The multi-channel ordering module orders and combines the Message transaction data of different source nodes; that is, the Message framing data output by the RapidIO driving module is ordered according to the framing parameters and then integrated and output as complete Message frame data.
The FIFO buffer module is used for buffering the Message framing;
The channel arbitration module is used for arbitrating the SRAM buffer channels and routing the Message framing data into the corresponding SRAM buffer channel; the flow is shown in fig. 5, and the arbitration steps are as follows:
after new Message framing data is received, the occupy_st registers of the SRAM buffer module are polled; if the occupy_id of an occupied SRAM buffer channel equals the msg_id of the Message framing, the Message framing data is written into that SRAM buffer channel; otherwise a new SRAM buffer channel is applied for, the msg_id of the Message framing is assigned to the occupy_id of that channel, and the Message framing data is written into the new buffer channel.
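The arbitration steps above can be sketched in Python as follows. This is a software model under stated assumptions, not the FPGA logic itself: the class `Channel`, the function `arbitrate`, and returning `None` when no channel is free are hypothetical choices of this sketch, since the description does not say what happens when every channel is occupied. The state constants follow the occupy_st encoding 2'b00 (idle) and 2'b01 (occupied) used in this description.

```python
from dataclasses import dataclass
from typing import List, Optional

IDLE = 0b00      # occupy_st encoding for an idle channel
OCCUPIED = 0b01  # occupy_st encoding for a channel collecting fragments


@dataclass
class Channel:
    occupy_st: int = IDLE
    occupy_id: Optional[int] = None


def arbitrate(channels: List[Channel], msg_id: int) -> Optional[int]:
    """Return the index of the channel that should receive this framing."""
    # Poll for an occupied channel that already belongs to this source ID.
    for index, ch in enumerate(channels):
        if ch.occupy_st == OCCUPIED and ch.occupy_id == msg_id:
            return index
    # Otherwise apply for a new channel and record the source ID in occupy_id.
    for index, ch in enumerate(channels):
        if ch.occupy_st == IDLE:
            ch.occupy_st = OCCUPIED
            ch.occupy_id = msg_id
            return index
    return None  # assumption: no free channel is reported to the caller
```

With two channels, framings from source 5 and source 9 are assigned channels 0 and 1, and a later framing from source 5 is routed back to channel 0.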
The SRAM buffer control module is configured to write Message framing data of the same Message source ID to the corresponding address of the SRAM module, and allocates a plurality of parameters for state management of the SRAM module, as shown in fig. 6, including:
occupy_id, occupy_st, msg_rlen, msg_rdata and msg_rcnt, which are respectively the SRAM occupancy ID, the SRAM occupancy state, the actual length of the complete Message frame, the complete Message frame data and the fragment count of the complete Message frame.
occupy_id is used for querying whether Message framing data from the same source node has been written into the corresponding channel;
occupy_st is used for querying the current occupancy state of the SRAM: the idle state is marked as occupy_st[1:0] = 2'b00, the occupied state as occupy_st[1:0] = 2'b01, the packet completion state as occupy_st[1:0] = 2'b10, and the Message output state as occupy_st[1:0] = 2'b11;
msg_rcnt is used for counting the fragments of the complete Message frame, and its count value is increased by 1 each time one Message fragment is written.
By arbitrating the SRAM buffer channels, the channel arbitration module ensures that Message framing data from the same source ID is written into the same SRAM buffer channel, which reduces resource contention and avoids conflicts over shared resources when a plurality of Message transactions access the SRAM buffer at the same time. Counting the fragments of the complete Message frame with msg_rcnt determines the occupancy status of the Message transactions, which can improve concurrency and lays a foundation for subsequent high-priority transaction processing. With the fragment offset, data that exceeds the capacity of a single Message transaction can be divided into a plurality of fragments for transmission, the fragment offset marking the relative position of each fragment within the whole data, so that larger data can be accommodated.
It should be noted that the SRAM buffer is a high-performance memory for temporarily storing data and accelerating data access and processing. In fig. 6, sram_addr[11:8] and sram_addr[7:0] represent the address bus of the SRAM: bits 11:8 are the upper 4 address bits and bits 7:0 are the lower 8 address bits, which together form the complete SRAM address. The sram_wdata variable represents the data written to the SRAM, the sram_wen variable represents the write enable signal that controls write operations to the SRAM, and the sram_rdata variable represents the data read from the SRAM.
In step S4, the SRAM buffer address is mapped to the fragment offset of the Message framing, and the framing data is written into the corresponding space of the SRAM module to complete automatic ordering of the Message framing, as shown in fig. 7. The SRAM buffer module is used for buffering the complete Message frame. Since the maximum length of a single Message supported by RapidIO is 4096 bytes and the maximum size of each packet is 256 bytes, a 4096-byte buffer space is allocated for the SRAM buffer, with address range sram_waddr[11:0]: 0x000-0xFFF. The upper 4 address bits sram_waddr[11:8] of the SRAM buffer are mapped one-to-one to the fragment offset msg_seg[3:0] of the framing, that is, the SRAM buffer is divided into 16 buffer spaces of 256 bytes each. During Message framing storage, the framing data is written into the corresponding space of the SRAM module according to the fragment offset, which completes the automatic ordering of the Message framing.
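The address mapping can be written as sram_waddr = (msg_seg << 8) | byte_offset, so that fragment offset k always lands in the 256-byte region starting at k × 256 regardless of arrival order. The Python sketch below models this with a byte array; it assumes a byte-addressed SRAM, and the function names `sram_waddr` and `store_fragments` are hypothetical.

```python
SEG_BYTES = 256  # maximum payload of one fragment
MAX_SEGS = 16    # 4096 bytes / 256 bytes per fragment


def sram_waddr(msg_seg: int, byte_offset: int) -> int:
    """Place msg_seg[3:0] in address bits 11:8 and the byte offset in bits 7:0."""
    if not 0 <= msg_seg < MAX_SEGS:
        raise ValueError("msg_seg out of range")
    if not 0 <= byte_offset < SEG_BYTES:
        raise ValueError("byte_offset out of range")
    return (msg_seg << 8) | byte_offset


def store_fragments(fragments) -> bytearray:
    """Write (msg_seg, payload) pairs, in any arrival order, into a 4096-byte buffer."""
    sram = bytearray(MAX_SEGS * SEG_BYTES)
    for msg_seg, payload in fragments:
        if len(payload) > SEG_BYTES:
            raise ValueError("payload exceeds one fragment space")
        base = sram_waddr(msg_seg, 0)
        sram[base:base + len(payload)] = payload
    return sram
```

Writing fragments in the order 2, 0, 1 still yields a buffer whose first 768 bytes hold fragments 0, 1 and 2 in sequence, because the write position depends only on the fragment offset.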
When the Message framing count msg_rcnt equals the number of fragments msg_len of the Message, the SRAM occupancy state is set to occupy_st[1:0] = 2'b10, that is, the packet completion state, and output of the complete Message frame is awaited. After the complete Message frame has been output, the framing count msg_rcnt is cleared and the SRAM occupancy state is set to occupy_st[1:0] = 2'b00, that is, the idle state.
If an occupied SRAM buffer channel has not been fully ordered into a complete packet within the preset time, a timeout interrupt is generated, the Message framing count msg_rcnt is cleared, and the SRAM occupancy state is set to occupy_st[1:0] = 2'b00.
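The completion and timeout handling of one channel can be sketched as a small state model. In the Python sketch below, the class `ChannelState`, its method names, and the choice to measure the timeout in ticks counted from the moment the channel is claimed are assumptions; the description only states that a preset time exists.

```python
IDLE = 0b00             # occupy_st = 2'b00
OCCUPIED = 0b01         # occupy_st = 2'b01
PACKET_COMPLETE = 0b10  # occupy_st = 2'b10


class ChannelState:
    """State of one SRAM buffer channel (hypothetical software model)."""

    def __init__(self, timeout_ticks: int):
        self.occupy_st = IDLE
        self.msg_rcnt = 0
        self.age = 0
        self.timeout_ticks = timeout_ticks

    def claim(self) -> None:
        """Mark the channel occupied when a new Message starts."""
        self.occupy_st = OCCUPIED
        self.msg_rcnt = 0
        self.age = 0

    def fragment_written(self, msg_len: int) -> None:
        """Count one stored fragment; flag completion when all have arrived."""
        self.msg_rcnt += 1
        if self.msg_rcnt == msg_len:
            self.occupy_st = PACKET_COMPLETE

    def release(self) -> None:
        """Clear the count and return to idle (after output or on timeout)."""
        self.msg_rcnt = 0
        self.age = 0
        self.occupy_st = IDLE

    def tick(self) -> bool:
        """Advance the timer; return True if a timeout released the channel."""
        if self.occupy_st != OCCUPIED:
            return False
        self.age += 1
        if self.age >= self.timeout_ticks:
            self.release()
            return True
        return False
```

A channel that receives both fragments of a two-fragment Message moves to the packet completion state, while a channel that receives only one of four fragments is returned to idle once the assumed tick limit is reached.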
The Message output control module is configured to output the complete Message frame after the SRAM module completes packet assembly. It polls the SRAM state occupy_st of each channel and, when a channel is detected to be in the packet completion state, outputs the Message data in the SRAM module. The output signals of the complete Message frame include:
msg_rid, msg_rlen, msg_rdata, msg_rdv, msg_rsof and msg_reof, which are respectively the Message reception source ID, the Message reception length, the Message reception data, the Message reception data valid flag, the Message reception data start flag and the Message reception data end flag.
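As a rough illustration of the output stage, the Python sketch below polls a list of channel states for the packet completion encoding 2'b10 and then streams the buffered data as a sequence of beats carrying the output signals listed above. The function names, the 8-byte beat width, and the interpretation of msg_rlen as a length in bytes are assumptions of this sketch and are not given in the description.

```python
from typing import Iterator, List, Optional

PACKET_COMPLETE = 0b10  # occupy_st = 2'b10


def poll_complete(occupy_states: List[int]) -> Optional[int]:
    """Return the index of the first channel in the packet completion state."""
    for index, state in enumerate(occupy_states):
        if state == PACKET_COMPLETE:
            return index
    return None


def output_beats(msg_rid: int, sram: bytes, msg_rlen: int,
                 beat_bytes: int = 8) -> Iterator[dict]:
    """Yield one dict per output beat for the first msg_rlen bytes of the buffer."""
    for offset in range(0, msg_rlen, beat_bytes):
        end = min(offset + beat_bytes, msg_rlen)
        yield {
            "msg_rid": msg_rid,
            "msg_rlen": msg_rlen,
            "msg_rdata": bytes(sram[offset:end]),
            "msg_rdv": True,                 # data valid on every beat
            "msg_rsof": offset == 0,         # start flag on the first beat
            "msg_reof": end == msg_rlen,     # end flag on the last beat
        }
```

For a 20-byte Message this produces three beats of 8, 8 and 4 bytes, with the start flag set only on the first beat and the end flag only on the last.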
It should be noted that, when data is divided into multiple fragments for transmission, the fragment offset is the starting position, or relative position, of each fragment within the whole data stream; it is the field that identifies the position of a specific fragment within the whole data block and allows the fragments to be correctly recombined into the original data. Through the RapidIO driving module and the multi-channel ordering module, communication faults caused by out-of-order retransmission of Message transaction packets due to competition among source nodes in a large RapidIO switching system are reduced, which broadens the application scenarios of the RapidIO bus in embedded systems.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
Those of ordinary skill in the art will appreciate that the various illustrative modules and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and application constraints imposed on the technology. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Finally, the foregoing description of the preferred embodiment of the invention is provided for the purpose of illustration only, and is not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.