Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a zero-copy data transmission method, device, medium, and product. It addresses the technical problem that existing data transmission techniques struggle to meet current requirements for transmission performance, transmission security, and transmission determinism.
To achieve the above purpose, the invention adopts the following technical solution:
In a first aspect, the present invention provides a zero-copy data transmission method, including:
in response to an application program needing to send data, the application program calls a function in a user space library to request, from a kernel module, a memory buffer for the data to be sent;
the kernel module allocates a physical memory buffer according to the request and returns a descriptor of the physical memory buffer;
the application program calls a function in the user space library, submits the descriptor to the kernel module, and requests an address pointer to the memory buffer for writing data;
the kernel module verifies the descriptor and, after verification passes, maps the physical memory buffer into the virtual address space of the application program and returns an address pointer within that virtual address space;
the application program writes the data to be sent through the address pointer and, after writing is complete, calls a function in the user space library, submits the descriptor to the kernel module, and requests transmission of the written data;
the kernel module verifies the descriptor and, after verification passes, configures and starts the DMA controller;
the DMA controller bypasses the CPU, reads the written data directly from the physical memory buffer, and pushes it to the network card for transmission, so that the transfer involves no CPU participation and no memory copy.
Optionally, when the user space library receives a function call, it converts the call into a low-level request to the kernel module and sends that request to the kernel module through the ioctl interface.
Optionally, the kernel module allocating a physical memory buffer according to the request and returning a descriptor of the physical memory buffer includes:
the kernel module selects a physical memory buffer of suitable size from a memory pool using the buddy algorithm, finds a free entry in its built-in descriptor table, and generates the corresponding descriptor; the entry holds the necessary information about the physical memory buffer, and the descriptor is returned to the application program through the user space library.
Optionally, the descriptor includes a handle ID, a usage count, and a check code;
the handle ID is an entry index of the descriptor;
when the application program releases the descriptor, the usage count recorded in the corresponding entry of the descriptor table is incremented by one; when the application program later uses the descriptor to access the kernel module, the usage count carried in the descriptor is compared with the usage count in the corresponding entry of the descriptor table, and if they do not match, the kernel module denies access;
the check code is used for preventing the descriptor from being tampered with maliciously.
Optionally, after the network card finishes sending, the DMA controller raises a DMA interrupt to notify the kernel module that transmission is complete; the kernel module marks the entry for the descriptor in the descriptor table as free and generates a send-success status notification, which is returned to the application program through the user space library.
In a second aspect, the present invention provides an electronic device, including a processor and a storage medium;
the storage medium is used for storing instructions;
the processor operates according to the instructions to perform the steps of the method described above.
In a third aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
In a fourth aspect, the invention provides a computer program product comprising computer programs/instructions which when executed by a processor implement the steps of the method described above.
Compared with the prior art, the invention has the beneficial effects that:
The zero-copy data transmission method, device, medium, and product provided by the invention use a kernel module and a descriptor mechanism to establish a security barrier between the application program and the zero-copy buffers in kernel space. The method achieves efficient zero-copy transfer of data from memory to the network card via DMA while avoiding the isolation problems caused by direct memory mapping, so that the high performance is also safe to use.
By eliminating memory copies between kernel space and user space and letting the DMA controller carry out the core data movement, the CPU is freed from this repetitive work and CPU occupancy is significantly reduced. The data path is simplified, end-to-end latency is shortened, and the data throughput of the system is improved.
A statically pre-allocated memory pool is used for buffer management, so allocation and reclamation complete within a predictable time and nondeterministic delays are avoided.
In summary, the invention improves security and determinism while maintaining transmission performance.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present invention, and are not intended to limit the scope of the present invention.
Embodiment one:
As shown in Fig. 1, the present invention provides a zero-copy data transmission method, which includes the following steps:
Step S1: in response to an application program needing to send data, the application program calls a function in a user space library to request, from a kernel module, a memory buffer for the data to be sent.
As shown in Fig. 2, when the application (App) needs to send data, it calls the zc_alloc_buffer() function provided by the user space library (Lib) to obtain a memory buffer of a specified size.
Step S2: the kernel module allocates a physical memory buffer according to the request and returns a descriptor of the physical memory buffer.
After receiving the call, the user space library converts it into a low-level request to the kernel module (Kernel). The purpose of this low-level request is to have the kernel allocate a buffer for zero-copy use.
The kernel module selects a physical memory buffer of suitable size from a memory pool using the buddy algorithm, finds a free entry in its built-in descriptor table, and generates the corresponding descriptor. The entry holds the necessary information about the physical memory buffer, and the descriptor is returned to the application program through the user space library. The descriptor serves as the application's only "key" for operating on the memory buffer it has obtained.
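As a rough illustration of the buddy algorithm named in step S2, the following Python sketch allocates power-of-two blocks from a fixed pool and merges them again on release. It is a toy model rather than the kernel module's implementation: the class name BuddyPool, the pool size, and the minimum block size are assumptions made for the example, and plain integer offsets stand in for physical addresses.

```python
class BuddyPool:
    """Toy buddy allocator over a fixed pool; offsets stand in for physical addresses."""

    def __init__(self, pool_size, min_block):
        # Assumption: both sizes are powers of two.
        assert pool_size & (pool_size - 1) == 0
        assert min_block & (min_block - 1) == 0
        self.pool_size = pool_size
        self.min_block = min_block
        # free_lists[size] holds the offsets of the free blocks of that size.
        self.free_lists = {}
        size = min_block
        while size <= pool_size:
            self.free_lists[size] = set()
            size *= 2
        self.free_lists[pool_size].add(0)   # the whole pool starts as one free block
        self.allocated = {}                 # offset -> block size

    def _round_up(self, n):
        size = self.min_block
        while size < n:
            size *= 2
        return size

    def alloc(self, n):
        """Return the offset of a block of at least n bytes, or None if none is available."""
        want = self._round_up(n)
        size = want
        # Find the smallest free block that is large enough.
        while size <= self.pool_size and not self.free_lists[size]:
            size *= 2
        if size > self.pool_size:
            return None
        offset = self.free_lists[size].pop()
        # Split repeatedly, putting the upper half (the buddy) on the free list.
        while size > want:
            size //= 2
            self.free_lists[size].add(offset + size)
        self.allocated[offset] = want
        return offset

    def free(self, offset):
        size = self.allocated.pop(offset)
        # Merge with the buddy for as long as the buddy is also free.
        while size < self.pool_size:
            buddy = offset ^ size
            if buddy not in self.free_lists[size]:
                break
            self.free_lists[size].remove(buddy)
            offset = min(offset, buddy)
            size *= 2
        self.free_lists[size].add(offset)


pool = BuddyPool(pool_size=1024, min_block=64)
a = pool.alloc(100)   # rounded up to a 128-byte block
b = pool.alloc(100)
pool.free(a)
pool.free(b)          # the two buddies merge back into one 1024-byte block
```

Both the split loop and the merge loop run at most once per block-size level, so the work per call is bounded by the number of levels in the pool, which is the property that makes allocation time predictable.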
In this embodiment, the descriptor includes a handle ID, a usage count, and a check code.
The handle ID is the entry index of the descriptor. It is the core of the descriptor and serves as a direct index into the large descriptor table in the kernel. Each entry of the descriptor table contains all the necessary information, such as a pointer to the actual physical memory buffer, the buffer size, the status (free or in use), and the owner process ID. Direct indexing avoids any searching or hash computation: once the kernel has the ID, a single array access (table[ID]) locates all the information, giving O(1) lookup.
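The direct-index lookup described above can be sketched in a few lines of Python. The field names, the table size, and the helper names claim_entry and lookup are hypothetical choices for this example; only the idea of indexing the table with the handle ID comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    phys_addr: int = 0      # stand-in for the pointer to the physical buffer
    size: int = 0
    in_use: bool = False    # status: free / in use
    owner_pid: int = 0
    use_count: int = 0


TABLE_SIZE = 8  # hypothetical; a real table would be far larger
table = [Entry() for _ in range(TABLE_SIZE)]


def claim_entry(phys_addr, size, owner_pid):
    """Fill the first free entry and return its index (the handle ID), or None if full."""
    for handle_id, entry in enumerate(table):
        if not entry.in_use:
            entry.phys_addr, entry.size = phys_addr, size
            entry.owner_pid, entry.in_use = owner_pid, True
            return handle_id
    return None


def lookup(handle_id):
    """One bounds check and one array access; no searching or hashing."""
    if 0 <= handle_id < TABLE_SIZE and table[handle_id].in_use:
        return table[handle_id]
    return None


h = claim_entry(phys_addr=0x1000, size=4096, owner_pid=42)
```

Note that only the lookup is constant-time here; the scan for a free entry in claim_entry is linear in this sketch, and a free list of indices would be one way to make that step constant-time as well.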
The usage count is the number of times the descriptor's entry has been released by an application program. It is key to the security of the scheme and guards against "dangling pointer" and "use-after-free" vulnerabilities. Mechanism: in the kernel's descriptor table, every entry holds a "kernel usage count value" in addition to the buffer pointer. When a descriptor is created and returned to the user, it carries the current usage count value. When the application releases the descriptor (which it does once the data has been sent and it no longer needs the memory buffer; the address pointer is released at the same time), the "kernel usage count value" of the corresponding entry is incremented by 1. Verification: when an application accesses the kernel again with the descriptor (which may be expired or released), the kernel compares the usage count carried in the descriptor with the current usage count in the kernel table. If they do not match, access is denied immediately. This ensures that expired descriptors fail at once.
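The following Python sketch models the usage count mechanism on its own: a released entry gets a new count, so a descriptor issued before the release no longer matches. The class and method names are assumptions for the example, and wrapping the count at 16 bits is likewise an assumption (it follows the optional 16-bit field width mentioned for the descriptor).

```python
class DescriptorTable:
    def __init__(self, n):
        self.in_use = [False] * n
        self.use_count = [0] * n   # the "kernel usage count value" of each entry

    def alloc(self):
        """Return (handle_id, use_count) for the first free entry."""
        handle_id = self.in_use.index(False)  # raises ValueError if the table is full
        self.in_use[handle_id] = True
        return handle_id, self.use_count[handle_id]

    def check(self, handle_id, count):
        """True only if the entry is live and the carried count matches the kernel's."""
        return (0 <= handle_id < len(self.in_use)
                and self.in_use[handle_id]
                and self.use_count[handle_id] == count)

    def release(self, handle_id, count):
        if not self.check(handle_id, count):
            return False
        self.in_use[handle_id] = False
        # Incrementing on release invalidates every copy of the old descriptor.
        self.use_count[handle_id] = (self.use_count[handle_id] + 1) & 0xFFFF
        return True


t = DescriptorTable(4)
old = t.alloc()      # entry 0 with count 0
t.release(*old)
new = t.alloc()      # entry 0 is reused, now with count 1
```

After this sequence, t.check(*old) is False even though entry 0 is in use again, which is the use-after-free case the count is meant to catch.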
The check code prevents the descriptor from being maliciously tampered with. It may be computed from the handle ID and the usage count with a simple exclusive-or (XOR) or another fast check algorithm. The kernel recomputes and compares the check code on every verification.
Depending on actual needs, the descriptor may be laid out as a handle ID (32 bits), a usage count (16 bits), and a check code (16 bits).
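One possible way to realize the 32/16/16 layout and an XOR check code is sketched below in Python. The bit positions, the XOR-folding rule, and the function names are assumptions made for illustration; the text only states the field widths and that a simple XOR may be used.

```python
def check_code(handle_id, use_count):
    """XOR-fold the 32-bit handle ID and the 16-bit usage count into 16 bits."""
    return ((handle_id >> 16) ^ (handle_id & 0xFFFF) ^ use_count) & 0xFFFF


def pack(handle_id, use_count):
    # Assumed layout: bits 63..32 handle ID, bits 31..16 usage count, bits 15..0 check code.
    # Assumes handle_id < 2**32 and use_count < 2**16.
    return (handle_id << 32) | (use_count << 16) | check_code(handle_id, use_count)


def unpack(desc):
    """Return (handle_id, use_count), or None if the check code does not match."""
    handle_id = (desc >> 32) & 0xFFFFFFFF
    use_count = (desc >> 16) & 0xFFFF
    if (desc & 0xFFFF) != check_code(handle_id, use_count):
        return None
    return handle_id, use_count


desc = pack(handle_id=5, use_count=3)
tampered = desc ^ (1 << 32)   # flip the lowest bit of the handle ID
```

Here unpack(desc) recovers (5, 3) while unpack(tampered) returns None. A plain XOR like this one can be recomputed by anyone who knows the rule, so it mainly catches corruption and naive edits; mixing in a value known only to the kernel would be one way to strengthen it.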
Step S3: the application program calls a function in the user space library, submits the descriptor to the kernel module, and requests an address pointer to the memory buffer for writing data.
Although the application holds the descriptor, it cannot access the memory directly. It must call the library function zc_get_pointer(desc_a) to obtain a usable memory address pointer. The user space library again issues a low-level request to the kernel and submits the descriptor.
Step S4: the kernel module verifies the descriptor and, after verification passes, maps the physical memory buffer into the virtual address space of the application program and returns an address pointer within that virtual address space.
After the kernel module receives the request, it performs strict security checks: whether the handle ID of the descriptor exists, whether the usage count matches the kernel's record, and whether the check code is correct. After verification passes, the kernel performs just-in-time memory mapping: it temporarily maps the physical memory buffer into the virtual address space of the application program on demand, then returns the address pointer to the user space library, which passes it on to the application program.
Step S5: the application program writes the data to be sent through the address pointer and, after writing is complete, calls a function in the user space library, submits the descriptor to the kernel module, and requests transmission of the written data.
After the application program obtains the address pointer, it can write the data to be sent directly into the buffer, just as it would with ordinary memory. This takes place entirely in user space and is fast and efficient. After the data has been written, the application program calls the zc_send(desc_a) function to notify the kernel that transmission can begin.
Step S6: the kernel module verifies the descriptor and, after verification passes, configures and starts the DMA controller.
The kernel module performs the same strict security verification one last time. After verification passes, the kernel module configures and starts the DMA controller, instructing the DMA engine to transfer the data from the source address (the physical address of the buffer) to the destination address (the send queue of the network card).
Step S7: the DMA controller bypasses the CPU, reads the written data directly from the physical memory buffer, and pushes it to the network card for transmission, so that the transfer involves no CPU participation and no memory copy.
After the network card finishes sending, the DMA controller raises a DMA interrupt to notify the kernel module that transmission is complete. The kernel module marks the entry for the descriptor in the descriptor table as free and generates a send-success status notification, which is returned to the application program through the user space library. This completes the zero-copy send flow.
The kernel module in this embodiment is the core logic running in kernel space. Its responsibilities are:
1. Memory pool management: implements a statically pre-allocated memory pool based on the buddy algorithm.
2. Descriptor table maintenance: manages a large descriptor array and handles descriptor allocation, release, and usage count updates.
3. Security enforcement: implements the "fast kernel proxy interface" described below and validates every request from user mode (check code, usage count, process ID, etc.).
4. Hardware interaction: after verification passes, configures the DMA controller, interacts with the network card, and initiates the actual data transfer.
The user space library in this embodiment is intended to make the scheme easy for upper-layer application developers to use: all of the complexity is encapsulated in one library that provides several APIs, for example:
handle = zero_copy_init(): initializes the library and establishes a connection with the kernel module.
desc = zc_alloc_buffer(size): requests a buffer for sending data and returns a descriptor.
void *ptr = zc_get_pointer(desc): (key step) obtains a usable memory pointer from the descriptor.
int ret = zc_commit_write(desc, length): notifies the kernel that the data has been written.
int ret = zc_send(desc): sends the data.
desc = zc_receive(timeout): receives data and returns a descriptor containing the new data.
void zc_free_buffer(desc): releases a descriptor and its corresponding buffer.
In this embodiment, when the user space library receives a function call, it converts the call into a low-level request to the kernel module and sends that request through the ioctl interface. Going through a general-purpose system call (syscall) path for every operation carries significant overhead. For performance, a dedicated, lightweight ioctl interface is designed, or, on systems that support it (e.g., Linux), a more modern asynchronous interface such as io_uring is used. The interface provides:
1. Batch processing: the interface should support submitting multiple commands at once (for example, requesting 3 buffers and sending 5 buffers), reducing the number of switches between user mode and kernel mode.
2. The interface is designed so that the kernel only needs to touch a minimal set of data when handling a request, avoiding unnecessary memory accesses and computation and simplifying the overall processing flow.
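The batching idea can be illustrated with a small Python sketch that packs several commands into one contiguous request buffer, which a single call into the kernel could then carry. The opcodes, the record layout (a 32-bit opcode followed by a 64-bit argument), and the function names are all assumptions; the sketch does not issue a real ioctl because no such device exists outside the described system.

```python
import struct

CMD_ALLOC, CMD_SEND = 1, 2       # hypothetical opcodes
CMD = struct.Struct("<IQ")       # opcode (u32) + argument (u64): a size or a descriptor


def encode_batch(commands):
    """Pack (opcode, argument) pairs into one request buffer with a count header."""
    header = struct.pack("<I", len(commands))
    return header + b"".join(CMD.pack(op, arg) for op, arg in commands)


def decode_batch(buf):
    """What the receiving side would do: walk the buffer once, in order."""
    (count,) = struct.unpack_from("<I", buf, 0)
    return [CMD.unpack_from(buf, 4 + i * CMD.size) for i in range(count)]


# The example from the text: request 3 buffers and send 5 buffers in one submission.
batch = [(CMD_ALLOC, 4096)] * 3 + [(CMD_SEND, d) for d in range(5)]
request = encode_batch(batch)
```

With this layout the eight commands travel in one 100-byte buffer, so they would cost one user/kernel transition instead of eight.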
The technical solution of this patent has the following advantages:
1. Combined performance and efficiency
Compared with a conventional TCP/IP data path, the invention offers significant performance advantages.
Significantly reduced CPU load: by eliminating data copies between kernel space and user space and using DMA for autonomous hardware transfer, the CPU is freed from heavy data-movement work. In avionics, this means that valuable CPU resources can be devoted to more critical tasks such as flight control, navigation computation, or sensor data fusion.
Greatly reduced data latency: the path that data takes from the application's buffer to the network card is shortened and simplified. Data no longer has to travel through multiple memory regions, so end-to-end transmission latency drops significantly, which matters for real-time applications that require fast response (such as alarm message delivery and real-time video monitoring).
Greatly improved data throughput: with lower latency and the CPU bottleneck removed, the whole data path flows more smoothly, so the amount of data that can be transferred per unit time (the throughput) increases substantially. This is decisive for application scenarios that process high-bandwidth sensor data streams such as radar and electro-optical data.
2. Verifiable strong security and isolation
This is a central advantage of this patent. It addresses a critical weakness of standard zero-copy techniques (such as direct mmap), allowing zero-copy to be applied safely in the avionics field.
The invention does not expose kernel memory directly to the application program; instead, a kernel module acts as a proxy and serves as a mandatory access control point. Any access by an application to a shared buffer must go through a descriptor and pass the kernel module's strict checks. This eliminates at the root the risk of an application accessing memory out of bounds or tampering with the data of other applications or the kernel, and provides strong memory isolation.
Built-in vulnerability protection: this is a security feature built into the architecture. It accurately identifies and rejects attempts by an application to use released, stale descriptors, effectively guarding against the high-risk "use-after-free" class of memory safety vulnerabilities.
Explicit support for software airworthiness certification: because the security and isolation logic is implemented mainly in software (the kernel proxy module), its rules and behavior are explicit, traceable, testable, and verifiable. Compared with relying on complex hardware configurations (e.g., an IOMMU) that may exhibit "black box" behavior, such a software-defined security scheme is easier to subject to formal verification and code review, and therefore more readily meets the requirements of airborne software airworthiness standards such as DO-178C.
3. High real-time performance and determinism
For avionics systems, and flight-critical systems in particular, being predictable matters more than being fast. The invention ensures system determinism by design.
Predictable memory management: a statically pre-allocated memory pool based on the buddy algorithm ensures that memory buffers are allocated and released in bounded time. This avoids the unpredictable delays that standard dynamic memory allocation (e.g., malloc) may incur and preserves the responsiveness of real-time tasks.
Controllable data transmission: by setting priority or quality of service (QoS) policies for the DMA transfers of different data streams, the highest-priority data streams, such as navigation and flight control, always obtain bus resources first, so that the determinism of their transmission is not compromised by non-critical tasks (such as logging).
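The text does not say how the priority policy is implemented. One simple possibility, sketched here in Python under that caveat, is a priority queue of pending transfers from which the kernel module picks the next one to hand to the DMA controller. The class name, the numeric priorities, and the rule that a lower number means more urgent are assumptions.

```python
import heapq
import itertools


class DmaQueue:
    """Pending transfers ordered by priority (lower number = more urgent), FIFO within a level."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker that preserves submission order

    def submit(self, priority, stream):
        heapq.heappush(self._heap, (priority, next(self._seq), stream))

    def next_transfer(self):
        """Return the stream to start next, or None if nothing is pending."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


q = DmaQueue()
q.submit(5, "log record")
q.submit(0, "flight control")
q.submit(1, "navigation")
q.submit(0, "flight control 2")
order = [q.next_transfer() for _ in range(4)]
```

In this run the two flight control transfers are started first, in submission order, followed by navigation, with the log record last even though it was submitted first.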
Embodiment two:
Based on the zero-copy data transmission method provided in the first embodiment, an embodiment of the present invention provides an electronic device, which includes a processor and a storage medium;
the storage medium is used for storing instructions;
the processor operates according to the instructions to perform the steps of the method described above.
Embodiment three:
Based on the zero-copy data transmission method provided in the first embodiment, an embodiment of the present invention provides a computer-readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the method described above.
Embodiment four:
Based on the zero-copy data transmission method provided in the first embodiment, an embodiment of the present invention provides a computer program product, including a computer program/instruction, which when executed by a processor, implements the steps of the method described above.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.